Find The Max Points On A Line


In this article, we find the maximum number of points lying on the same straight line.
Problem Statement 
Given n points on a 2D plane, find the maximum number of points that lie on the same straight line.
Example 1
Input: [ [1,1], [2,2], [3,3] ]
Output: 3
Example 2
Input: [ [1,1], [3,2], [5,3], [4,1], [2,3], [1,4] ]
Output: 4


Let's simplify the problem and search the maximum number of points on a line passing through the point i . One can immediately notice that it's interesting to consider only the next points i + 1 .. N - 1 because the maximum number of points containing, for example, the point i - 2 was already found during the search of the maximum number of points on a line passing through the point i - 2. The idea is very simple: draw the lines passing through the point i and each of the next points. Save these lines is a hash table with a counter 2 = two points on this line.
Let's imagine now that points i < i + k < i + l are on the same line. Drawing a line through i and i + l, one would discover that this line is already tracked and hence one has to update a counter of points on this line count++. If the line is horizontal, i.e. y = c , one could use this constant c as a line key in a hash table of horizontal lines. The other lines could be represented as y = slope * x + c. The equation for the line passing through two points 1 and 2 are given here. Hence, a slope value is sufficient to represent a unique line starting from a specific point i.
One might go ahead and use a float (or double) value to represent each unique slope. Indeed, this could work for most of the cases, but not all. One of the cases that a float value would not cut for the slope variable is that when two points form a vertical line. As we can see from the formula to calculate the slope value, we would encounter a divide-by-zero error. One might argument we could treat this as a special case, and assign a special value (say, zero) to represent the horizontal slope. However, a bigger problem is that the float and double values are intrinsically inaccurate, due to how these values are represented in the computer(click here to know the problems in depth).
A simple fact to comprehend this limitation is that we could have infinite number of digits for a fraction number, we could only keep a limited number of digits as its float value in the computer. Therefore, it is not wise to use the float/double value to represent a unique slope, since they are not accurate.
As a reminder, two integers are co-primes, if and only if their greatest common divisor is 1. As one can see, due to the property of co-prime numbers, they can be used to represent the slope values of different lines.We now have the idea and even some important details (co-primes) to implement the algorithm,
  1. Initiate the maximum number of points max_count = 1.
  2. Iterate over all points i from 0 to N - 2 .
  3. For each point i find a maximum number of points max_count_i on a line passing through the point i:
  4. Initiate the maximum number of points on a line passing through the point i : count = 1.
  5. Iterate over next points j from i + 1 to N - 1. If j is a duplicate of i , update a number of duplicates for point i. If not: Save the line passing through the points i and j. Update count at each step.
  6. Return max_count_i = count + duplicates.
  7. Update the result max_count = max(max_count, max_count_i).
Below is a code sample illustrating this algorithm in python 3:
  1. class Solution(object):  
  2.     def maxPoints(self, points):  
  3.         def max_points_on_a_line_containing_point_i(i):  
  4.             def slope_coprime(x1, y1, x2, y2):  
  5.                 delta_x, delta_y = x1 - x2, y1 - y2  
  6.                 if delta_x == 0:  
  7.                     return (00)  
  8.                 elif delta_y == 0:  
  9.                     return (sys.maxsize, sys.maxsize)  
  10.                 elif delta_x < 0:  
  11.                     delta_x, delta_y = - delta_x, - delta_y  
  12.                 gcd = math.gcd(delta_x, delta_y)  
  13.                 slope = (delta_x / gcd, delta_y / gcd)  
  14.                 return slope  
  16.             def add_line(i, j, count, duplicates):  
  17.                 x1 = points[i][0]  
  18.                 y1 = points[i][1]  
  19.                 x2 = points[j][0]  
  20.                 y2 = points[j][1]  
  21.                 if x1 == x2 and y1 == y2:  
  22.                       duplicates += 1  
  23.                 elif y1 == y2:  
  24.                     nonlocal horizontal_lines  
  25.                     horizontal_lines += 1  
  26.                     count = max(horizontal_lines, count)  
  27.                 else:  
  28.                     slope = slope_coprime(x1, y1, x2, y2)  
  29.                     lines[slope] = lines.get(slope, 1) + 1  
  30.                     count = max(lines[slope], count)  
  31.                 return count, duplicates  
  32.             lines, horizontal_lines = {}, 1  
  33.             count = 1  
  34.             duplicates = 0  
  35.             for j in range(i + 1, n):  
  36.                 count, duplicates = add_line(i, j, count, duplicates)  
  37.             return count + duplicates  
  38.         n = len(points)  
  39.         if n < 3:  
  40.             return n  
  41.                 max_count = 1  
  42.         for i in range(n - 1):  
  43.             max_count = max(max_points_on_a_line_containing_point_i(i), max_count)  
  44.         return max_count  
Time complexity : O(N x N)
Space complexity : O(N)