Convex Hull
 
01_intro
20161203

Intro

You have a bunch of points, and you want to know the line that connects the hull around those points. Eg. you want to calculate the minimum length of the wire fence to fence off a field with a number of objects (static sheep? garden gnomes? stuffed cows?) on it.

Solution: this is a convex hull problem (decide which points make up the hull), for which Sedgewick presented the Graham Scan in the MOOC 'Algorithms I' (and the corresponding text books).

For more detail watch Sedgewick's lecture, the Graham scan algorithm explanation starts at around 4m30s: www.youtube.com/watch?v=aSjaCvJ8lzM

Additional information, algs4.cs.princeton.edu/99hull

Briefly explained the algorithm works as follows:

  • choose a starting point: the point with the lowest y coordinate
  • for all the other points calculate the angle they make with the starting point (to be exact: the angle between the lines through these two points and the horizontal line)
  • order the points by angle (ascending)
  • iterate over the points:
    • if the motion from point 1 to 2 to 3 is a clockwise motion, discard point 2, and take a step back
    • otherwise move to the next set of 3 points
02_clockwise
20161205

The clockwise function

As you gathered from the outline of the algorithm in the introduction, we need to write a function to decide whether the motion from point 1 to point 2 to a point 3 is clockwise or counterclockwise or collinear. Let's focus on that first.

Triange area

The area of a triangle can be calculated from following determinant:

It happens to be so that when the area is negative, the motion from point 1 -> 2 -> 3 is a counterclockwise motion. When the calculated area is positive, the motion from point 1 -> 2 -> 3 is a clockwise motion. And when it's zero? Yes, you've guessed it: the points fall on the same line, ie. they are collinear.

.

Example Program

The following program is for inspecting whether the clockwise function works as expected: it draws a number of cases, with p1 always at (50,0), p2 at ( 20..80, 50), and p3 random. Depending on clockwise or counterclockwise, the line is drawn in either red or green, to make it possible to visually inspect the correctness.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
#!/usr/bin/python 

import matplotlib.pyplot as plt
import random

def x(tp):
    return tp[0]

def y(tp):
    return tp[1]

# return 'g', 'b','r' respectively if counterclockwise, collinear, or clockwise
def clockwise(pt1, pt2, pt3):
    half_area = (x(pt2)-x(pt1))*(y(pt3)-y(pt1))-(y(pt2)-y(pt1))*(x(pt3)-x(pt1))
    rv='b'              # collinear
    if half_area<0: rv='r'   # clockwise
    elif half_area>0: rv='g' # counter clockwise
    return rv

p1=(50.0,0.0) 
for i in range(7): 
    p2= ( 50+(i-3)*10, 50)
    sgn=random.sample([-1,1],1)[0]
    p3= ( 20+random.randint(1,60), 50+ sgn*(10 + random.randint(1,30)) ) 
    col=clockwise(p1,p2,p3) 
    plt.plot(*zip(*[p1,p2,p3]),  color=col,  linewidth=2.0, marker='o') 

plt.show()

Note: the x() and y() functions are convenience functions to get the x and y element from a point (a tuple with 2 elements), because this formula is easier to read ..

half_area = (x(pt2)-x(pt1))*(y(pt3)-y(pt1))-(y(pt2)-y(pt1))*(x(pt3)-x(pt1))

.. than this one ..

half_area = (pt2[0]-pt1[0])*(pt3[1]-pt1[1])-(pt2[1]-pt1[1])*(pt3[0])-pt1[0])

Example Plot

The following is a plot generated by the above program. When inspecting clockwise or not, start with P1 at the bottom, and choose a line going up to P2, and continue to P3. Is it a clockwise motion? The line should be in red then.

03_core
20161206

My Implementation

Utility functions

  • calc_angle(pt1,pt2): calculate the angle between a line defined by two points and the horizontal
  • dist(pt1,pt2): calculate the Euclidean distance between two points
  • clockwise(): more detail in prior section
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
# convenience functions to get the x,y value from a tuple, 
# and avoid using the wrong tuple index
# (sacrifice a bit of performance for legibility) 
def x(tp):
    return tp[0]

def y(tp):
    return tp[1]

# calculate the angle for a line between two points and the horizontal
def calc_angle(pt1, pt2):
    dy=float( y(pt2) - y(pt1) )
    dx=float( x(pt2) - x(pt1) )
    angle=0.0
    if dx!=0:
        angle=math.atan(dy/dx)
    else:
        if dy==0:
            angle=0.0
        elif dy<0:
            angle=math.atan(float('-inf'))
        else:
            angle=math.atan(float('inf'))
    # if the calculated angle is negative, make it positive
    while (angle<0):
        angle=angle+2*math.pi;
    return angle

# calculate the distance between two points 
def dist(pt1,pt2):
    dx=float( x(pt2) - x(pt1) )
    dy=float( y(pt2) - y(pt1) )
    return math.sqrt( dx*dx + dy*dy)

# does moving from point pts[i0] -> pts[i1] -> i2 describe a counterclockwise motion ? 
def clockwise(pt1, pt2, pt3):
    area = (x(pt2)-x(pt1))*(y(pt3)-y(pt1))-(y(pt2)-y(pt1))*(x(pt3)-x(pt1))
    rv=0               # collinear
    if area<0: rv=-1   # clockwise
    elif area>0: rv=+1 # counter clockwise
    return rv

Main body

Data: the location of the sheep.

1
points= [ (9,2), (5,6), (3,10), (11,13), (7,6), (15,18), (19,16), (13,4), (23,8), (17,12)  ]

Step 1: identify the starting point

1
2
3
4
# identify the point with the lowest y coordinate and put it first in the list of points
# special case: in case of a tie, decide by lowest x coordinate
points=sorted( points, key=lambda a: a[0]) # secondary sort on x  
points=sorted( points, key=lambda a: a[1]) # primary sort on y (python sort is stable) 

Step 2: put the points in the correct order

1
2
3
4
5
6
7
8
9
# enrich the tuple list with the angle between the line through (point, first point) 
# and the horizontal 
points = [ (x(pt),y(pt),calc_angle(points[0],pt)) for pt in points ]

# order the points by angle
points=sorted(points, key=lambda a: a[2])

# after sorting we don't need the angles anymore, let's drop them
points = [ (x(pt),y(pt)) for pt in points ]

Step 3: the core of the Graham scan

  • iterate over the points in ascending angle order
  • check if pt1-> pt2 -> pt3 makes a counterclockwise turn
  • if NOT, then discard pt2
1
2
3
4
5
6
7
8
9
10
hull=copy.deepcopy(points)  # deep copy, because we are going to delete points 
i=0
while True:
    if clockwise(hull[i],hull[i+1],hull[i+2])<=0: # clockwise or collinear
        del hull[i+1]
        i-=1    # take a step back 
    else:
        i+=1    # step forward
        if (i+2)>=len(hull): # are we done?
            break

Epilogue: print the list of points on the hull, calculate the perimeter and plot the points

1
2
3
4
5
6
7
8
9
10
11
12
print "Points on the hull" ,hull

# to close the polygon, (for drawing, and calculating distance) append the first point 
hull.append(hull[0])

print "Perimeter distance: ", sum( [ dist( hull[i],hull[i+1] ) for i in range(len(hull)-1) ] )


# plotting the points and the hull
plt.scatter(*zip(*points),s=20)      # plot all the points
plt.plot(*zip(*hull),linestyle='-')  # plot the hull
plt.savefig("convex_hull.png")
04_run
20161206

Run 1

Input:

points= [ (9,2),(5,6),(3,10),(11,13),(7,6),(15,18),(19,16),(13,4),(23,8),(17,12)  ]

Output:

Points on the hull [(9, 2), (23, 8), (19, 16), (15, 18), (3, 10), (5, 6)]
Perimeter distance:  53.1991493831

Solution: the length of the wire we need to fence off the sheep is: 53.20.

Run 2

Check if the convex hull implementation is correct by taking as input a list of 1000 random points:

Conclusion: yes it's correct, all points fall within or on the convex hull.

Run 3

And finally, let's take as input a list of 1000 random points from a normal distribution:

Alternative implementation

Here is a different implementation of the core Graham scan, the same as implemented by Sedgewick in algs4.cs.princeton.edu/99hull/GrahamScan.java.html.

Because of the stack operations I find this implementation more difficult to understand, but its big advantage is that it doesn't need to copy the whole points list.

1
2
3
4
5
6
7
8
9
hulli=[]
hulli.append(0)
hulli.append(1)
for i in range(2,len(points)):
    top=hulli.pop()
    while clockwise( points[hulli[-1]], points[top], points[i])<=0:
        top=hulli.pop()
    hulli.append(top)
    hulli.append(i)
05_source
20161206
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
#!/usr/bin/python
import math
import matplotlib.pyplot as plt
import copy 
import random      # only needed for random points
import numpy as np # only needed for random points from normal distrib

# convenience functions to get the x,y value from a tuple, 
# and avoid using the wrong tuple index
# (sacrifice a bit of performance for legibility) 
def x(tp): 
    return tp[0]

def y(tp): 
    return tp[1]

# calculate the angle for a line between two points and the horizontal
def calc_angle(pt1, pt2): 
    dy=float( y(pt2) - y(pt1) ) 
    dx=float( x(pt2) - x(pt1) ) 
    angle=0.0
    if dx!=0:
        angle=math.atan(dy/dx)   
    else:
        if dy==0: 
            angle=0.0
        elif dy<0: 
            angle=math.atan(float('-inf'))
        else: 
            angle=math.atan(float('inf'))
    # if the calucated angle is negative, make it positive
    while (angle<0): 
        angle=angle+2*math.pi;
    return angle

# calculate the distance between two points 
def dist(pt1,pt2): 
    dx=float( x(pt2) - x(pt1) )
    dy=float( y(pt2) - y(pt1) )
    return math.sqrt( dx*dx + dy*dy)  

# does moving from point pts0 -> pt1 -> pt2 describe a counterclockwise motion ? 
def clockwise(pt1, pt2, pt3): 
    area = (x(pt2)-x(pt1))*(y(pt3)-y(pt1))-(y(pt2)-y(pt1))*(x(pt3)-x(pt1))
    rv=0               # collinear
    if area<0: rv=-1   # clockwise
    elif area>0: rv=+1 # counter clockwise
    return rv 


## MAIN ----------------------------------------------------------------------

# --- DATA ------------
points= [ (9,2), (5,6), (3,10), (11,13), (7,6), (15,18), (19,16), (13,4), (23,8), (17,12)  ]

# variation: random points
#points=[]
#for i in range(1,1000): points.append( (random.randint(1,100), random.randint(1,100)) )

# variation: sedgewick's points (approx) 
#points=[]
# points= [(373, 234), (312, 235), (313, 128), (165, 164), (185, 91), (190, 34), 
#          (452, 249), (404, 109), (237, 297), (262, 130), (124, 127), (334, 29), (459, 2)]

# variation : random points, but normally distributed
#points=list(zip(
#    [ int(r) for r in np.random.normal(100, 20, 1000)],
#    [ int(r) for r in np.random.normal(100, 20, 1000)]
#    ) )
# ---------------------

# identify the point with the lowest y coordinate and put it first in the list of points
# special case: in case of tie, decide by lowest x coordinate
points=sorted( points, key=lambda a: a[0]) # secondary sort on x  
points=sorted( points, key=lambda a: a[1]) # primary sort on y (python sort is stable) 

# enrich the tuple list with the angle between the line through (point, first point) 
# and the horizontal 
points = [ (x(pt),y(pt),calc_angle(points[0],pt)) for pt in points ] 

# order the points by angle
points=sorted(points, key=lambda a: a[2])

# after sorting we don't need the angles anymore, let's drop them
points = [ (x(pt),y(pt)) for pt in points ] 

# the core of the graham scan: 
#   iterate over the points in ascending angle order
#   check if pt1-> pt2 -> pt3 makes a counterclockwise turn
#   if NOT, then discard pt2
hull=copy.deepcopy(points)  # deep copy, because we are going to delete points 
i=0
while True: 
    if clockwise(hull[i],hull[i+1],hull[i+2])<=0: # clockwise or collinear
        del hull[i+1]
        i-=1    # take a step back 
    else: 
        i+=1    # step forward
        if (i+2)>=len(hull): # are we done?
            break

print "Points on the hull" ,hull

# to close the polygon, (for drawing, and calculating distance) append the first point 
hull.append(hull[0])

print "Perimeter distance: ", sum( [ dist( hull[i],hull[i+1] ) for i in range(len(hull)-1) ] )

# plotting the points and the hull
plt.scatter(*zip(*points),s=20)      # plot all the points
plt.plot(*zip(*hull),linestyle='-')  # plot the hull
plt.savefig("convex_hull.png")
 
Notes by Data Munging Ninja. Generated on nini:sync/20151223_datamungingninja/convexhull at 2016-12-26 08:22