Convex Hull

01_intro
20161203

# Intro

You have a bunch of points, and you want to know the line that connects the hull around those points. Eg. you want to calculate the minimum length of the wire fence to fence off a field with a number of objects (static sheep? garden gnomes? stuffed cows?) on it.

Solution: this is a convex hull problem (decide which points make up the hull), for which Sedgewick presented the Graham Scan in the MOOC 'Algorithms I' (and the corresponding text books).

For more detail watch Sedgewick's lecture, the Graham scan algorithm explanation starts at around 4m30s: www.youtube.com/watch?v=aSjaCvJ8lzM

Briefly explained the algorithm works as follows:

• choose a starting point: the point with the lowest y coordinate
• for all the other points calculate the angle they make with the starting point (to be exact: the angle between the lines through these two points and the horizontal line)
• order the points by angle (ascending)
• iterate over the points:
• if the motion from point 1 to 2 to 3 is a clockwise motion, discard point 2, and take a step back
• otherwise move to the next set of 3 points
02_clockwise
20161205

# The clockwise function

As you gathered from the outline of the algorithm in the introduction, we need to write a function to decide whether the motion from point 1 to point 2 to a point 3 is clockwise or counterclockwise or collinear. Let's focus on that first.

## Triange area

The area of a triangle can be calculated from following determinant:

It happens to be so that when the area is negative, the motion from point 1 -> 2 -> 3 is a counterclockwise motion. When the calculated area is positive, the motion from point 1 -> 2 -> 3 is a clockwise motion. And when it's zero? Yes, you've guessed it: the points fall on the same line, ie. they are collinear.

.

## Example Program

The following program is for inspecting whether the clockwise function works as expected: it draws a number of cases, with p1 always at (50,0), p2 at ( 20..80, 50), and p3 random. Depending on clockwise or counterclockwise, the line is drawn in either red or green, to make it possible to visually inspect the correctness.

 ```1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 ``` ``````#!/usr/bin/python import matplotlib.pyplot as plt import random def x(tp): return tp[0] def y(tp): return tp[1] # return 'g', 'b','r' respectively if counterclockwise, collinear, or clockwise def clockwise(pt1, pt2, pt3): half_area = (x(pt2)-x(pt1))*(y(pt3)-y(pt1))-(y(pt2)-y(pt1))*(x(pt3)-x(pt1)) rv='b' # collinear if half_area<0: rv='r' # clockwise elif half_area>0: rv='g' # counter clockwise return rv p1=(50.0,0.0) for i in range(7): p2= ( 50+(i-3)*10, 50) sgn=random.sample([-1,1],1)[0] p3= ( 20+random.randint(1,60), 50+ sgn*(10 + random.randint(1,30)) ) col=clockwise(p1,p2,p3) plt.plot(*zip(*[p1,p2,p3]), color=col, linewidth=2.0, marker='o') plt.show() ``````

Note: the x() and y() functions are convenience functions to get the x and y element from a point (a tuple with 2 elements), because this formula is easier to read ..

``half_area = (x(pt2)-x(pt1))*(y(pt3)-y(pt1))-(y(pt2)-y(pt1))*(x(pt3)-x(pt1))``

.. than this one ..

``half_area = (pt2[0]-pt1[0])*(pt3[1]-pt1[1])-(pt2[1]-pt1[1])*(pt3[0])-pt1[0])``

### Example Plot

The following is a plot generated by the above program. When inspecting clockwise or not, start with P1 at the bottom, and choose a line going up to P2, and continue to P3. Is it a clockwise motion? The line should be in red then.

03_core
20161206

## My Implementation

### Utility functions

• calc_angle(pt1,pt2): calculate the angle between a line defined by two points and the horizontal
• dist(pt1,pt2): calculate the Euclidean distance between two points
• clockwise(): more detail in prior section
 ```1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 ``` ``````# convenience functions to get the x,y value from a tuple, # and avoid using the wrong tuple index # (sacrifice a bit of performance for legibility) def x(tp): return tp[0] def y(tp): return tp[1] # calculate the angle for a line between two points and the horizontal def calc_angle(pt1, pt2): dy=float( y(pt2) - y(pt1) ) dx=float( x(pt2) - x(pt1) ) angle=0.0 if dx!=0: angle=math.atan(dy/dx) else: if dy==0: angle=0.0 elif dy<0: angle=math.atan(float('-inf')) else: angle=math.atan(float('inf')) # if the calculated angle is negative, make it positive while (angle<0): angle=angle+2*math.pi; return angle # calculate the distance between two points def dist(pt1,pt2): dx=float( x(pt2) - x(pt1) ) dy=float( y(pt2) - y(pt1) ) return math.sqrt( dx*dx + dy*dy) # does moving from point pts[i0] -> pts[i1] -> i2 describe a counterclockwise motion ? def clockwise(pt1, pt2, pt3): area = (x(pt2)-x(pt1))*(y(pt3)-y(pt1))-(y(pt2)-y(pt1))*(x(pt3)-x(pt1)) rv=0 # collinear if area<0: rv=-1 # clockwise elif area>0: rv=+1 # counter clockwise return rv``````

Main body

Data: the location of the sheep.

 ```1 ``` ``points= [ (9,2), (5,6), (3,10), (11,13), (7,6), (15,18), (19,16), (13,4), (23,8), (17,12) ]``

Step 1: identify the starting point

 ```1 2 3 4 ``` ``````# identify the point with the lowest y coordinate and put it first in the list of points # special case: in case of a tie, decide by lowest x coordinate points=sorted( points, key=lambda a: a[0]) # secondary sort on x points=sorted( points, key=lambda a: a[1]) # primary sort on y (python sort is stable) ``````

Step 2: put the points in the correct order

 ```1 2 3 4 5 6 7 8 9 ``` ``````# enrich the tuple list with the angle between the line through (point, first point) # and the horizontal points = [ (x(pt),y(pt),calc_angle(points[0],pt)) for pt in points ] # order the points by angle points=sorted(points, key=lambda a: a[2]) # after sorting we don't need the angles anymore, let's drop them points = [ (x(pt),y(pt)) for pt in points ]``````

Step 3: the core of the Graham scan

• iterate over the points in ascending angle order
• check if pt1-> pt2 -> pt3 makes a counterclockwise turn
• if NOT, then discard pt2
 ```1 2 3 4 5 6 7 8 9 10 ``` ``````hull=copy.deepcopy(points) # deep copy, because we are going to delete points i=0 while True: if clockwise(hull[i],hull[i+1],hull[i+2])<=0: # clockwise or collinear del hull[i+1] i-=1 # take a step back else: i+=1 # step forward if (i+2)>=len(hull): # are we done? break``````

Epilogue: print the list of points on the hull, calculate the perimeter and plot the points

 ```1 2 3 4 5 6 7 8 9 10 11 12 ``` ``````print "Points on the hull" ,hull # to close the polygon, (for drawing, and calculating distance) append the first point hull.append(hull[0]) print "Perimeter distance: ", sum( [ dist( hull[i],hull[i+1] ) for i in range(len(hull)-1) ] ) # plotting the points and the hull plt.scatter(*zip(*points),s=20) # plot all the points plt.plot(*zip(*hull),linestyle='-') # plot the hull plt.savefig("convex_hull.png")``````
04_run
20161206

### Run 1

Input:

``points= [ (9,2),(5,6),(3,10),(11,13),(7,6),(15,18),(19,16),(13,4),(23,8),(17,12)  ]``

Output:

``````Points on the hull [(9, 2), (23, 8), (19, 16), (15, 18), (3, 10), (5, 6)]
Perimeter distance:  53.1991493831``````

Solution: the length of the wire we need to fence off the sheep is: 53.20.

### Run 2

Check if the convex hull implementation is correct by taking as input a list of 1000 random points:

Conclusion: yes it's correct, all points fall within or on the convex hull.

### Run 3

And finally, let's take as input a list of 1000 random points from a normal distribution:

### Alternative implementation

Here is a different implementation of the core Graham scan, the same as implemented by Sedgewick in algs4.cs.princeton.edu/99hull/GrahamScan.java.html.

Because of the stack operations I find this implementation more difficult to understand, but its big advantage is that it doesn't need to copy the whole points list.

 ```1 2 3 4 5 6 7 8 9 ``` ``````hulli=[] hulli.append(0) hulli.append(1) for i in range(2,len(points)): top=hulli.pop() while clockwise( points[hulli[-1]], points[top], points[i])<=0: top=hulli.pop() hulli.append(top) hulli.append(i)``````
05_source
20161206
 ```1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 ``` ``````#!/usr/bin/python import math import matplotlib.pyplot as plt import copy import random # only needed for random points import numpy as np # only needed for random points from normal distrib # convenience functions to get the x,y value from a tuple, # and avoid using the wrong tuple index # (sacrifice a bit of performance for legibility) def x(tp): return tp[0] def y(tp): return tp[1] # calculate the angle for a line between two points and the horizontal def calc_angle(pt1, pt2): dy=float( y(pt2) - y(pt1) ) dx=float( x(pt2) - x(pt1) ) angle=0.0 if dx!=0: angle=math.atan(dy/dx) else: if dy==0: angle=0.0 elif dy<0: angle=math.atan(float('-inf')) else: angle=math.atan(float('inf')) # if the calucated angle is negative, make it positive while (angle<0): angle=angle+2*math.pi; return angle # calculate the distance between two points def dist(pt1,pt2): dx=float( x(pt2) - x(pt1) ) dy=float( y(pt2) - y(pt1) ) return math.sqrt( dx*dx + dy*dy) # does moving from point pts0 -> pt1 -> pt2 describe a counterclockwise motion ? def clockwise(pt1, pt2, pt3): area = (x(pt2)-x(pt1))*(y(pt3)-y(pt1))-(y(pt2)-y(pt1))*(x(pt3)-x(pt1)) rv=0 # collinear if area<0: rv=-1 # clockwise elif area>0: rv=+1 # counter clockwise return rv ## MAIN ---------------------------------------------------------------------- # --- DATA ------------ points= [ (9,2), (5,6), (3,10), (11,13), (7,6), (15,18), (19,16), (13,4), (23,8), (17,12) ] # variation: random points #points=[] #for i in range(1,1000): points.append( (random.randint(1,100), random.randint(1,100)) ) # variation: sedgewick's points (approx) #points=[] # points= [(373, 234), (312, 235), (313, 128), (165, 164), (185, 91), (190, 34), # (452, 249), (404, 109), (237, 297), (262, 130), (124, 127), (334, 29), (459, 2)] # variation : random points, but normally distributed #points=list(zip( # [ int(r) for r in np.random.normal(100, 20, 1000)], # [ int(r) for r in np.random.normal(100, 20, 1000)] # ) ) # --------------------- # identify the point with the lowest y coordinate and put it first in the list of points # special case: in case of tie, decide by lowest x coordinate points=sorted( points, key=lambda a: a[0]) # secondary sort on x points=sorted( points, key=lambda a: a[1]) # primary sort on y (python sort is stable) # enrich the tuple list with the angle between the line through (point, first point) # and the horizontal points = [ (x(pt),y(pt),calc_angle(points[0],pt)) for pt in points ] # order the points by angle points=sorted(points, key=lambda a: a[2]) # after sorting we don't need the angles anymore, let's drop them points = [ (x(pt),y(pt)) for pt in points ] # the core of the graham scan: # iterate over the points in ascending angle order # check if pt1-> pt2 -> pt3 makes a counterclockwise turn # if NOT, then discard pt2 hull=copy.deepcopy(points) # deep copy, because we are going to delete points i=0 while True: if clockwise(hull[i],hull[i+1],hull[i+2])<=0: # clockwise or collinear del hull[i+1] i-=1 # take a step back else: i+=1 # step forward if (i+2)>=len(hull): # are we done? break print "Points on the hull" ,hull # to close the polygon, (for drawing, and calculating distance) append the first point hull.append(hull[0]) print "Perimeter distance: ", sum( [ dist( hull[i],hull[i+1] ) for i in range(len(hull)-1) ] ) # plotting the points and the hull plt.scatter(*zip(*points),s=20) # plot all the points plt.plot(*zip(*hull),linestyle='-') # plot the hull plt.savefig("convex_hull.png")``````

Notes by Data Munging Ninja. Generated on nini:sync/20151223_datamungingninja/convexhull at 2016-12-26 08:22