The Python Book
geocode
20161004
Incrementally update geocoded data

Starting point: we have a sqlite3 database with (dirty) city names and country codes. We would like to have the lat/lon coordinates for each place. Here is some sample data, scraped from the Brussels marathon results webpage, indicating the town of residence and nationality of each athlete:
But sometimes the town and country don't match up, eg:
For the geocoding we make use of nominatim.openstreetmap.org. The 3-letter country code also needs to be translated into a country name, for which we use a lookup file. For the places where we don't get valid (lat,lon) coordinates we store (0,0). We run the script multiple times: small batches in the beginning, to be able to see which exceptions occur, and bigger batches at the end (when the problems have been solved). In between runs the data may be manually modified by opening the sqlite3 database and updating or deleting rows in the t_geocode table.
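The incremental approach described above can be sketched roughly as follows. This is not the original script, only a minimal illustration under assumptions: the source table name `t_place`, the column names (`town`, `country_code`, `lat`, `lon`), the `country_names` dictionary and the function names are all hypothetical. Only the `t_geocode` table name and the (0,0) convention come from the text. Each run picks up places that have no row in `t_geocode` yet, so deleting a row by hand makes that place eligible again on the next run.

```python
import json
import sqlite3
import time
import urllib.parse
import urllib.request

NOMINATIM_URL = "https://nominatim.openstreetmap.org/search"


def nominatim_lookup(town, country):
    """Ask Nominatim for one place; return (lat, lon) or None if nothing is found."""
    query = urllib.parse.urlencode(
        {"q": f"{town}, {country}", "format": "json", "limit": 1}
    )
    request = urllib.request.Request(
        f"{NOMINATIM_URL}?{query}",
        # The User-Agent value is a placeholder; use one that identifies your application.
        headers={"User-Agent": "geocode-sketch/0.1"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        hits = json.load(response)
    if not hits:
        return None
    return float(hits[0]["lat"]), float(hits[0]["lon"])


def geocode_batch(conn, country_names, lookup=nominatim_lookup, batch_size=10, pause=1.0):
    """Geocode up to batch_size places that are not yet in t_geocode.

    t_place and all column names are assumptions made for this sketch.
    Returns the number of places processed in this run.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS t_geocode ("
        "town TEXT, country_code TEXT, lat REAL, lon REAL, "
        "PRIMARY KEY (town, country_code))"
    )
    todo = conn.execute(
        """SELECT DISTINCT p.town, p.country_code
           FROM t_place p
           LEFT JOIN t_geocode g
             ON g.town = p.town AND g.country_code = p.country_code
           WHERE g.town IS NULL AND p.town IS NOT NULL
           LIMIT ?""",
        (batch_size,),
    ).fetchall()
    for town, code in todo:
        # Translate the 3-letter code; fall back to the code itself if unknown.
        country = country_names.get(code, code)
        try:
            coords = lookup(town, country)
        except Exception as exc:  # show the problem, then continue with the batch
            print(f"lookup failed for {town!r} ({code}): {exc}")
            coords = None
        lat, lon = coords or (0.0, 0.0)  # (0,0) marks "no valid coordinates"
        conn.execute(
            "INSERT INTO t_geocode VALUES (?, ?, ?, ?)", (town, code, lat, lon)
        )
        conn.commit()  # commit per place so an interrupted run loses nothing
        time.sleep(pause)  # stay well within the public server's rate limit
    return len(todo)
```

A run would then look like `geocode_batch(sqlite3.connect("marathon.db"), {"BEL": "Belgium"}, batch_size=5)`, where the database file name is again only an example. Passing the lookup function as a parameter keeps the batching logic testable without network access.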
A few queries

Total number of places:
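A minimal sketch of such a count, assuming the geocoded places live one per row in the `t_geocode` table (the helper name `count_places` is made up for this example):

```python
import sqlite3


def count_places(conn):
    """Return the number of rows (places) in t_geocode."""
    return conn.execute("SELECT COUNT(*) FROM t_geocode").fetchone()[0]
```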
Number of places for which we didn't find valid coordinates:
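Since failed lookups are stored as (0,0), this count can be sketched as a filter on both coordinates being zero. The column names `lat` and `lon` and the helper name `count_missing` are assumptions for this example:

```python
import sqlite3


def count_missing(conn):
    """Return the number of places in t_geocode stored with the (0,0) marker."""
    return conn.execute(
        "SELECT COUNT(*) FROM t_geocode WHERE lat = 0 AND lon = 0"
    ).fetchone()[0]
```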