# polypoint
A Python 3 package for classifying geolocation data.

## Installation
```
pip3 install polypoint
```

## Example Use
Unless you provide a correctly formatted XML sheet (see states.xml for an example), polypoint defaults to using the 50 US states as its polygons.
```python
from polypoint import polypoint

classifier = polypoint.PolygonClassifier()

# point_list contains (latitude, longitude) tuples of the form:
# [(39.962245, -83.000647), (42.659829, -73.781339), ... ]
point_list = [(39.962245, -83.000647), (42.659829, -73.781339)]

# returns a list of polygon indices of the form:
# [34, 31, ... ]
results = classifier.match_points_to_polygon(point_list)

# returns a list of polygon names of the form:
# ['Ohio', 'New York', ... ]
results_string = classifier.get_polygon_names(results)
```

## Public Interface

`classifier.match_points_to_polygon(point_list)` accepts a list of coordinates to classify and returns a NumPy array of integers, each an index into `classifier.conf.name_list`. These integers can be converted to polygon names with the `classifier.get_polygon_names(result_list)` method.

`classifier.get_polygon_names(result_list)` accepts a list of integers and returns a list of the corresponding strings from `classifier.conf.name_list`.
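The index-to-name conversion can be pictured as plain list indexing. The sketch below is not polypoint's implementation: `name_list` is a hypothetical three-entry stand-in for `classifier.conf.name_list`, and `indices_to_names` is an illustrative helper name.

```python
# Hypothetical stand-in for classifier.conf.name_list (the real list is longer).
name_list = ["Georgia", "New York", "Texas"]


def indices_to_names(result_list, names):
    """Map each integer index to the name stored at that position."""
    # int() also accepts NumPy integer types, such as those in a result array.
    return [names[int(i)] for i in result_list]


print(indices_to_names([2, 0, 1], name_list))
```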

## Initialization and Configuration

Each `PolygonClassifier` instance holds a `Configuration` object as an instance variable, accessible as `classifier.conf`.

A classifier's configuration can be initialized in two ways.

`classifier.initialize()` performs a full initialization of the classifier configuration; see the `Configuration` class for details. Some of the more important options are `parallel_enabled`, `num_cpu`, `parallel_limit`, and `trimmed_polygons`.

`classifier.initialize_min()` performs a minimal initialization of the classifier configuration, with no optimization and no parallel processing. This is useful if you only work with small lists at a time.

`classifier.conf.parallel_enabled` is a boolean that will enable or disable parallel computation on the input.

`classifier.conf.num_cpu` is an integer giving the number of cores your processor has. It controls how many processes are started to perform computation on the input. If left unconfigured, it defaults to the number of cores detected on your machine. (Note that the detected count may be the number of virtual cores.)
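To see what a detected core count looks like on your machine, you can query the standard library. This is only a sketch: how polypoint itself detects cores is not stated here, and the halving step is a rough assumption that applies only to processors with two hardware threads per core.

```python
import os

# os.cpu_count() reports logical CPUs, so hardware threads are counted
# individually; it can return None when the count cannot be determined.
logical_cpus = os.cpu_count() or 1
print(f"Logical CPUs detected: {logical_cpus}")

# Assumption: two hardware threads per core. This is a guess, not a
# measurement of the physical core count.
suggested_num_cpu = max(1, logical_cpus // 2)
print(f"Possible value for num_cpu: {suggested_num_cpu}")
```

A value chosen this way could then be assigned to `classifier.conf.num_cpu`.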

`classifier.conf.parallel_limit` is an integer lower limit on the size of the input that triggers parallel computation. For example, if it is set to 1000, the input list must contain more than 1000 items to start parallel computation; lists of 1000 items or fewer are processed sequentially. It defaults to 0, meaning that if `parallel_enabled` is set to `True`, all input uses multiple processes. If you repeatedly pass in small lists, it may be worth disabling parallel computation, or measuring the list size at which multiprocessing becomes faster and setting that value here.
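The threshold rule described above can be written as a single comparison. The function below is an illustrative sketch of that rule, not code from polypoint; the name `uses_parallel` is hypothetical.

```python
def uses_parallel(input_size, parallel_enabled, parallel_limit=0):
    """Return True when an input of this size would be processed in parallel."""
    # Parallel computation needs the switch on AND strictly more items
    # than the limit.
    return bool(parallel_enabled and input_size > parallel_limit)


print(uses_parallel(1001, True, 1000))   # above the limit
print(uses_parallel(1000, True, 1000))   # at the limit, so sequential
print(uses_parallel(5000, False, 1000))  # parallelism disabled
```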

`classifier.conf.trimmed_polygons` is a list of polygons, one for each classifying polygon. For example, if the 50 US states are being used, this list holds 50 corresponding polygons. Each trimmed polygon is fully contained by its original polygon and is used to speed up the search. See below for examples. Blue is the original polygon and red is the trimmed polygon.

![Georgia with trimmed interior polygon]( "Georgia with trimmed interior polygon")

![New York with trimmed interior polygon]( "New York with trimmed interior polygon")

![Texas with trimmed interior polygon]( "Texas with trimmed interior polygon")
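How polypoint uses the trimmed polygons internally is not described here. One plausible reading is sketched below, under the assumption that a trimmed polygon has fewer vertices and is tested first: because it lies inside the original, a hit on it settles the answer without testing the original outline. The ray-casting test, the function names, and the toy coordinates are all illustrative and are not taken from the package.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test: count the edges crossed by a ray going right."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def contains(point, original, trimmed):
    """Check the trimmed polygon first, then fall back to the original."""
    # A point inside the trimmed polygon is necessarily inside the original.
    if point_in_polygon(point, trimmed):
        return True
    return point_in_polygon(point, original)


# Toy shapes in abstract (x, y) coordinates: a hexagon and a square inside it.
original = [(0, 0), (4, 0), (5, 2), (4, 4), (0, 4), (-1, 2)]
trimmed = [(1, 1), (3, 1), (3, 3), (1, 3)]

print(contains((2, 2), original, trimmed))      # settled by the trimmed polygon
print(contains((4.5, 2.5), original, trimmed))  # needs the original outline
print(contains((6, 2.5), original, trimmed))    # outside both
```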

### MIT License
