Open source geocoding in Python
Whereabouts
Fast, scalable geocoding for Python using DuckDB. The geocoding algorithms are based on the following papers:
Description
Geocode addresses and reverse geocode coordinates directly from Python in your own environment.
- No additional database setup required. Uses DuckDB to run all queries
- No need to send data to an external geocoding API
- Fast (geocodes 1000s of addresses per second and reverse geocodes 200,000s of coordinates per second)
- Robust to typographical errors
Requirements
- Python 3.8+
- requirements.txt (found in repo)
Installation: via pip
whereabouts can be installed using pip, uv or conda. For example, with pip:
pip install whereabouts
Download a geocoder database or create your own
You will need a geocoding database to match addresses against. You can either download a pre-built database or create your own using a dataset of high quality reference addresses for a given country, state or other geographic region.
Option 1: Download a geocoder database
Pre-built geocoding databases are available from Hugging Face. The list of available databases can be found here
As an example, to install the small size geocoder database for all of Australia:
python -m whereabouts download au_all_sm
Geocoding examples
Geocode a list of addresses
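from whereabouts.Matcher import Matcher

# Load the pre-built database downloaded above
matcher = Matcher(db_name='au_all_sm')
# addresslist is a list of address strings to geocode
matcher.geocode(addresslist, how='standard')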
For more accurate geocoding, you can use trigram phrases rather than token phrases. Note that trigram geocoding requires one of the large databases.
matcher.geocode(addresslist, how='trigram')
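To give a feel for why trigram matching can tolerate typos better than whole-token matching, here is a minimal sketch. It is not the whereabouts implementation: the helper names `tokens`, `char_trigrams` and `jaccard` and the example addresses are hypothetical, and Jaccard overlap is used only as a simple illustrative similarity measure.

```python
def tokens(text):
    """Split an address into a set of lowercase whole-word tokens."""
    return set(text.lower().split())


def char_trigrams(text):
    """Return the set of overlapping 3-character substrings of an address."""
    cleaned = " ".join(text.lower().split())  # normalise case and whitespace
    return {cleaned[i:i + 3] for i in range(len(cleaned) - 2)}


def jaccard(a, b):
    """Jaccard similarity: size of the intersection over size of the union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Made-up addresses for illustration only
reference = "12 smith street carlton"
typo = "12 smiht stret carlton"  # two misspelled words

token_score = jaccard(tokens(reference), tokens(typo))
trigram_score = jaccard(char_trigrams(reference), char_trigrams(typo))

print(f"token similarity:   {token_score:.2f}")
print(f"trigram similarity: {trigram_score:.2f}")
```

In this example the two misspelled words no longer match as whole tokens, while most of their character trigrams still overlap, so the trigram score comes out higher than the token score.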
Option 2: Create a geocoder database
Rather than using a pre-built database, you can create your own geocoder database from your own address file. This file should be a single CSV or Parquet file with the following columns:
Column name | Description | Data type |
---|---|---|
ADDRESS_DETAIL_PID | Unique identifier for address | int |
ADDRESS_LABEL | The full address | str |
ADDRESS_SITE_NAME | Name of the site. This is usually null | str |
LOCALITY_NAME | Name of the suburb or locality | str |
POSTCODE | Postcode of address | int |
STATE | State | str |
LATITUDE | Latitude of geocoded address | float |
LONGITUDE | Longitude of geocoded address | float |
These fields should be specified in a setup.yml file. Once the setup.yml is created and a reference dataset is available, the geocoding database can be created:
python -m whereabouts setup_geocoder setup.yml
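Before building the database, it may be worth checking that the reference file has the columns listed in the table above. The sketch below does this for a CSV file using only the standard library. The function name `missing_columns` and the in-memory sample are hypothetical and not part of whereabouts.

```python
import csv
import io

# Column names taken from the table above
REQUIRED_COLUMNS = [
    "ADDRESS_DETAIL_PID",
    "ADDRESS_LABEL",
    "ADDRESS_SITE_NAME",
    "LOCALITY_NAME",
    "POSTCODE",
    "STATE",
    "LATITUDE",
    "LONGITUDE",
]


def missing_columns(csv_file):
    """Return the required columns absent from the header of an open CSV file."""
    reader = csv.reader(csv_file)
    header = next(reader, [])  # empty file -> empty header
    present = {name.strip() for name in header}
    return [name for name in REQUIRED_COLUMNS if name not in present]


# Hypothetical sample with a made-up address row; LONGITUDE is deliberately left out
sample = io.StringIO(
    "ADDRESS_DETAIL_PID,ADDRESS_LABEL,ADDRESS_SITE_NAME,LOCALITY_NAME,"
    "POSTCODE,STATE,LATITUDE\n"
    "1,1 EXAMPLE ST EXAMPLETOWN VIC 3000,,EXAMPLETOWN,3000,VIC,-37.0\n"
)
print(missing_columns(sample))
```

For a file on disk, the same function can be applied to a handle opened with `open(path, newline="")`, where `path` is a placeholder for your own address file.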