Open source geocoding in Python

Whereabouts

Fast, scalable geocoding for Python using DuckDB. The geocoding algorithms are based on the following papers:

Description

Geocode addresses and reverse geocode coordinates directly from Python in your own environment.

  • No additional database setup required. Uses DuckDB to run all queries
  • No need to send data to an external geocoding API
  • Fast (geocodes thousands of addresses per second and reverse geocodes hundreds of thousands of coordinates per second)
  • Robust to typographical errors

Requirements

  • Python 3.8+
  • Poetry (for package management)

Installation

Once Poetry is installed and you are in the project directory:

poetry shell
poetry install

Create a geocoder database

Before you can geocode, you need to create a geocoding database from a reference dataset that contains addresses and their corresponding latitude and longitude values.

The reference file should be a single CSV file with at least three fields: the complete address, latitude, and longitude. Specify these fields in a setup.yml file; an example is included.

Once setup.yml is written and a reference dataset is available, create the geocoding database with the setup_geocoder function from whereabouts.utils.

The current process for using Australian data from the GNAF is as follows:

  1. Download the latest version of GNAF core from https://geoscape.com.au/data/g-naf-core/
  2. Update the setup.yml file to point to the location of the GNAF core file
  3. Finally, set up the geocoder. This creates the required reference tables:
python -m whereabouts setup_geocoder setup.yml

To use address data from another country, the file should have the following columns:

Column name          Description
ADDRESS_DETAIL_PID   Unique identifier for address
ADDRESS_LABEL        The full address
ADDRESS_SITE_NAME    Name of the site. This is usually null
LOCALITY_NAME        Name of the suburb or locality
POSTCODE             Postcode of address
STATE                State
LATITUDE             Latitude of geocoded address
LONGITUDE            Longitude of geocoded address
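
Before running the setup, it can help to confirm that a reference file actually carries the columns listed above. The following is a minimal sketch using only the Python standard library; the helper name missing_columns is hypothetical and is not part of whereabouts, and the check covers the header row only.

```python
import csv
import io

# Column names taken from the table above.
REQUIRED_COLUMNS = [
    "ADDRESS_DETAIL_PID",
    "ADDRESS_LABEL",
    "ADDRESS_SITE_NAME",
    "LOCALITY_NAME",
    "POSTCODE",
    "STATE",
    "LATITUDE",
    "LONGITUDE",
]


def missing_columns(csv_file):
    """Return the required columns absent from the CSV header (hypothetical helper)."""
    reader = csv.DictReader(csv_file)
    header = reader.fieldnames or []
    return [name for name in REQUIRED_COLUMNS if name not in header]


# Illustrative in-memory file that has only three of the eight columns.
sample = io.StringIO("ADDRESS_LABEL,LATITUDE,LONGITUDE\n")
print(missing_columns(sample))
```

For a file on disk, pass an open file handle instead, for example open(path, newline='').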

Examples

Geocode a list of addresses

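from whereabouts.Matcher import Matcher

# db_name is the name of the geocoder database created during setup
matcher = Matcher(db_name='gnaf_au')
# addresslist is a list of address strings to geocode
matcher.geocode(addresslist, how='standard')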

For more accurate geocoding, you can use trigram phrases rather than token phrases. Note that the trigram option must have been specified in the setup.yml file when the geocoder database was created.

matcher.geocode(addresslist, how='trigram')
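
To give some intuition for why trigram-based matching tends to tolerate typos better than token-based matching, here is a generic sketch. It is not the algorithm whereabouts uses internally: the functions tokens, trigrams and jaccard are hypothetical helpers, and Jaccard similarity is used only as a simple stand-in for a matching score.

```python
def tokens(text):
    """Split an address into a set of whitespace-separated, upper-cased tokens."""
    return set(text.upper().split())


def trigrams(text):
    """Return the set of overlapping three-character substrings of an address."""
    normalised = " ".join(text.upper().split())
    return {normalised[i:i + 3] for i in range(len(normalised) - 2)}


def jaccard(a, b):
    """Size of the intersection divided by size of the union of two sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


reference = "10 SMITH STREET RICHMOND"   # illustrative address
query = "10 SMTIH STREET RICHMOND"       # same address with two letters swapped

token_score = jaccard(tokens(reference), tokens(query))
trigram_score = jaccard(trigrams(reference), trigrams(query))
print(token_score, trigram_score)
```

A single typo invalidates the whole token it falls in, but only the few trigrams that overlap the mistyped characters, so the trigram score degrades more gently.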

Creating a Matcher object also builds the KD-tree used for fast reverse geocoding. A list of latitude, longitude values can then be reverse geocoded as follows:

matcher.reverse_geocode(coordinates)
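
The idea behind KD-tree reverse geocoding can be sketched in pure Python as a nearest-neighbour lookup. This is an illustration under simplifying assumptions, not the whereabouts implementation: the addresses and coordinates are made up, reverse_geocode_sketch is a hypothetical name, and distance is measured as plain Euclidean distance in degrees rather than true distance on the Earth's surface.

```python
class Node:
    """One KD-tree node holding a point, its row index and the split axis."""

    def __init__(self, point, index, axis, left, right):
        self.point = point
        self.index = index
        self.axis = axis
        self.left = left
        self.right = right


def build(items, depth=0):
    """Recursively split (point, index) pairs at the median on alternating axes."""
    if not items:
        return None
    axis = depth % 2  # 0 = latitude, 1 = longitude
    items = sorted(items, key=lambda item: item[0][axis])
    mid = len(items) // 2
    point, index = items[mid]
    return Node(point, index, axis,
                build(items[:mid], depth + 1),
                build(items[mid + 1:], depth + 1))


def sq_dist(a, b):
    """Squared Euclidean distance between two (lat, lon) pairs, in degrees."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def nearest(node, target, best=None):
    """Return (squared distance, index) of the stored point closest to target."""
    if node is None:
        return best
    d = sq_dist(node.point, target)
    if best is None or d < best[0]:
        best = (d, node.index)
    diff = target[node.axis] - node.point[node.axis]
    near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
    best = nearest(near, target, best)
    # Only search the far side if the splitting plane is closer than the best match.
    if diff ** 2 < best[0]:
        best = nearest(far, target, best)
    return best


# Made-up reference addresses with approximate city-centre coordinates.
reference = [
    ("1 EXAMPLE ST, SYDNEY NSW 2000", (-33.8688, 151.2093)),
    ("2 SAMPLE RD, MELBOURNE VIC 3000", (-37.8136, 144.9631)),
    ("3 DEMO AVE, BRISBANE QLD 4000", (-27.4698, 153.0251)),
]
tree = build([(coords, i) for i, (_, coords) in enumerate(reference)])


def reverse_geocode_sketch(coordinates):
    """Map each (lat, lon) pair to the label of its nearest reference address."""
    return [reference[nearest(tree, c)[1]][0] for c in coordinates]


print(reverse_geocode_sketch([(-37.8, 144.9), (-28.0, 153.0)]))
```

Because the tree is built once and each query visits only a small part of it, lookups stay fast even when the reference set is large.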
