Australian Address Matcher to Regions

Project description

Introduction

Addrmatcher is an open-source Python package that matches input address strings to the most similar street addresses, and input geographic coordinates to the nearest street addresses. The result includes not only the matched address but also the regions the address belongs to, at each level defined for the country. For Australia, these are the government administrative regions, statistical areas and suburb.

The Addrmatcher library is built to work with rapidfuzz, scikit-learn, pandas and numpy, and provides user-friendly output. It supports Python 3.6 and above, runs on all popular operating systems, is quick to install, and is free of charge.

In this initial release, the input data and matching capability are limited to Australian addresses. There is scope to extend matching beyond Australia in future.

The package offers two matching capabilities:

  • address-based matching accepts an address string as its argument.
  • coordinate-based matching takes a geographic coordinate (latitude and longitude) as input.
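
To make the two modes concrete, here is a minimal, standard-library sketch of the underlying ideas. It is not Addrmatcher's implementation: the reference rows are made-up toy data, `match_by_address` and `match_by_coordinates` are hypothetical names, `difflib.SequenceMatcher` stands in for a fuzzy-matching library such as rapidfuzz, and the nearest-point search is a brute-force scan in degrees.

```python
from difflib import SequenceMatcher
from math import hypot

# Toy reference table (invented for illustration): (full address, latitude, longitude).
REFERENCE = [
    ("1 EXAMPLE STREET SAMPLETOWN NSW 2000", -33.0, 151.0),
    ("2 EXAMPLE STREET SAMPLETOWN NSW 2000", -33.001, 151.0),
    ("10 DEMO ROAD TESTVILLE VIC 3000", -37.0, 145.0),
]


def match_by_address(query):
    """Return (best reference row, similarity score on a 0-100 scale)."""
    # Normalise the query the same way the toy reference rows are written.
    cleaned = query.upper().replace(",", "")

    def score(row):
        return SequenceMatcher(None, cleaned, row[0]).ratio() * 100

    best = max(REFERENCE, key=score)
    return best, score(best)


def match_by_coordinates(lat, lon):
    """Return the reference row closest to (lat, lon), compared in plain degrees."""
    return min(REFERENCE, key=lambda row: hypot(row[1] - lat, row[2] - lon))


best_row, ratio = match_by_address("10, Demo Road, Testville, VIC 3000")
nearest_row = match_by_coordinates(-33.0009, 151.0)
print(best_row[0], ratio)
print(nearest_row[0])
```

A linear scan like this is only practical for a handful of rows. A national address file would call for an indexed search, which is presumably where the listed dependencies come in.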

The development team reports a matching time of less than one second for each address and for each coordinate pair.
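
To check the timing on your own machine, a small helper like the one below could wrap any call. `time_call` is a hypothetical helper, not part of Addrmatcher, and the workload shown is a stand-in so that the snippet runs without the package or its data.

```python
import time


def time_call(func, *args, **kwargs):
    """Run func once and return (result, elapsed time in seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start


# Stand-in workload. With the package installed and the data downloaded,
# a matcher method could be passed here instead of `sorted`.
result, elapsed = time_call(sorted, [3, 1, 2])
print(f"{elapsed:.6f} s")
```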

The reference dataset for Australian addresses is built from GNAF (Geocoded National Address File) and ASGS (Australian Statistical Geography Standard). After installing the package, users need to download the optimised reference dataset into their working directory.

Installation

pip install addrmatcher

Data Download

Once the package has been installed, the reference dataset must be downloaded into the project's current working directory before the package's matching functions can be used.

From the command line, run:

addrmatcher-data aus

The console script above downloads the dataset, which is currently hosted on GitHub, into the user's directory. The default country is Australia, so Australian physical addresses are downloaded. After the command completes, the 37 parquet files are stored in a country directory, for example /data/Australia/*.parquet.
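
A quick way to confirm the download is to count the parquet files on disk. The sketch below assumes the files sit in `data/Australia` relative to the working directory, based on the example path above; adjust the path if your layout differs. `list_parquet_files` is a hypothetical helper, not part of the package.

```python
from pathlib import Path


def list_parquet_files(country_dir):
    """Return the sorted .parquet files directly inside country_dir ([] if it is missing)."""
    folder = Path(country_dir)
    if not folder.is_dir():
        return []
    return sorted(folder.glob("*.parquet"))


# Assumed location, relative to the current working directory.
files = list_parquet_files(Path("data") / "Australia")
print(f"Found {len(files)} parquet files")
```

If the count is lower than expected, re-running the download command is a reasonable first step.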

Import the package and classes

# Import the installed package
from addrmatcher import AUS, GeoMatcher

# Initialise the geo region as AUS
matcher = GeoMatcher(AUS)

Example - Address-based Matching

matched_address = matcher.get_region_by_address("9121, George Street, North Strathfield, NSW 2137")
print(matched_address)

>{'SA4_NAME_2016': ['Sydney - Inner West'],
 'LGA_NAME_2016': ['Canada Bay (A)'],
 'SA3_NAME_2016': ['Canada Bay'],
 'RATIO': [100.0],
 'STATE': ['NSW'],
 'FULL_ADDRESS': ['9121 GEORGE STREET NORTH STRATHFIELD NSW 2137'],
 'SA2_NAME_2016': ['Concord West - North Strathfield'],
 'SSC_NAME_2016': ['North Strathfield'],
 'MB_CODE_2016': ['11205258900'],
 'SA1_7DIGITCODE_2016': ['1138404']}
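
In the output above, every value is a list. When only the first entry of each list is needed, the result can be collapsed into a flat record. The sketch below copies four keys from that output so it runs without the package; `first_match` is a hypothetical helper, not part of Addrmatcher.

```python
# Values copied from the address-based example output (abbreviated to four keys).
matched_address = {
    "FULL_ADDRESS": ["9121 GEORGE STREET NORTH STRATHFIELD NSW 2137"],
    "STATE": ["NSW"],
    "SSC_NAME_2016": ["North Strathfield"],
    "RATIO": [100.0],
}


def first_match(result):
    """Collapse a dict of lists into a flat dict holding the first element of each list."""
    return {key: (values[0] if values else None) for key, values in result.items()}


record = first_match(matched_address)
print(record["SSC_NAME_2016"], record["RATIO"])
```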

Example - Coordinate-based Matching

nearest_address = matcher.get_region_by_coordinates(-29.1789874, 152.628291)
print(nearest_address)

>{'IDX': [129736],
 'FULL_ADDRESS': ['3 7679 CLARENCE WAY MALABUGILMAH NSW 2460'],
 'LATITUDE': [-29.17898685],
 'LONGITUDE': [152.62829132],
 'LGA_NAME_2016': ['Clarence Valley (A)'],
 'SSC_NAME_2016': ['Baryulgil'],
 'SA4_NAME_2016': ['Coffs Harbour - Grafton'],
 'SA3_NAME_2016': ['Clarence Valley'],
 'SA2_NAME_2016': ['Grafton Region'],
 'SA1_7DIGITCODE_2016': ['1108103'],
 'MB_CODE_2016': ['11205732700'],
 'STREET_NAME': ['CLARENCE'],
 'STREET_TYPE_CODE': ['WAY'],
 'LOCALITY_NAME': ['MALABUGILMAH'],
 'STATE': ['NSW'],
 'POSTCODE': ['2460'],
 'ADDRESS_DETAIL_PID': ['GANSW706638188'],
 'FILE_NAME': ['NSW-10.parquet'],
 'DISTANCE': [6.859565028181215e-05]}
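
The output does not state the unit of DISTANCE. One way to sanity-check it is to recompute the great-circle (haversine) distance between the query point and the returned LATITUDE and LONGITUDE. The sketch below does this with the standard library. The Earth radius of 6371 km is an assumption, and `haversine_km` is a hypothetical helper, not the package's own distance function.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption, not taken from the package


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


query = (-29.1789874, 152.628291)      # coordinates passed to the matcher
match = (-29.17898685, 152.62829132)   # LATITUDE / LONGITUDE from the output
distance_km = haversine_km(*query, *match)
print(f"{distance_km:.3e} km")
```

The recomputed value is about 6.86e-05 km, or roughly 7 cm. That is close to the reported DISTANCE, which suggests, but does not confirm, that the field is a great-circle distance in kilometres.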
