An open-source tool for linking free-text addresses to UPRN

FLAP

FLAP is an open-source tool for linking free-text addresses to Ordnance Survey Unique Property Reference Numbers (OS UPRNs). To use FLAP, you need an OS UPRN licence and a downloaded copy of the AddressBase Premium product. FLAP can be used at scale with a few lines of code.

Quick start of FLAP tool

Installation

We recommend creating a virtual environment with venv:

python3 -m venv [YOUR_PATH]/flap_lite
source [YOUR_PATH]/flap_lite/bin/activate

Install with pip:

pip install --upgrade flap-lite

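For now, please contact the developer to obtain the trained model. Copy the model file into the package's model directory, [YOUR_PATH]/flap_lite/lib/python3.9/site-packages/flap/model/ (replace python3.9 with the Python version of your virtual environment):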

cp [PATH_TO_MODEL_FILE] [YOUR_PATH]/flap_lite/lib/python3.9/site-packages/flap/model/
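Because the site-packages path depends on the Python version and platform, it can help to look the directory up at run time rather than typing it by hand. The following is a minimal sketch, not part of FLAP: the helper name model_dir_for is hypothetical, and the "model" subdirectory is taken from the path shown above.

```python
import importlib.util
from pathlib import Path


def model_dir_for(package_name: str, subdir: str = "model") -> Path:
    """Return <installed package directory>/<subdir> for an importable package."""
    spec = importlib.util.find_spec(package_name)
    if spec is None or spec.origin is None:
        raise ModuleNotFoundError(f"{package_name!r} was not found in this environment")
    # spec.origin points at the package's __init__.py; its parent is the package directory.
    return Path(spec.origin).parent / subdir


# Example (assumes flap-lite is installed in the active environment):
#   print(model_dir_for("flap"))
```

The printed path can then be used as the destination of the cp command.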

Quick Start

Building the database

Use flap.create_database to build the database:

from flap import create_database

create_database(db_path='[PATH_FOR_THE_DB]', raw_db_file='[PATH_TO_DB_ZIP]')

Matching

Use flap.match to match addresses against the database:

from flap import match

input_csv = '[PATH_TO_INPUT_CSV_FILE]'
db_path = '[PATH_TO_THE_DB]'

results = match(
    input_csv=input_csv,
    db_path=db_path
    )

Matching results are saved to output.csv in the current working directory by default. By default, FLAP uses all available CPUs and processes the addresses in batches of 10,000.

Some useful options are:

  • batch_size: the number of addresses in each batch
  • max_workers: the number of CPU cores to use
  • in_memory_db: whether to use an in-memory SQLite database
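As an illustration, these options could be collected and passed to flap.match as keyword arguments. This is only a sketch: the helper build_match_kwargs is hypothetical, and the value types (an integer batch size, an integer worker count, a boolean flag) are assumptions that the text above does not confirm.

```python
def build_match_kwargs(input_csv, db_path, batch_size=None, max_workers=None, in_memory_db=None):
    """Collect keyword arguments for flap.match, omitting options left unset
    so that FLAP's own defaults apply to them."""
    if batch_size is not None and batch_size <= 0:
        raise ValueError("batch_size must be a positive integer")
    if max_workers is not None and max_workers <= 0:
        raise ValueError("max_workers must be a positive integer")
    kwargs = {"input_csv": input_csv, "db_path": db_path}
    optional = {
        "batch_size": batch_size,      # assumed: int
        "max_workers": max_workers,    # assumed: int
        "in_memory_db": in_memory_db,  # assumed: bool
    }
    kwargs.update({name: value for name, value in optional.items() if value is not None})
    return kwargs


# Example (requires flap-lite and a built database):
#   from flap import match
#   results = match(**build_match_kwargs("addresses.csv", "flap_db", batch_size=5000, max_workers=4))
```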

How does it work?

Briefly, FLAP parses the structured parts of each address (e.g. the postcode "AB12 3CD") and all the deterministic parts (e.g. numbers such as "111" and single letters such as "A").
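The parsing step can be pictured with a small regular-expression sketch. This is not FLAP's actual parser: the patterns, the function name parse_address and the returned field names are all assumptions made for illustration, and the postcode pattern is deliberately simplified.

```python
import re

# Simplified UK postcode pattern: outward code, optional space, inward code.
POSTCODE_RE = re.compile(r"\b([A-Z]{1,2}\d[A-Z\d]?)\s*(\d[A-Z]{2})\b")
# A number, optionally followed by a single-letter suffix (e.g. "111" or "111A").
NUMBER_RE = re.compile(r"\b(\d+)([A-Z])?\b")


def parse_address(address):
    """Split a free-text address into postcode, numbers, single letters and words."""
    text = address.upper()
    postcode = None
    match = POSTCODE_RE.search(text)
    if match:
        postcode = f"{match.group(1)} {match.group(2)}"
        # Remove the postcode so its digits are not counted as house numbers.
        text = text[:match.start()] + " " + text[match.end():]
    numbers, letters = [], []
    for number, suffix in NUMBER_RE.findall(text):
        numbers.append(number)
        if suffix:
            letters.append(suffix)
    text = NUMBER_RE.sub(" ", text)
    words = []
    for token in re.findall(r"[A-Z]+", text):
        # Single letters (e.g. a flat letter) are treated as deterministic parts.
        (letters if len(token) == 1 else words).append(token)
    return {"postcode": postcode, "numbers": numbers, "letters": letters, "words": words}


parsed = parse_address("Flat A, 111 High Street, AB12 3CD")
```

Here the postcode, the number "111" and the letter "A" are separated from the remaining textual tokens, which are left for fuzzy comparison later.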

A SQL query built from the parsed fields narrows the search down to a few candidate rows in the database:

select * from indexed where POSTCODE='AB12 3CD'
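The lookup can be tried out with Python's built-in sqlite3 module. In this sketch only the table name indexed and the POSTCODE column come from the query above; the other columns, the index and the rows are invented placeholders, not real OS data or FLAP's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: only the table name and the POSTCODE column come from the text.
conn.execute("CREATE TABLE indexed (UPRN TEXT, POSTCODE TEXT, ADDRESS TEXT)")
conn.executemany(
    "INSERT INTO indexed VALUES (?, ?, ?)",
    [
        ("1", "AB12 3CD", "FLAT A 111 HIGH STREET"),  # made-up rows
        ("2", "AB12 3CD", "FLAT B 111 HIGH STREET"),
        ("3", "ZZ99 9ZZ", "1 OTHER ROAD"),
    ],
)
# An index on the lookup column keeps the query fast on a large table.
conn.execute("CREATE INDEX idx_postcode ON indexed (POSTCODE)")


def candidates_for(connection, postcode):
    """Return all rows sharing the parsed postcode, using a parameterised query."""
    cursor = connection.execute("SELECT * FROM indexed WHERE POSTCODE = ?", (postcode,))
    return cursor.fetchall()


rows = candidates_for(conn, "AB12 3CD")
```

Passing the postcode as a query parameter rather than formatting it into the SQL string avoids quoting problems with user-supplied text.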

Features are then generated for each candidate row:

  • For the deterministic parts: pairwise comparisons to check whether they are equal
  • For the textual parts: alignment by linear assignment
  • For the postcode: comparisons to check whether its parts are equal
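The three kinds of features can be sketched as follows. This is an illustration under stated assumptions, not FLAP's implementation: the function names are hypothetical, difflib's ratio is an assumed string similarity, and the assignment is solved by brute force over permutations rather than with a dedicated linear assignment solver such as the Hungarian algorithm, so it is only practical for a handful of tokens.

```python
from difflib import SequenceMatcher
from itertools import permutations


def exact_match_features(input_parts, candidate_parts):
    """Pairwise comparison: 1 where an input part equals a candidate part, else 0."""
    return [[int(a == b) for b in candidate_parts] for a in input_parts]


def postcode_features(input_postcode, candidate_postcode):
    """Compare the outward and inward parts of two space-separated postcodes."""
    in_out, _, in_in = input_postcode.partition(" ")
    cand_out, _, cand_in = candidate_postcode.partition(" ")
    return [int(in_out == cand_out), int(in_in == cand_in)]


def similarity(a, b):
    """Assumed similarity measure in [0, 1]."""
    return SequenceMatcher(None, a, b).ratio()


def align_tokens(input_tokens, candidate_tokens):
    """One-to-one token assignment maximising total similarity (brute force)."""
    swapped = len(input_tokens) > len(candidate_tokens)
    rows, cols = (candidate_tokens, input_tokens) if swapped else (input_tokens, candidate_tokens)
    best_score, best_pairs = -1.0, []
    # Try every way of giving each row token a distinct column token.
    for perm in permutations(range(len(cols)), len(rows)):
        score = sum(similarity(rows[i], cols[j]) for i, j in enumerate(perm))
        if score > best_score:
            best_score = score
            best_pairs = [(rows[i], cols[j]) for i, j in enumerate(perm)]
    if swapped:
        # Report pairs as (input token, candidate token) regardless of orientation.
        best_pairs = [(b, a) for a, b in best_pairs]
    return best_score, best_pairs


score, pairs = align_tokens(["HIGH", "STREET"], ["STREET", "HIGH", "FLAT"])
```

The alignment makes the textual comparison insensitive to token order, so "HIGH STREET" still pairs up correctly with a candidate written as "STREET HIGH FLAT".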

A trained Random Forest classifier predicts a score for each candidate from the generated features. The candidate address with the best score is deemed the match.
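The selection step amounts to scoring every candidate and keeping the highest. In the sketch below the scorer is a deliberately simple stand-in (the mean of the features), not the trained Random Forest; the names stand_in_score and best_match and the feature values are hypothetical.

```python
def stand_in_score(features):
    """Placeholder scorer: the mean of the feature values.
    FLAP uses a trained Random Forest classifier here instead."""
    return sum(features) / len(features)


def best_match(candidate_features, score_fn=stand_in_score):
    """Return (candidate id, score) for the highest-scoring candidate, or None if there are none."""
    if not candidate_features:
        return None
    scores = {cid: score_fn(features) for cid, features in candidate_features.items()}
    best_id = max(scores, key=scores.get)
    return best_id, scores[best_id]


# Made-up feature vectors for two candidate rows.
result = best_match({"uprn_a": [1, 1, 0.9], "uprn_b": [1, 0, 0.4]})
```

With scikit-learn, a fitted classifier's predict_proba output could play the role of score_fn, which is one plausible way to obtain a per-candidate score.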

The above is a simplified description.

Coming soon

  • Command Line Interface
  • More documentation
  • Dummy database for trying it out with an example notebook
