
Pyap2 is a maintained fork of pyap, a regex-based library for parsing US, CA, and UK addresses. The fork adds typing support and handles more address formats and edge cases.

Project description

Pyap2 is a maintained fork of Pyap, a regex-based Python library for detecting and parsing addresses. It currently supports US 🇺🇸, Canadian 🇨🇦, and British 🇬🇧 addresses.

>>> import pyap
>>> test_address = """
...     Lorem ipsum
...     225 E. John Carpenter Freeway,
...     Suite 1500 Irving, Texas 75062
...     Dorem sit amet
...     """
>>> addresses = pyap.parse(test_address, country='US')
>>> for address in addresses:
...     # shows found address
...     print(address)
...     # shows address parts
...     print(address.as_dict())
...

Installation

To install Pyap2, simply:

$ pip install pyap2

About

We started improving the original pyap by adopting Poetry and adding typing support. The fork was extensively tested in web-scraping operations on thousands of US addresses. Over time we added support for many rarer address formats and edge cases, as well as the ability to parse a partial address where only street information is available.

Typical workflow

Use Pyap as a first step when you need to detect addresses in a text and do not know in advance whether it contains any.
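As a rough illustration of that first-pass role, the sketch below runs a parser over a batch of texts and collects whatever address strings it reports. The helper name `find_addresses` and its `parse` parameter are hypothetical and exist only for this example; when no parser is passed in, it falls back to `pyap.parse`, called the same way as in the example at the top of this page.

```python
from typing import Callable, Iterable, List, Optional


def find_addresses(
    texts: Iterable[str],
    country: str = "US",
    parse: Optional[Callable[..., list]] = None,
) -> List[str]:
    """Return every address string detected in the given texts."""
    if parse is None:
        # Imported lazily so the helper can be exercised without the package.
        import pyap  # provided by `pip install pyap2`

        parse = pyap.parse

    found: List[str] = []
    for text in texts:
        # An empty result just means no address-like pattern was seen.
        for address in parse(text, country=country):
            found.append(str(address))
    return found
```

Texts that produce no matches simply contribute nothing to the result, so the helper can be applied to unfiltered input before any slower validation step.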

Limitations

Because Pyap2 (like Pyap) is based on regular expressions, it produces results quickly. This is also its main limitation: the regular expressions intentionally use little context when detecting an address.

In other words, to detect a US address, the library does not consult a list of US cities or of typical street names. It looks for a pattern that is likely to be an address.

For example, the string below would be detected as a valid address: “1 SPIRITUAL HEALER DR SHARIF NSAMBU SPECIALISING IN”

This happens because the string has all the components of a valid address: street number “1”, street name “SPIRITUAL HEALER” followed by a street identifier “DR” (Drive), city “SHARIF NSAMBU SPECIALISING”, and a state abbreviation “IN” (Indiana).
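To make the mechanism concrete, here is a deliberately simplified pattern built with Python's standard `re` module. It is not the expression Pyap2 actually uses; the group names and the short list of street types are assumptions chosen for illustration. It only shows how a purely structural pattern accepts the string above.

```python
import re

# Toy pattern, NOT Pyap2's real one: it checks shape only, with no
# knowledge of real cities or street names.
STREET_TYPES = r"(?:DR|ST|AVE|RD|BLVD)"
TOY_ADDRESS = re.compile(
    rf"""
    (?P<street_number>\d+)\s+
    (?P<street_name>[A-Z]+(?:\s[A-Z]+)*?)\s+
    (?P<street_type>{STREET_TYPES})\s+
    (?P<city>[A-Z]+(?:\s[A-Z]+)*)\s+
    (?P<state>[A-Z]{{2}})\b
    """,
    re.VERBOSE,
)

text = "1 SPIRITUAL HEALER DR SHARIF NSAMBU SPECIALISING IN"
match = TOY_ADDRESS.search(text)
parts = match.groupdict() if match else {}
print(parts)
```

Every group finds something that fits its shape, so the advertisement fragment is split into the same components described above.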

The good news is that such errors are quite rare.
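Where even rare false positives matter, one possible mitigation is to check detected candidates against data the library does not use, such as a list of known cities. The sketch below is not part of Pyap2: `filter_by_city`, the `KNOWN_CITIES` set, and the assumption that each candidate is a plain dict with a `city` key are all hypothetical.

```python
from typing import Dict, Iterable, List, Optional, Set

# Hypothetical allow-list; a real one would come from your own reference data.
KNOWN_CITIES: Set[str] = {"IRVING", "AUSTIN", "DALLAS"}


def filter_by_city(
    candidates: Iterable[Dict[str, Optional[str]]],
    known_cities: Set[str],
) -> List[Dict[str, Optional[str]]]:
    """Keep only candidates whose city appears in the allow-list."""
    kept = []
    for candidate in candidates:
        # Normalise case and whitespace; treat a missing city as no match.
        city = (candidate.get("city") or "").strip().upper()
        if city in known_cities:
            kept.append(candidate)
    return kept


candidates = [
    {"street_name": "John Carpenter", "city": "Irving"},
    {"street_name": "SPIRITUAL HEALER", "city": "SHARIF NSAMBU SPECIALISING"},
]
print(filter_by_city(candidates, KNOWN_CITIES))
```

The trade-off is that an allow-list will also reject genuine addresses in cities it does not contain, so it suits cases where precision matters more than recall.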
