US address parsing and normalization library using libpostal

USPS Address Normalizer

A Python library for parsing and normalizing US addresses to USPS standard format using libpostal's intelligent parsing.

Features

  • ✅ Intelligent Parsing - Uses libpostal ML model to understand address components
  • ✅ USPS Compliant - Applies all USPS Publication 28 standardization rules
  • ✅ Complete Abbreviations - Street suffixes, directionals, secondary units, states
  • ✅ Separate Components - Returns city, state, and ZIP as individual fields
  • ✅ Proper Line Formatting - Separates primary (Line 1) and secondary (Line 2) addressing
  • ✅ Clean Output - ALL CAPS, no periods, normalized spacing

Installation

Prerequisites

Install the libpostal C library (the command below assumes Homebrew is available):

brew install libpostal

Install the Package

# From wheel file
pip install usps_address_normalizer-1.0.0-py3-none-any.whl

# Or from source
pip install .

Usage

Basic Usage

from usps_address_normalizer import normalize_address

# Normalize any address format
address = normalize_address("128 king street floor 3 suite 301 san francisco california 94107")

# Access individual components
print(address.line1)           # 128 KING ST
print(address.line2)           # FL 3 STE 301
print(address.city)            # SAN FRANCISCO
print(address.state)           # CA
print(address.zip_code)        # 94107

Computed Properties

# Get combined city, state, ZIP
print(address.city_state_zip)
# Output: SAN FRANCISCO, CA 94107

# Get full formatted address
print(address.full_address)
# Output:
# 128 KING ST
# FL 3 STE 301
# SAN FRANCISCO, CA 94107

Dictionary Export

# Convert to dictionary
data = address.as_dict()
print(data)
# Output:
# {
#     'line1': '128 KING ST',
#     'line2': 'FL 3 STE 301',
#     'city': 'SAN FRANCISCO',
#     'state': 'CA',
#     'zip_code': '94107',
#     'city_state_zip': 'SAN FRANCISCO, CA 94107',
#     'full_address': '128 KING ST\nFL 3 STE 301\nSAN FRANCISCO, CA 94107'
# }
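
Because the exported dictionary holds only strings, it should serialize directly to JSON. The sketch below uses a hand-copied dictionary as a stand-in for address.as_dict() so that it runs without libpostal installed; the variable names payload and restored are illustrative only.

```python
import json

# Stand-in for address.as_dict(); values copied from the output shown above.
data = {
    "line1": "128 KING ST",
    "line2": "FL 3 STE 301",
    "city": "SAN FRANCISCO",
    "state": "CA",
    "zip_code": "94107",
    "city_state_zip": "SAN FRANCISCO, CA 94107",
    "full_address": "128 KING ST\nFL 3 STE 301\nSAN FRANCISCO, CA 94107",
}

# Serialize for storage or an API payload, then read it back.
payload = json.dumps(data, indent=2)
restored = json.loads(payload)

print(restored["city_state_zip"])
```

The newline characters in full_address are escaped by json.dumps and restored by json.loads, so the multi-line form survives the round trip.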

Examples

Simple Address (No Unit)

address = normalize_address("100 Market Street San Francisco CA 94105")

print(address.line1)    # 100 MARKET ST
print(address.line2)    # (empty string)
print(address.city)     # SAN FRANCISCO
print(address.state)    # CA
print(address.zip_code) # 94105

Complex Address with Multiple Units

address = normalize_address("128 N King St, Floor 3, Suite 301, San Francisco, CA 94107")

print(address.line1)    # 128 N KING ST
print(address.line2)    # FL 3 STE 301
print(address.city)     # SAN FRANCISCO
print(address.state)    # CA
print(address.zip_code) # 94107

With Full State Name

address = normalize_address("456 Main Avenue Apartment 2B New York New York 10001")

print(address.line1)    # 456 MAIN AVE
print(address.line2)    # APT 2B
print(address.city)     # NEW YORK
print(address.state)    # NY (automatically converted)
print(address.zip_code) # 10001

PO Box

address = normalize_address("PO Box 1234 San Francisco CA 94107")

print(address.line1)    # PO BOX 1234
print(address.line2)    # (empty string)
print(address.city)     # SAN FRANCISCO
print(address.state)    # CA
print(address.zip_code) # 94107

USPS Standardization Rules

The library applies all official USPS abbreviations:

Street Suffixes

  • STREET → ST
  • AVENUE → AVE
  • BOULEVARD → BLVD
  • DRIVE → DR
  • And 60+ more...

Directionals

  • NORTH → N
  • SOUTH → S
  • NORTHEAST → NE
  • And all 8 directionals...

Secondary Units

  • SUITE → STE
  • APARTMENT → APT
  • FLOOR → FL
  • BUILDING → BLDG
  • And 20+ more...

States

  • CALIFORNIA → CA
  • NEW YORK → NY
  • All 50 states + territories

Formatting Rules

  • ALL CAPS for all address fields
  • Remove periods (P.O. → PO)
  • Remove extra spaces
  • Standardize abbreviations

Address Line Format

Line 1 (Primary Address):

  • Format: [Number] [PreDir] [Street Name] [Suffix] [PostDir]
  • Example: 128 N KING ST

Line 2 (Secondary Address):

  • Format: [Unit Type] [Unit Number]
  • Example: FL 3 STE 301
  • Empty string if no secondary addressing

City, State, ZIP:

  • Individual fields: city, state, zip_code
  • Combined: city_state_zip → SAN FRANCISCO, CA 94107
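
As a rough sketch of how labeled components could be assembled into the two lines described above: pypostal's parse_address returns a list of (value, label) tuples, and the example below works on a hand-written list in that shape. The component values are assumptions written for illustration, not captured libpostal output, and join_labels is a hypothetical helper, not part of this library.

```python
from typing import List, Tuple

# Hand-written (value, label) pairs in the shape parse_address returns.
# Values are already normalized here to keep the sketch short.
components = [
    ("128", "house_number"),
    ("N KING ST", "road"),
    ("FL 3", "level"),
    ("STE 301", "unit"),
    ("SAN FRANCISCO", "city"),
    ("CA", "state"),
    ("94107", "postcode"),
]

LINE1_LABELS = ("house_number", "road")
LINE2_LABELS = ("level", "unit")


def join_labels(parts: List[Tuple[str, str]], labels: Tuple[str, ...]) -> str:
    """Join the values whose label is in `labels`, in the order of `labels`."""
    picked = [value for label in labels for value, found in parts if found == label]
    return " ".join(picked)


line1 = join_labels(components, LINE1_LABELS)
line2 = join_labels(components, LINE2_LABELS)  # empty string when no unit is present

print(line1)  # 128 N KING ST
print(line2)  # FL 3 STE 301
```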

API Reference

normalize_address(address: str) -> USPSAddress

Main function to normalize an address.

Parameters:

  • address (str): Raw address string in any format

Returns:

  • USPSAddress: Normalized address object

Raises:

  • ImportError: If libpostal is not installed
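
Since ImportError is the documented failure mode, callers that treat normalization as optional may want to catch it. The wrapper below is a minimal sketch; try_normalize is a hypothetical helper name, and returning None on failure is just one possible policy.

```python
from typing import Any, Optional


def try_normalize(raw: str) -> Optional[Any]:
    """Return the normalized address, or None if the import chain fails."""
    try:
        # Imported lazily so a missing dependency surfaces here, not at startup.
        from usps_address_normalizer import normalize_address
        return normalize_address(raw)
    except ImportError as exc:
        print(f"Address normalization unavailable: {exc}")
        return None


result = try_normalize("100 Market Street San Francisco CA 94105")
if result is not None:
    print(result.line1)
```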

USPSAddress Class

Attributes:

  • line1 (str): Primary delivery address
  • line2 (str): Secondary address (or empty string)
  • city (str): City name in ALL CAPS
  • state (str): Two-letter state code
  • zip_code (str): ZIP code

Properties:

  • city_state_zip (str): Combined city, state, ZIP
  • full_address (str): All lines joined with newlines

Methods:

  • as_dict(): Returns dictionary with all components
  • __str__(): Returns full_address
  • __repr__(): Developer-friendly representation
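
For unit tests that should not depend on libpostal, a small stand-in with the same documented attributes can be handy. The class below is a hypothetical sketch, not the library's USPSAddress. In particular, omitting an empty line2 from full_address is an assumption, since the documentation above does not say how that case is rendered.

```python
from dataclasses import dataclass


@dataclass
class AddressSketch:
    """Hypothetical stand-in mirroring the documented USPSAddress fields."""
    line1: str
    line2: str
    city: str
    state: str
    zip_code: str

    @property
    def city_state_zip(self) -> str:
        return f"{self.city}, {self.state} {self.zip_code}"

    @property
    def full_address(self) -> str:
        # Assumption: skip line2 when it is empty so no blank line appears.
        lines = [self.line1, self.line2, self.city_state_zip]
        return "\n".join(line for line in lines if line)

    def __str__(self) -> str:
        return self.full_address


addr = AddressSketch("128 KING ST", "FL 3 STE 301", "SAN FRANCISCO", "CA", "94107")
print(addr.city_state_zip)
print(addr)
```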

Why Use This Library?

The Problem with Manual Parsing

Without this library, you'd need:

  • ❌ Complex regex patterns for each component
  • ❌ Manual lookup tables for all abbreviations
  • ❌ Logic to distinguish "FL" (Floor) from "FL" (Florida)
  • ❌ Handling multi-word street names
  • ❌ Context-aware parsing

The Solution

  • ✅ libpostal provides: Intelligent ML-based parsing (knows "FL 3" is a floor, not Florida)
  • ✅ This library provides: Complete USPS normalization rules and formatting
  • ✅ You get: Clean, standardized addresses ready for databases, mailings, or APIs

Requirements

  • Python >= 3.8
  • libpostal C library (brew install libpostal)
  • postal Python package (installed automatically)

Development

Building from Source

# Install development dependencies
pip install -e ".[dev]"

# Build wheel
python -m build

# Output: dist/usps_address_normalizer-1.0.0-py3-none-any.whl

Running Tests

cd tests
python test_normalizer.py

License

MIT License

Credits

Version

1.0.0
