US address parsing and normalization library using libpostal
USPS Address Normalizer
A Python library for parsing and normalizing US addresses to USPS standard format using libpostal's intelligent parsing.
Features
- ✅ Intelligent Parsing - Uses the libpostal ML model to understand address components
- ✅ USPS Compliant - Applies all USPS Publication 28 standardization rules
- ✅ Complete Abbreviations - Street suffixes, directionals, secondary units, states
- ✅ Separate Components - Returns city, state, and ZIP as individual fields
- ✅ Proper Line Formatting - Separates primary (Line 1) and secondary (Line 2) addressing
- ✅ Clean Output - ALL CAPS, no periods, normalized spacing
Installation
Prerequisites
Install the libpostal C library, for example with Homebrew:
brew install libpostal
Install the Package
# From wheel file
pip install usps_address_normalizer-1.0.0-py3-none-any.whl
# Or from source
pip install .
Usage
Basic Usage
from usps_address_normalizer import normalize_address
# Normalize any address format
address = normalize_address("128 king street floor 3 suite 301 san francisco california 94107")
# Access individual components
print(address.line1) # 128 KING ST
print(address.line2) # FL 3 STE 301
print(address.city) # SAN FRANCISCO
print(address.state) # CA
print(address.zip_code) # 94107
Computed Properties
# Get combined city, state, ZIP
print(address.city_state_zip)
# Output: SAN FRANCISCO, CA 94107
# Get full formatted address
print(address.full_address)
# Output:
# 128 KING ST
# FL 3 STE 301
# SAN FRANCISCO, CA 94107
Dictionary Export
# Convert to dictionary
data = address.as_dict()
print(data)
# Output:
# {
# 'line1': '128 KING ST',
# 'line2': 'FL 3 STE 301',
# 'city': 'SAN FRANCISCO',
# 'state': 'CA',
# 'zip_code': '94107',
# 'city_state_zip': 'SAN FRANCISCO, CA 94107',
# 'full_address': '128 KING ST\nFL 3 STE 301\nSAN FRANCISCO, CA 94107'
# }
Examples
Simple Address (No Unit)
address = normalize_address("100 Market Street San Francisco CA 94105")
print(address.line1) # 100 MARKET ST
print(address.line2) # (empty string)
print(address.city) # SAN FRANCISCO
print(address.state) # CA
print(address.zip_code) # 94105
Complex Address with Multiple Units
address = normalize_address("128 N King St, Floor 3, Suite 301, San Francisco, CA 94107")
print(address.line1) # 128 N KING ST
print(address.line2) # FL 3 STE 301
print(address.city) # SAN FRANCISCO
print(address.state) # CA
print(address.zip_code) # 94107
With Full State Name
address = normalize_address("456 Main Avenue Apartment 2B New York New York 10001")
print(address.line1) # 456 MAIN AVE
print(address.line2) # APT 2B
print(address.city) # NEW YORK
print(address.state) # NY (automatically converted)
print(address.zip_code) # 10001
PO Box
address = normalize_address("PO Box 1234 San Francisco CA 94107")
print(address.line1) # PO BOX 1234
print(address.line2) # (empty string)
print(address.city) # SAN FRANCISCO
print(address.state) # CA
print(address.zip_code) # 94107
USPS Standardization Rules
The library applies all official USPS abbreviations:
Street Suffixes
- STREET → ST
- AVENUE → AVE
- BOULEVARD → BLVD
- DRIVE → DR
- And 60+ more...
Directionals
- NORTH → N
- SOUTH → S
- NORTHEAST → NE
- And all 8 directionals...
Secondary Units
- SUITE → STE
- APARTMENT → APT
- FLOOR → FL
- BUILDING → BLDG
- And 20+ more...
States
- CALIFORNIA → CA
- NEW YORK → NY
- All 50 states + territories
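The abbreviation lists above can be pictured as plain lookup tables. The following is a minimal sketch, not the library's actual code: the table names and the helpers `abbreviate_road` and `abbreviate_state` are hypothetical, and the tables hold only the entries listed above. It also shows why position matters: a word such as NORTH should stay spelled out when it is the street name itself.

```python
# Partial, illustrative tables: only the entries listed in this README.
STREET_SUFFIXES = {"STREET": "ST", "AVENUE": "AVE", "BOULEVARD": "BLVD", "DRIVE": "DR"}
DIRECTIONALS = {"NORTH": "N", "SOUTH": "S", "NORTHEAST": "NE"}
STATES = {"CALIFORNIA": "CA", "NEW YORK": "NY"}


def abbreviate_road(road: str) -> str:
    """Abbreviate a leading directional and a trailing suffix (hypothetical helper)."""
    tokens = road.upper().split()
    if len(tokens) > 1:
        # Only the final token is treated as a street suffix.
        tokens[-1] = STREET_SUFFIXES.get(tokens[-1], tokens[-1])
    if len(tokens) > 2:
        # Abbreviate a leading directional only when a street name remains after it.
        tokens[0] = DIRECTIONALS.get(tokens[0], tokens[0])
    return " ".join(tokens)


def abbreviate_state(state: str) -> str:
    """Map a full state name to its two-letter code; unknown input is returned uppercased."""
    key = " ".join(state.upper().split())
    return STATES.get(key, key)


print(abbreviate_road("North King Street"))  # N KING ST
print(abbreviate_road("North Street"))       # NORTH ST
print(abbreviate_state("New York"))          # NY
```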
Formatting Rules
- ALL CAPS for all address fields
- Remove periods (P.O. → PO)
- Remove extra spaces
- Standardize abbreviations
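The first three formatting rules amount to simple string clean-up. A minimal sketch, assuming a hypothetical helper named `apply_formatting_rules` (the library's internals may differ):

```python
import re


def apply_formatting_rules(text: str) -> str:
    """Uppercase the text, drop periods, and collapse runs of whitespace."""
    text = text.upper()
    text = text.replace(".", "")
    # Collapse any whitespace run to one space and trim the ends.
    return re.sub(r"\s+", " ", text).strip()


print(apply_formatting_rules("  p.o.  box   1234 "))  # PO BOX 1234
```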
Address Line Format
Line 1 (Primary Address):
- Format: [Number] [PreDir] [Street Name] [Suffix] [PostDir]
- Example: 128 N KING ST
Line 2 (Secondary Address):
- Format: [Unit Type] [Unit Number]
- Example: FL 3 STE 301
- Empty string if no secondary addressing
City, State, ZIP:
- Individual fields: city, state, zip_code
- Combined: city_state_zip → SAN FRANCISCO, CA 94107
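To illustrate the line formats above, here is a rough sketch of how already-normalized components could be assembled into Line 1 and Line 2. The function `build_lines` and its dictionary keys are assumptions made for this example, not part of the library's API.

```python
def build_lines(parts: dict) -> tuple:
    """Join normalized components into (line1, line2); hypothetical helper."""
    # Line 1 order: [Number] [PreDir] [Street Name] [Suffix] [PostDir]
    line1_fields = ("number", "pre_dir", "street_name", "suffix", "post_dir")
    line1 = " ".join(parts[field] for field in line1_fields if parts.get(field))
    # Line 2: each unit is a (unit_type, unit_number) pair, e.g. ("FL", "3").
    units = parts.get("units", [])
    line2 = " ".join(f"{unit_type} {unit_number}" for unit_type, unit_number in units)
    return line1, line2


line1, line2 = build_lines({
    "number": "128",
    "pre_dir": "N",
    "street_name": "KING",
    "suffix": "ST",
    "units": [("FL", "3"), ("STE", "301")],
})
print(line1)  # 128 N KING ST
print(line2)  # FL 3 STE 301
```

Missing components are skipped, so an address without units yields an empty string for Line 2.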
API Reference
normalize_address(address: str) -> USPSAddress
Main function to normalize an address.
Parameters:
- address (str): Raw address string in any format
Returns:
- USPSAddress: Normalized address object
Raises:
- ImportError: If libpostal is not installed
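Since a missing libpostal surfaces as an ImportError, callers that must keep running without it can guard the call. A small sketch; the wrapper name `try_normalize` is hypothetical, and whether the error is raised at import time or at call time is not stated here, so both are wrapped:

```python
def try_normalize(raw: str):
    """Return a normalized address, or None if the package or libpostal is unavailable."""
    try:
        # The import sits inside the try block so a missing dependency is caught too.
        from usps_address_normalizer import normalize_address
        return normalize_address(raw)
    except ImportError:
        return None


result = try_normalize("100 Market Street San Francisco CA 94105")
if result is None:
    print("normalizer unavailable; keeping the raw address")
else:
    print(result.line1)
```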
USPSAddress Class
Attributes:
- line1 (str): Primary delivery address
- line2 (str): Secondary address (or empty string)
- city (str): City name in ALL CAPS
- state (str): Two-letter state code
- zip_code (str): ZIP code
Properties:
- city_state_zip (str): Combined city, state, ZIP
- full_address (str): All lines joined with newlines
Methods:
- as_dict(): Returns dictionary with all components
- __str__(): Returns full_address
- __repr__(): Developer-friendly representation
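For readers who want a mental model of the class, the following dataclass mirrors the documented attributes, properties, and methods. It is an illustrative stand-in named `AddressSketch`, not the real USPSAddress; in particular, skipping an empty line2 in `full_address` is an assumption.

```python
from dataclasses import dataclass


@dataclass
class AddressSketch:
    """Illustrative stand-in for USPSAddress; not the library's implementation."""
    line1: str
    line2: str
    city: str
    state: str
    zip_code: str

    @property
    def city_state_zip(self) -> str:
        return f"{self.city}, {self.state} {self.zip_code}"

    @property
    def full_address(self) -> str:
        # Assumption: an empty line2 is omitted instead of leaving a blank line.
        lines = [self.line1, self.line2, self.city_state_zip]
        return "\n".join(line for line in lines if line)

    def as_dict(self) -> dict:
        return {
            "line1": self.line1,
            "line2": self.line2,
            "city": self.city,
            "state": self.state,
            "zip_code": self.zip_code,
            "city_state_zip": self.city_state_zip,
            "full_address": self.full_address,
        }

    def __str__(self) -> str:
        return self.full_address


addr = AddressSketch("128 KING ST", "FL 3 STE 301", "SAN FRANCISCO", "CA", "94107")
print(addr.city_state_zip)  # SAN FRANCISCO, CA 94107
```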
Why Use This Library?
The Problem with Manual Parsing
Without this library, you'd need:
- ❌ Complex regex patterns for each component
- ❌ Manual lookup tables for all abbreviations
- ❌ Logic to distinguish "FL" (Floor) from "FL" (Florida)
- ❌ Handling multi-word street names
- ❌ Context-aware parsing
The Solution
- ✅ libpostal provides: Intelligent ML-based parsing (knows "FL 3" is a floor, not Florida)
- ✅ This library provides: Complete USPS normalization rules and formatting
- ✅ You get: Clean, standardized addresses ready for databases, mailings, or APIs
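To make the division of labor concrete: the `postal` package's `parse_address` returns a list of (value, label) pairs, which a normalizer can group by label before applying USPS rules. The sketch below uses a hand-written sample in that shape, not real libpostal output, so it runs without libpostal; `group_components` is a hypothetical helper.

```python
from collections import defaultdict

# With libpostal installed, the pairs would come from:
#   from postal.parser import parse_address
#   pairs = parse_address("128 king street floor 3 suite 301 san francisco california 94107")


def group_components(pairs):
    """Group (value, label) pairs by label, joining repeated labels with a space."""
    grouped = defaultdict(list)
    for value, label in pairs:
        grouped[label].append(value)
    return {label: " ".join(values) for label, values in grouped.items()}


# Hand-written sample for illustration; actual parser output may differ.
sample = [
    ("128", "house_number"),
    ("king street", "road"),
    ("floor 3", "level"),
    ("suite 301", "unit"),
    ("san francisco", "city"),
    ("california", "state"),
    ("94107", "postcode"),
]

components = group_components(sample)
print(components["road"])   # king street
print(components["level"])  # floor 3
```

Because the floor arrives labeled separately from the state, the normalizer never has to guess whether "FL" means Floor or Florida.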
Requirements
- Python >= 3.8
- libpostal C library (brew install libpostal)
- postal Python package (installed automatically)
Development
Building from Source
# Install development dependencies
pip install -e ".[dev]"
# Build wheel
python -m build
# Output: dist/usps_address_normalizer-1.0.0-py3-none-any.whl
Running Tests
cd tests
python test_normalizer.py
License
MIT License
Credits
- Built on libpostal for intelligent address parsing
- USPS abbreviations from USPS Publication 28
Version
1.0.0