Offline address-to-Census data mapping for Python with PL 94-171 and ACS support
census-lookup
A Python library for mapping US addresses to Census data locally, without relying on rate-limited APIs. Supports Census 2020 (PL 94-171) and American Community Survey (ACS) 5-Year Estimates.
Features
- Fully offline geocoding using TIGER Address Range files (~95% match rate)
- Lazy per-state data downloading - only download data for states you need
- Census data at ALL geographic levels - block, block group, tract, county, and state in a single lookup
- Two Census data sources:
  - PL 94-171 (Redistricting Data): Population, race, housing counts at block level
  - ACS 5-Year Estimates: Income, education, employment, housing characteristics at tract level
- Efficient batch processing for large address lists
- CLI and Python API - use from command line or in your code
Installation
# Using uv (recommended)
uv add census-lookup
# Using pip
pip install census-lookup
Quick Start
CLI (no install required)
# Look up a single address (auto-downloads data as needed)
uvx census-lookup lookup "123 Main St, Los Angeles, CA 90012"
# Include specific census variables
uvx census-lookup lookup "123 Main St, Los Angeles, CA 90012" -v P1_001N -v H1_001N
# Process a batch file (use -l to set output level for CSV columns)
uvx census-lookup batch input.csv output.csv --address-column addr -l tract
# Pre-download data for states (optional - data downloads automatically)
uvx census-lookup download CA TX NY
# List available census variables
uvx census-lookup variables
# Show cache info
uvx census-lookup info
Example Output
$ uvx census-lookup lookup "1600 Pennsylvania Avenue NW, Washington, DC 20500" -v P1_001N
{
"input_address": "1600 Pennsylvania Avenue NW, Washington, DC 20500",
"matched_address": "Pennsylvania Ave NW",
"latitude": 38.898761,
"longitude": -77.035117,
"match_type": "interpolated",
"match_score": 0.9,
"state_fips": "11",
"county_fips": "11001",
"tract": "11001010100",
"block_group": "110010101003",
"block": "110010101003014",
"P1_001N": {
"block": 19.0,
"block_group": 963.0,
"tract": 2699.0,
"county": 689545.0,
"state": 689545.0
}
}
Census data is returned at all geographic levels in a single lookup. Each variable contains values aggregated at block, block group, tract, county, and state levels.
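As a minimal sketch of consuming this nested output, the snippet below parses a trimmed copy of the JSON shown above with the standard `json` module and compares levels. The variable names (`raw`, `population`, `block_share`) are illustrative only and are not part of census-lookup.

```python
import json

# Trimmed copy of the example output above; only the fields used here are kept.
raw = """
{
  "block": "110010101003014",
  "P1_001N": {
    "block": 19.0,
    "block_group": 963.0,
    "tract": 2699.0,
    "county": 689545.0,
    "state": 689545.0
  }
}
"""

result = json.loads(raw)
population = result["P1_001N"]

# Share of the tract's population that lives in the matched block.
block_share = population["block"] / population["tract"]
print(f"Block {result['block']} holds {block_share:.2%} of its tract's population")

# Walk the levels from smallest to largest geography.
for level in ("block", "block_group", "tract", "county", "state"):
    print(f"{level:>12}: {int(population[level]):,}")
```

Because every level arrives in one lookup, ratios like this need no second query.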
With ACS variables (median income, home value):
$ uvx census-lookup lookup "1600 Pennsylvania Avenue NW, Washington, DC 20500" \
-v B19013_001E -v B25077_001E
{
"...": "...",
"B19013_001E": {
"tract": 72500.0
},
"B25077_001E": {
"tract": 485000.0
}
}
ACS variables are available at tract level and above.
Python API
import asyncio
import pandas as pd
from census_lookup import CensusLookup
async def main():
    # Initialize (first use will download data for the state)
    lookup = CensusLookup(
        variables=["P1_001N", "H1_001N"],  # Population, Housing units
    )
    # Single address lookup
    result = await lookup.geocode("123 Main St, Los Angeles, CA 90012")
    print(f"Block GEOID: {result.block}")
    print(f"Block Population: {result.census_data['P1_001N']['block']}")
    print(f"Tract Population: {result.census_data['P1_001N']['tract']}")
    # Batch processing
    df = pd.read_csv("addresses.csv")
    results = await lookup.geocode_batch(df["address"], progress=True)
# geocode() and geocode_batch() are awaited, so a script must run them inside an event loop
asyncio.run(main())
Geographic Levels
| Level | GEOID Length | Example |
|---|---|---|
| State | 2 | 06 |
| County | 5 | 06037 |
| Tract | 11 | 06037210100 |
| Block Group | 12 | 060372101001 |
| Block | 15 | 060372101001023 |
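The GEOID lengths in the table nest: each level's GEOID is a prefix of the one below it. A short sketch of deriving every parent GEOID from a 15-digit block GEOID follows. The helper name `split_block_geoid` is hypothetical and not part of census-lookup.

```python
def split_block_geoid(geoid: str) -> dict[str, str]:
    """Split a 15-digit block GEOID into its parent GEOIDs by prefix length."""
    if len(geoid) != 15 or not geoid.isdigit():
        raise ValueError(f"expected a 15-digit block GEOID, got {geoid!r}")
    return {
        "state": geoid[:2],         # 2-digit state FIPS
        "county": geoid[:5],        # state + 3-digit county
        "tract": geoid[:11],        # county + 6-digit tract
        "block_group": geoid[:12],  # tract + 1-digit block group
        "block": geoid,             # block group + remaining 3 digits
    }


parts = split_block_geoid("060372101001023")
for level, value in parts.items():
    print(f"{level:>12}: {value}")
```

Keep GEOIDs as strings throughout; converting to integers drops leading zeros such as the `06` for California.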
Census Variables
PL 94-171 (Redistricting Data)
Available at block level and above. Includes:
- P1: Race (total population, by race categories)
- P2: Hispanic/Latino by Race
- P3: Race for Population 18+ (voting age)
- P4: Hispanic/Latino 18+
- H1: Housing Units (total, occupied, vacant)
# Use variable groups
lookup = CensusLookup(variable_groups=["population", "housing"])
# Or specify individual variables
lookup = CensusLookup(variables=["P1_001N", "P1_003N", "H1_001N"])
ACS 5-Year Estimates (American Community Survey)
Available at tract level and above. Includes richer demographic data:
| Category | Key Variables | Description |
|---|---|---|
| Income | B19013_001E, B19301_001E | Median household income, per capita income |
| Poverty | B17001_001E, B17001_002E | Total population, below poverty level |
| Education | B15003_022E, B15003_023E | Bachelor's degree, Master's degree |
| Employment | B23025_004E, B23025_005E | Employed, Unemployed |
| Housing | B25077_001E, B25064_001E | Median home value, median rent |
| Tenure | B25003_002E, B25003_003E | Owner-occupied, Renter-occupied |
| Health | B27010_017E, B27010_050E | Employer insurance, Medicare |
| Commute | B08301_003E, B08301_010E | Drove alone, Public transit |
| Internet | B28002_004E, B28002_013E | Broadband access, No internet |
| Language | B16001_002E, B16001_003E | English only, Spanish |
More than 100 ACS variables are available. Run uvx census-lookup variables --acs for the full list.
import asyncio
from census_lookup import CensusLookup, list_acs_variable_groups
# See available ACS variable groups
print(list_acs_variable_groups())
async def main():
    # Use ACS variables with your lookup
    lookup = CensusLookup(
        variables=["P1_001N"],  # PL 94-171 population
        acs_variables=["B19013_001E", "B25077_001E"],  # Median income, home value
        # Or use variable groups:
        # acs_variable_groups=["income", "housing"],
    )
    result = await lookup.geocode("123 Main St, Los Angeles, CA 90012")
    # PL 94-171 data available at all levels
    print(f"Block Population: {result.census_data['P1_001N']['block']}")
    # ACS data available at tract level
    print(f"Median Income: ${result.census_data['B19013_001E']['tract']:,}")
# geocode() is awaited, so a script must run it inside an event loop
asyncio.run(main())
Note: ACS data is available at tract level and above. When you request ACS variables,
they will appear in the nested output with tract (and higher) levels populated.
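Several of the ACS variables listed above are counts rather than rates, so percentages have to be derived from pairs of variables. The sketch below shows one way to do that with tract-level values. The numbers are made up purely for illustration, and `safe_ratio` is a hypothetical helper, not a census-lookup function. It assumes B17001_001E is the poverty universe and that employed plus unemployed approximates the civilian labor force.

```python
from typing import Optional


def safe_ratio(numerator: float, denominator: float) -> Optional[float]:
    """Return numerator / denominator, or None when the denominator is zero."""
    if not denominator:
        return None
    return numerator / denominator


# Made-up tract-level counts for illustration only (not real Census data).
tract = {
    "B17001_001E": 4000.0,  # population for whom poverty status is determined
    "B17001_002E": 500.0,   # below poverty level
    "B23025_004E": 1900.0,  # employed
    "B23025_005E": 100.0,   # unemployed
}

poverty_rate = safe_ratio(tract["B17001_002E"], tract["B17001_001E"])

# Unemployment rate uses the labor force (employed + unemployed) as its base.
labor_force = tract["B23025_004E"] + tract["B23025_005E"]
unemployment_rate = safe_ratio(tract["B23025_005E"], labor_force)

print(f"Poverty rate: {poverty_rate:.1%}")
print(f"Unemployment rate: {unemployment_rate:.1%}")
```

Returning `None` for an empty denominator keeps unpopulated tracts from raising a `ZeroDivisionError` in batch jobs.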
Data Storage
Data is cached in ~/.census-lookup/:
~/.census-lookup/
├── catalog.json          # Tracks downloaded data
├── tiger/
│   ├── addrfeat/         # Address range features
│   └── blocks/           # Block polygons
└── census/
    ├── pl94171/          # PL 94-171 data
    └── acs5/             # ACS 5-Year data
        └── tract/        # ACS at tract level
Typical storage per state: 100-300 MB (TIGER + PL 94-171), plus roughly 10-50 MB for ACS.
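To check how much space the cache actually uses, a small standard-library sketch like the one below can total file sizes under the cache directory. Only the `~/.census-lookup/` path comes from this README; the helper names are hypothetical, and the `census-lookup info` command remains the supported way to inspect the cache.

```python
from pathlib import Path


def dir_size_bytes(root: Path) -> int:
    """Total size of all regular files under root (0 if root does not exist)."""
    if not root.exists():
        return 0
    return sum(p.stat().st_size for p in root.rglob("*") if p.is_file())


def size_by_subdir(root: Path) -> dict[str, int]:
    """Size in bytes of each immediate subdirectory of root."""
    if not root.exists():
        return {}
    return {
        child.name: dir_size_bytes(child)
        for child in sorted(root.iterdir())
        if child.is_dir()
    }


cache = Path.home() / ".census-lookup"
print(f"Total: {dir_size_bytes(cache) / 1e6:.1f} MB")
for name, size in size_by_subdir(cache).items():
    print(f"  {name}/: {size / 1e6:.1f} MB")
```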
How It Works
- Parse address using the usaddress library
- Normalize street name for TIGER matching
- Match to TIGER Address Range segment
- Interpolate coordinates along the street segment
- Spatial lookup using rtree index to find containing census block
- Join census data using DuckDB for efficient queries
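The interpolation step above can be illustrated with a simplified sketch. It assumes a street segment is a straight line between two points with a single address range, whereas real TIGER segments are polylines with separate left-side and right-side ranges. The `AddressRange` class and `interpolate` function are hypothetical names for illustration and do not reflect the library's internals.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AddressRange:
    """Hypothetical straight street segment with one house-number range."""
    from_number: int
    to_number: int
    start: tuple[float, float]  # (longitude, latitude) at from_number
    end: tuple[float, float]    # (longitude, latitude) at to_number


def interpolate(segment: AddressRange, house_number: int) -> Optional[tuple[float, float]]:
    """Place a house number proportionally along the segment, or None if out of range."""
    low, high = sorted((segment.from_number, segment.to_number))
    if not low <= house_number <= high:
        return None
    span = segment.to_number - segment.from_number
    # A single-number range has no length to interpolate over; use the midpoint.
    fraction = 0.5 if span == 0 else (house_number - segment.from_number) / span
    lon = segment.start[0] + fraction * (segment.end[0] - segment.start[0])
    lat = segment.start[1] + fraction * (segment.end[1] - segment.start[1])
    return lon, lat


segment = AddressRange(100, 198, (-118.25, 34.05), (-118.24, 34.05))
print(interpolate(segment, 149))  # halfway along the segment
print(interpolate(segment, 300))  # outside the range
```

The resulting point is an estimate, which is consistent with the "interpolated" match type in the example output.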
Data Sources
All data is downloaded from official US Census Bureau sources:
- TIGER/Line Shapefiles: Geographic boundaries and address ranges
  - https://www.census.gov/geographies/mapping-files/time-series/geo/tiger-line-file.html
  - Address Range Feature files (ADDRFEAT) for geocoding
  - Block shapefiles for spatial lookups
- PL 94-171 Redistricting Data: Population and housing counts
  - https://www.census.gov/programs-surveys/decennial-census/about/rdo/summary-files.html
  - Available at block level and above
- American Community Survey (ACS) 5-Year Estimates: Socioeconomic data
  - https://www.census.gov/programs-surveys/acs
  - Available at tract level and above
  - Accessed via Census API: https://api.census.gov
Development
# Clone and install with uv
git clone https://github.com/yolodex-ai/census-lookup.git
cd census-lookup
uv sync --all-extras
# Run unit tests (fast, no network required)
uv run pytest tests/unit -v
# Run functional tests (downloads real data, slower)
uv run pytest tests/functional -v -s
# Run all tests
uv run pytest tests/ -v
# Run linting
uv run ruff check src/
License
MIT