
Project description

pycen

Lightweight Python package for exploring and acquiring U.S. Census data with intuitive spatial integration.

flowchart TD
    A[Need Census data?]

    A --> B & C

    subgraph PYCEN["<i>pycen</i>"]
        direction TB
        B[<b>`explore`</b><br/>Intuitive metadata<br/>keyword search]
        C[<b>`acquire`</b><br/>Data + boundaries<br/>in one call]

        C --> D
        C --> E

        D[<b>`quick_check`</b><br/>Quality validation]
        E[<b>`quick_viz`</b><br/>Instant maps]
    end

    B --> F
    D & E --> F[Domain analysis]

    style A fill:#94a3b8,stroke:#334155,stroke-width:2px,color:#000
    style B fill:#3b82f6,stroke:#1e40af,stroke-width:2px,color:#fff
    style C fill:#3b82f6,stroke:#1e40af,stroke-width:2px,color:#fff
    style D fill:#22c55e,stroke:#15803d,stroke-width:2px,color:#fff
    style E fill:#22c55e,stroke:#15803d,stroke-width:2px,color:#fff
    style F fill:#94a3b8,stroke:#334155,stroke-width:2px,color:#000
    style PYCEN fill:#1e293b,stroke:#64748b,stroke-width:2px,color:#fff

overview

pycen makes exploring and acquiring U.S. Census data accessible and intuitive for spatial workflows. The explore module presents browsable Census API metadata as topic-organized, interactive nested tables, with customizable themes that highlight curated variable recipes; it also supports natural-language keyword search for efficient variable discovery. The acquire module streamlines data processing: one function call returns both data and boundaries as a GeoDataFrame, with built-in quality checks and rapid visualizations; simple tabular or boundaries-only downloads are separately callable. pycen pulls live data products and keeps efficient local caches so iterations stay fast, smooth, and reproducible. A multi-year fetch function enables longitudinal comparisons that track change over time.

sample use

basic workflow

import pycen
from pycen import explore, acquire

# 1. Explore variables
# `browse` and `search` return interactive tables
# `lookup` returns details
explore.browse(year=2023, dataset="acs5").show()
explore.search("vehicle", year=2023, dataset="acs5").show()
explore.lookup("B08201_002E", year=2021, dataset="acs5")

# 2. Acquire data
## continental US income gini map
gdf = acquire.get_censhp(
    variables={"B19083_001E":"gini_index"},
    geography="place",               # if no state/county, gets nationwide
    dataset="acs5",
    year=2023,
)
acquire.quick_check(gdf)             # returns N/A summary
acquire.quick_viz(gdf, "gini_index") # returns map + distribution histogram
acquire.quick_viz(gdf, "gini_index", palette="viridis") # optional customizable palette
acquire.quick_viz(gdf, "gini_index", save_path='gini_index.png') # optional save

## finer scale
## Cook County income gini at tract level
gdf = acquire.get_censhp(
    variables={"B19083_001E":"gini_index"},
    geography="tract",
    county="Cook County",
    state="IL",
    dataset="acs5",
    year=2023,
)
acquire.quick_viz(gdf, "gini_index")

## neighborhood analyses
## Chicago super commuters
gdf = acquire.get_censhp(
    variables={"B08303_012E":"commute_over_60min", "B08303_001E":"total_commuters"},
    geography="block group",
    place="Chicago city",
    #county="Cook County",  # optional, add for clarity
    state="IL",
    dataset="acs5",
    year=2023
)
gdf["pct_super_commuters"] = gdf["commute_over_60min"] / gdf["total_commuters"] * 100
acquire.quick_viz(gdf, "pct_super_commuters")

## decennial data supports block-scale (finest)
## Chicago housing vacancy rates at block level
select_var = {
    "H001003": "vacant_hh",
    "H001001": "total_hh"
}
gdf = acquire.get_censhp(
    variables=select_var,
    geography="block",
    county="Cook County",
    state="IL",
    dataset="dec_pl",
    year=2010,
)
gdf['vacancy_rate'] = gdf['vacant_hh'] / gdf['total_hh'] * 100
acquire.quick_viz(gdf, "vacancy_rate")
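
At block scale some areas have zero total households, so the vacancy division above can produce inf or NaN. One way to guard against that, sketched with plain pandas on toy data (independent of pycen):

```python
import numpy as np
import pandas as pd

# Toy block-level counts, including zero-household blocks.
blocks = pd.DataFrame({"vacant_hh": [3, 0, 5], "total_hh": [10, 0, 0]})

# Compute the rate only where the denominator is positive; zero-household
# blocks get NaN instead of inf, which keeps histograms and maps sane.
blocks["vacancy_rate"] = np.where(
    blocks["total_hh"] > 0,
    blocks["vacant_hh"] / blocks["total_hh"] * 100,
    np.nan,
)
```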

tabular data workflow

# 3. Tabular data only
df = acquire.get_census(
    variables=["B25032_022E"],  # renter-occupied, mobile home
    geography="tract",
    state="CA",
    year=2021,
)

# 4. Single-year, multivariable tabular data for comparative analysis
import pandas as pd
import matplotlib.pyplot as plt
from pycen import acquire

vars_race = {
    'B03002_001E': 'total',
    'B03002_003E': 'nh_white',
    'B03002_004E': 'nh_black',
    'B03002_006E': 'nh_asian',
    'B03002_005E': 'nh_aian',
    'B03002_007E': 'nh_nhpi',
    'B03002_008E': 'nh_other',
    'B03002_009E': 'nh_two_or_more',
    'B03002_012E': 'hispanic',
}

df_race = acquire.get_census(
    variables=vars_race,
    geography='county',
    state='CA',
    county='Alameda',
    dataset='acs5',
    year=2023,
)

row = df_race.iloc[0]
other = row['nh_aian'] + row['nh_nhpi'] + row['nh_other'] + row['nh_two_or_more']
vals = {
    'White (NH)': row['nh_white'],
    'Black (NH)': row['nh_black'],
    'Asian (NH)': row['nh_asian'],
    'Other (NH)': other,
    'Hispanic (any race)': row['hispanic'],
}

pct = {k: v / row['total'] * 100 for k, v in vals.items()}

plt.figure(figsize=(7, 4))
plt.bar(pct.keys(), pct.values(), color=['#4c78a8', '#f58518', '#54a24b', '#b279a2', '#e45756'])
plt.ylabel('Population %')
plt.title('Alameda County, CA – Race/Ethnicity (ACS 2023)')
plt.xticks(rotation=25, ha='right')
plt.tight_layout()
plt.show()

# 5. Multi-year tabular data for trend analysis
# comparative tracking of remote work surge (2019–2023)
from pycen import acquire
import matplotlib.pyplot as plt

# explore.search("work from home", year=2023, dataset="acs5").show()
# B08101_049E = worked from home
df_long = acquire.get_census(
    variables={'B08101_049E': 'wfh_workers', 'B08101_001E': 'total_workers'},
    geography='county',
    state='CA',
    years=[2019, 2020, 2021, 2022, 2023],
    merge='long'
)

df_long['wfh_pct'] = (df_long['wfh_workers'] / df_long['total_workers']) * 100
bay_area = df_long[df_long['NAME'].str.contains('San Francisco|Alameda|Santa Clara|Contra Costa|San Mateo')]

for county in bay_area['NAME'].unique():
    county_data = bay_area[bay_area['NAME'] == county]
    plt.plot(county_data['year'], county_data['wfh_pct'], marker='o', label=county)

plt.title('Bay Area WFH 2019-2023')
plt.ylabel('Work From Home (%)')
plt.xlabel('Year')
plt.xticks(sorted(df_long['year'].unique()))
plt.legend()
plt.grid(alpha=0.3)
plt.tight_layout()
plt.show()
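
The merge='long' option above is what makes the per-year line plot possible: it presumably stacks results as one row per geography-year. As a mental model of that shape, a toy reshape with plain pandas (not actual pycen output):

```python
import pandas as pd

# Toy wide table: one WFH column per year, as separate single-year pulls
# might return. merge='long' is assumed to produce the stacked shape below.
wide = pd.DataFrame({
    "NAME": ["Alameda County", "San Mateo County"],
    "wfh_pct_2019": [6.1, 7.4],
    "wfh_pct_2023": [18.9, 21.2],
})
long = wide.melt(id_vars="NAME", var_name="year", value_name="wfh_pct")
long["year"] = long["year"].str.removeprefix("wfh_pct_").astype(int)
# One row per (NAME, year), ready for groupby or line plots.
```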

core functions

Explore

  • explore.search(query, year, dataset) - supports exact term match and fuzzy keyword search
  • explore.browse(year, dataset) - view all variables via interactive tree table with theme variable highlights
  • explore.lookup(code, year, dataset) - inspect variable details

Acquire

  • acquire.get_censhp(...) - data + boundaries --> GeoDataFrame
  • acquire.get_census(...) - data only --> DataFrame
  • acquire.get_boundaries(...) - boundaries only --> shp/gpkg
  • acquire.quick_check(gdf) - N/A values summary
  • acquire.quick_viz(gdf, column, palette, save_path) - exploratory map + distribution histogram for select variable
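
As a rough mental model of what quick_viz produces, a toy matplotlib sketch pairing a placeholder map panel with a distribution histogram (fake data and layout assumptions, not pycen's implementation):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
values = rng.normal(0.45, 0.05, 200)       # fake gini-like values
xs, ys = rng.random(200), rng.random(200)  # fake centroids

# Left panel stands in for the choropleth; right panel is the histogram.
fig, (ax_map, ax_hist) = plt.subplots(1, 2, figsize=(9, 4))
ax_map.scatter(xs, ys, c=values, cmap="viridis", s=12)
ax_map.set_title("map panel (placeholder points)")
ax_hist.hist(values, bins=20)
ax_hist.set_title("distribution")
fig.tight_layout()
```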

Info

  • pycen.get_product() - list datasets and years
  • pycen.get_geography() - list geography levels by dataset

Geo Helpers

from pycen import geography
geography.search('Oakland', state='CA') # most powerful search; returns all related info

# state and county lookup
geography.state('CA') # can also search by 'California' or fips code '06'
geography.county('Alameda', state='CA')

# list geographies
geography.list_places('CA', query='Oakland') # minimal search
geography.list_cbsa(query='new york', year=2023, limit=5) # specify year and return limit on multi-match
geography.list_csa(query='detroit', year=2023, limit=5)   # look up CSA name
geography.list_counties('CA')
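
These helpers accept names, abbreviations, or FIPS codes interchangeably. As a toy illustration of that multi-key lookup idea (two hardcoded sample rows, not pycen's data or internals):

```python
# Minimal lookup table keyed three ways, mirroring how geography.state()
# accepts 'CA', 'California', or the FIPS code '06'. Sample rows only.
STATES = [
    {"fips": "06", "abbr": "CA", "name": "California"},
    {"fips": "17", "abbr": "IL", "name": "Illinois"},
]

def lookup_state(query):
    """Return the matching state record, or None if nothing matches."""
    q = query.strip().lower()
    for row in STATES:
        if q in (row["fips"], row["abbr"].lower(), row["name"].lower()):
            return row
    return None
```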

Themes

  • pycen.set_theme(name_or_dict) - set active theme name or register a custom theme (dict)
  • pycen.get_theme_settings() - get active theme name (defaults to a general curation of useful variables)
  • pycen.explore.get_theme(name=None) - get theme details (dict); defaults to active theme
  • pycen.list_themes() - list available theme names (includes session custom themes)
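
set_theme accepts either a name or a dict, but the dict schema isn't spelled out here, so the keys below are assumptions shown only to illustrate registering a session custom theme:

```python
# Hypothetical custom theme: the key names ("name", "variables") are
# assumptions, not documented pycen schema.
housing_theme = {
    "name": "housing",
    "variables": {
        "B25003_002E": "owner_occupied",
        "B25003_003E": "renter_occupied",
    },
}

# With pycen installed, registration and activation would look like:
# pycen.set_theme(housing_theme)   # register the custom theme
# pycen.set_theme("housing")       # later, activate it by name
# pycen.list_themes()              # now includes "housing"
```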

Notes

  • Datasets: acs5, acs1, dec_pl, dec_sf1
  • Spatial features require: geopandas, pygris
  • Geographies are resolved per dataset/year from Census geography metadata (live/cache/static)
  • Optional: rich enables prettier terminal tables for explore.search().show()
  • geography.search() uses a bundled 2020 snapshot by default; if a different vintage is requested, it attempts a live code-list fetch and falls back to 2020 if unavailable

API key for higher rate limits:

pycen.set_api_key("YOUR_KEY")  # get key at api.census.gov/data/key_signup.html


Download files

Download the file for your platform.

Source Distribution

pycen-0.1.0a4.tar.gz (608.0 kB)

Uploaded Source

Built Distribution


pycen-0.1.0a4-py3-none-any.whl (611.1 kB)

Uploaded Python 3

File details

Details for the file pycen-0.1.0a4.tar.gz.

File metadata

  • Download URL: pycen-0.1.0a4.tar.gz
  • Upload date:
  • Size: 608.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.14

File hashes

Hashes for pycen-0.1.0a4.tar.gz:

  • SHA256: a10480ec541925a24029bfdf109e9a7ee6015f27d7a762f9a64a5c9268bf3893
  • MD5: e934eff5cff4824c630c7e5d30d5a01d
  • BLAKE2b-256: 5f1d9ff03070395ed81158c2b947595e1bf3edef2197b6c0eb0823eccd795631


File details

Details for the file pycen-0.1.0a4-py3-none-any.whl.

File metadata

  • Download URL: pycen-0.1.0a4-py3-none-any.whl
  • Upload date:
  • Size: 611.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.14

File hashes

Hashes for pycen-0.1.0a4-py3-none-any.whl:

  • SHA256: ff0453e9f3f5d36db86afc413b957752409b829387207a28e2f100e4b44e90dd
  • MD5: 0b65175ffef58c9582a86773a1361c0d
  • BLAKE2b-256: 63dbe904a94ac733522b67a06f3e3e56205352141b660718511242b67d3cf6ec

