Lightweight Python package for intuitively exploring and acquiring U.S. Census data with spatial integration
pycen
Lightweight Python package for exploring and acquiring U.S. Census data with intuitive spatial integration.
flowchart TD
A[Need Census data?]
A --> B & C
subgraph PYCEN["<i>pycen</i>"]
direction TB
B[<b>`explore`</b><br/>Intuitive metadata<br/>keyword search]
C[<b>`acquire`</b><br/>Data + boundaries<br/>in one call]
C --> D
C --> E
D[<b>`quick_check`</b><br/>Quality validation]
E[<b>`quick_viz`</b><br/>Instant maps]
end
B --> F
D & E --> F[Domain analysis]
style A fill:#94a3b8,stroke:#334155,stroke-width:2px,color:#000
style B fill:#3b82f6,stroke:#1e40af,stroke-width:2px,color:#fff
style C fill:#3b82f6,stroke:#1e40af,stroke-width:2px,color:#fff
style D fill:#22c55e,stroke:#15803d,stroke-width:2px,color:#fff
style E fill:#22c55e,stroke:#15803d,stroke-width:2px,color:#fff
style F fill:#94a3b8,stroke:#334155,stroke-width:2px,color:#000
style PYCEN fill:#1e293b,stroke:#64748b,stroke-width:2px,color:#fff
overview
pycen makes exploring and acquiring U.S. Census data accessible and intuitive for spatial workflows. The explore module presents Census API metadata as browsable, topic-organized, interactive nested tables, with customizable themes that highlight curated variable recipes. It also supports natural-language keyword search for efficient variable discovery. The acquire module streamlines data processing: one function call returns both data and boundaries as a GeoDataFrame, with built-in quality checks and rapid visualizations; tabular-only and boundaries-only downloads are separately callable. pycen pulls live data products and caches them locally to keep iterations fast, smooth, and reproducible. The multi-year fetch function enables longitudinal comparisons that track change over time.
sample use
# basic workflow
import pycen
from pycen import explore, acquire
# 1. Explore variables
# `browse` and `search` return interactive tables
# `lookup` returns details
explore.browse(year=2023, dataset="acs5").show()
explore.search("vehicle", year=2023, dataset="acs5").show()
explore.lookup("B08201_002E", year=2021, dataset="acs5")
# 2. Acquire data
## continental US income gini map
gdf = acquire.get_censhp(
    variables={"B19083_001E": "gini_index"},
    geography="place",  # if no state/county, gets nationwide
    dataset="acs5",
    year=2023,
)
acquire.quick_check(gdf) # returns N/A summary
acquire.quick_viz(gdf, "gini_index") # returns map + distribution histogram
## finer scale
## Cook County income gini at tract level
gdf = acquire.get_censhp(
    variables={"B19083_001E": "gini_index"},
    geography="tract",
    county="Cook County",
    state="IL",
    dataset="acs5",
    year=2023,
)
acquire.quick_viz(gdf, "gini_index")
## neighborhood analyses
## Chicago super commuters
gdf = acquire.get_censhp(
    # B08303_013E counts commutes of 90 minutes or more
    variables={"B08303_013E": "commute_90min_plus", "B08303_001E": "total_commuters"},
    geography="block group",
    county="Cook County",
    state="IL",
    dataset="acs5",
    year=2023,
    clip_to="place",  # default off, clip to [place/cbsa/csa]
    place="Chicago city",
)
gdf["pct_super_commuters"] = gdf["commute_90min_plus"] / gdf["total_commuters"] * 100
acquire.quick_viz(gdf, "pct_super_commuters")
## decennial data supports block-scale (finest)
## Cook County housing vacancy rates at block level
select_var = {
    "H001003": "vacant_hh",
    "H001001": "total_hh",
}
gdf = acquire.get_censhp(
    variables=select_var,
    geography="block",
    county="Cook County",
    state="IL",
    dataset="dec_pl",
    year=2010,
)
gdf['vacancy_rate'] = gdf['vacant_hh'] / gdf['total_hh'] * 100
acquire.quick_viz(gdf, "vacancy_rate")
# 3. Tabular data only
df = acquire.get_census(
    variables=["B25032_022E"],  # renter-occupied, mobile home
    geography="tract",
    state="CA",
    year=2021,
)
# 4. Multi-year tabular data for trend analysis
# comparative tracking of remote work surge (2019–2023)
from pycen import acquire
import matplotlib.pyplot as plt
# explore.search("work from home", year=2023, dataset="acs5").show()
# B08101_049E = worked from home
df_long = acquire.get_census(
    variables={'B08101_049E': 'wfh_workers', 'B08101_001E': 'total_workers'},
    geography='county',
    state='CA',
    years=[2019, 2020, 2021, 2022, 2023],
    merge='long'
)
df_long['wfh_pct'] = (df_long['wfh_workers'] / df_long['total_workers']) * 100
bay_area = df_long[df_long['NAME'].str.contains('San Francisco|Alameda|Santa Clara|Contra Costa|San Mateo')]
for county in bay_area['NAME'].unique():
    county_data = bay_area[bay_area['NAME'] == county]
    plt.plot(county_data['year'], county_data['wfh_pct'], marker='o', label=county)
plt.title('Bay Area WFH 2019-2023')
plt.ylabel('Work From Home (%)')
plt.xlabel('Year')
plt.legend()
plt.grid(alpha=0.3)
plt.tight_layout()
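The ratio columns computed in the examples above divide one count by another, and small geographies such as blocks can have a denominator of zero. With pandas, such rows come out as NaN or inf rather than raising an error. The sketch below shows one way to guard the division and to summarize change between two years in a long layout. It uses only the standard library, the rows are made-up illustrative values (not real Census figures), and the helper names `pct` and `change_by_county` are hypothetical and not part of pycen.

```python
from collections import defaultdict

# Made-up rows in a long layout (one row per county per year).
# These are illustrative values only, NOT real Census figures.
rows = [
    {"NAME": "County A", "year": 2019, "wfh_workers": 50, "total_workers": 1000},
    {"NAME": "County A", "year": 2023, "wfh_workers": 180, "total_workers": 1200},
    {"NAME": "County B", "year": 2019, "wfh_workers": 30, "total_workers": 600},
    {"NAME": "County B", "year": 2023, "wfh_workers": 0, "total_workers": 0},
]


def pct(numer, denom):
    """Return numer/denom as a percentage, or None if denom is zero or missing."""
    if not denom:
        return None
    return 100 * numer / denom


def change_by_county(rows, first_year, last_year):
    """Percentage-point change in the WFH share between two years, per county."""
    series = defaultdict(dict)
    for r in rows:
        series[r["NAME"]][r["year"]] = pct(r["wfh_workers"], r["total_workers"])
    out = {}
    for name, by_year in series.items():
        first, last = by_year.get(first_year), by_year.get(last_year)
        # Propagate "unknown" instead of computing with a missing share.
        out[name] = None if first is None or last is None else last - first
    return out


changes = change_by_county(rows, 2019, 2023)
for name, delta in changes.items():
    print(name, "n/a" if delta is None else f"{delta:+.1f} pts")
```

Returning None for an undefined share keeps those rows visible as missing, which is the same idea as checking a GeoDataFrame for missing values before mapping it.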
core functions
Explore
- explore.search(query, year, dataset) - supports exact term match and fuzzy keyword search
- explore.browse(year, dataset) - view all variables via interactive tree table with theme variable highlights
- explore.lookup(code, year, dataset) - inspect variable details
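To illustrate what "exact term match and fuzzy keyword search" can mean in practice, here is a minimal sketch built on the standard library's `difflib`. It is not pycen's implementation, which this page does not describe. The `CATALOG` dictionary, the shortened labels, the `search_labels` function, and the 0.8 cutoff are all assumptions made for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical mini-catalog: variable code -> shortened, paraphrased label.
CATALOG = {
    "B08201_002E": "No vehicle available",
    "B19083_001E": "Gini index",
    "B08101_049E": "Worked from home",
    "B25032_022E": "Renter-occupied, mobile home",
}


def search_labels(query, catalog, cutoff=0.8):
    """Return (code, label, score) hits, exact substring matches first."""
    q = query.lower().strip()
    hits = []
    for code, label in catalog.items():
        text = label.lower()
        if q in text:
            # Exact term match: the query appears verbatim in the label.
            hits.append((code, label, 1.0))
            continue
        # Fuzzy match: compare the query with each word of the label.
        words = text.replace(",", " ").split()
        best = max(SequenceMatcher(None, q, w).ratio() for w in words)
        if best >= cutoff:
            hits.append((code, label, round(best, 2)))
    return sorted(hits, key=lambda hit: hit[2], reverse=True)


print(search_labels("vehicle", CATALOG))   # exact match
print(search_labels("vehicel", CATALOG))   # misspelled query, fuzzy match
```

A word-level similarity cutoff tolerates small typos while keeping unrelated labels out of the results. Raising the cutoff makes the search stricter.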
Acquire
- acquire.get_censhp(...) - data + boundaries --> GeoDataFrame
- acquire.get_census(...) - data only --> DataFrame
- acquire.get_boundaries(...) - boundaries only --> shp/gpkg
- acquire.quick_check(gdf) - N/A values summary
- acquire.quick_viz(gdf, column) - exploratory map + distribution histogram for select variable
Info
- pycen.get_product() - list datasets and years
- pycen.get_geography() - list geography levels by dataset
Themes
- pycen.set_theme(name_or_dict) - set active theme name or register a custom theme (dict)
- pycen.get_theme_settings() - get active theme name (defaults to a general curation of useful variables)
- pycen.explore.get_theme(name=None) - get theme details (dict); defaults to active theme
- pycen.list_themes() - list available theme names (includes session custom themes)
Notes
- Datasets: acs5, acs1, dec_pl, dec_sf1
- Spatial features require: geopandas, pygris
- Geographies are resolved per dataset/year from Census geography metadata (live/cache/static)
API key for higher rate limits:
pycen.set_api_key("YOUR_KEY") # get key at api.census.gov/data/key_signup.html
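To keep the key out of scripts and notebooks, one option is to read it from an environment variable and pass it to `pycen.set_api_key`. The sketch below assumes a variable named `CENSUS_API_KEY`. That name and the `load_census_key` helper are illustrative choices, not something pycen defines.

```python
import os


def load_census_key(env=None, var_name="CENSUS_API_KEY"):
    """Return the key stored in `var_name`, or None if it is unset or blank.

    `var_name` is an assumed convention, not a name pycen looks for itself.
    """
    env = os.environ if env is None else env
    key = (env.get(var_name) or "").strip()
    return key or None


if __name__ == "__main__":
    key = load_census_key()
    if key:
        import pycen

        pycen.set_api_key(key)  # the only pycen call used here
    else:
        print("CENSUS_API_KEY is not set; continuing without an API key")
```

Passing a plain dictionary as `env` makes the helper easy to check without touching the real environment, for example `load_census_key({"CENSUS_API_KEY": "abc"})`.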
Download files
File details
Details for the file pycen-0.1.0a2.tar.gz.
File metadata
- Download URL: pycen-0.1.0a2.tar.gz
- Upload date:
- Size: 70.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.14
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 6100ee026143c48a34eecad25672ae61263cd88284c0cfa3da1fd04953bffdb5 |
| MD5 | 068b6f99be6a92f1414428365847b1c9 |
| BLAKE2b-256 | 8813d6f8057c3268bc9a59d3c3cc89c8525728714a95e936f81d2cded9719d83 |
File details
Details for the file pycen-0.1.0a2-py3-none-any.whl.
File metadata
- Download URL: pycen-0.1.0a2-py3-none-any.whl
- Upload date:
- Size: 74.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.14
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 965773f95ae40d20ce5e193b648050f30f3b0ffc72fc4626ee94066679672f26 |
| MD5 | 2cc0d12ed46d1091f02015bee0a820eb |
| BLAKE2b-256 | d9a02350f5d346889868a55e57367db98782ce4a6e14d7f8f73968532c027394 |