Synthetic Microdata and Spatial MicroSimulation Modeling for ACS Data in Python
pysynthACS
A Python port of the synthACS R package, modernized with pandas, censusdis, xarray, and Rust.
pysynthACS is a high-performance Python library for generating synthetic populations from U.S. Census Bureau American Community Survey (ACS) data. It automates the process of fetching demographic data, cleaning it, and performing spatial microsimulation using a high-speed Rust-based simulated annealing engine.
Key Features
- Modern ACS Interface: Built on top of `censusdis` for efficient, reliable data fetching.
- High Performance (Rust): Core optimization engine implemented in Rust using `PyO3`. Provides a 100x-1000x speedup over the original R implementation through:
  - Delta-TAE Updates: Constant time $O(1)$ error calculation during swaps.
  - Zero-cost Memory: In-place state updates without expensive data copying.
  - Hybrid Annealing: Scale-agnostic fractional jumps and periodic temperature spikes for robust convergence.
- Multi-dimensional Data Cubes: Powered by `xarray` for clean management of complex demographic data.
  - Semantic Selection: Access data by label (e.g., `gender="male"`) instead of column indices.
  - Automatic Alignment: Seamlessly combine datasets with different dimensions.
  - Vectorized Operations: Perform fast aggregations across specific demographic dimensions.
- Immutable Data Structures: Utilizes frozen Python dataclasses for robust configuration and result management.
- Global Configuration: Easy API key management with `set_api_key()`.
- Comprehensive Testing & CI: Full suite of unit, integration, and performance tests with automated coverage reporting.
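The Delta-TAE idea can be illustrated in plain Python. This is a minimal sketch, not the Rust engine: it assumes TAE means total absolute error against a single table of target counts, and the function names (`total_absolute_error`, `swap_delta`, `apply_swap`) are hypothetical. Swapping one individual for another changes only two cells of the count table, so the change in error can be computed from those two cells alone.

```python
def total_absolute_error(counts, targets):
    """Full recomputation: cost grows with the number of categories."""
    return sum(abs(counts[k] - targets[k]) for k in targets)


def swap_delta(counts, targets, out_cat, in_cat):
    """Change in TAE if one unit of `out_cat` is replaced by one of `in_cat`.

    Only two cells are touched, so the cost is O(1) regardless of table size.
    """
    if out_cat == in_cat:
        return 0
    before = (abs(counts[out_cat] - targets[out_cat])
              + abs(counts[in_cat] - targets[in_cat]))
    after = (abs(counts[out_cat] - 1 - targets[out_cat])
             + abs(counts[in_cat] + 1 - targets[in_cat]))
    return after - before


def apply_swap(counts, out_cat, in_cat):
    """Update the count table in place, without copying it."""
    counts[out_cat] -= 1
    counts[in_cat] += 1


# Hypothetical target and current counts for three categories.
targets = {"a": 5, "b": 3, "c": 2}
counts = {"a": 7, "b": 2, "c": 1}

tae = total_absolute_error(counts, targets)      # computed once up front
delta = swap_delta(counts, targets, "a", "b")    # O(1) per candidate swap
apply_swap(counts, "a", "b")
tae += delta                                     # running error stays exact
```

An annealing loop built this way pays for one full error computation at the start and then only constant work per proposed swap.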
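The three data-cube behaviours listed above are standard `xarray` features and can be tried without pysynthACS. The sketch below uses made-up counts and hypothetical dimension names (`tract`, `gender`, `age`); the cubes pysynthACS actually builds may be laid out differently.

```python
import numpy as np
import xarray as xr

ages = ["0-17", "18-64", "65+"]

# Hypothetical counts: 2 tracts x 2 genders x 3 age groups.
counts = xr.DataArray(
    np.arange(12).reshape(2, 2, 3),
    dims=("tract", "gender", "age"),
    coords={
        "tract": ["000100", "000200"],
        "gender": ["male", "female"],
        "age": ages,
    },
    name="population",
)

# Semantic selection: slice by label instead of by position.
male = counts.sel(gender="male")

# Vectorized operation: total population per tract.
per_tract = counts.sum(dim=["gender", "age"])

# Automatic alignment: a 1-D array over `age` broadcasts against the 3-D cube.
weights = xr.DataArray([0.5, 1.0, 2.0], dims=("age",), coords={"age": ages})
weighted = counts * weights
```

Because every operation refers to dimensions by name, the code keeps working if the axes are reordered.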
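For readers unfamiliar with frozen dataclasses, here is a short standard-library illustration of the pattern. `AnnealConfig` and its fields are hypothetical names chosen for this sketch and are not part of the pysynthACS API.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class AnnealConfig:  # hypothetical class, for illustration only
    max_iter: int = 50_000
    seed: int = 0


base = AnnealConfig()

# Derive a modified copy instead of mutating the original.
longer = replace(base, max_iter=200_000)

# Direct assignment on a frozen instance raises FrozenInstanceError.
try:
    base.max_iter = 1
except FrozenInstanceError:
    mutated = False
else:
    mutated = True
```

A frozen configuration can be shared between runs without one run's settings leaking into another.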
Installation
pysynthACS requires Python 3.13.
# Install from PyPI
uv pip install pysynthacs
#### Census API Key
- A Census API key is required. Request one [here](https://api.census.gov/data/key_signup.html).
## Quick Start
```python
import pysynthacs
from pysynthacs.core.generator import SyntheticGenerator
from pysynthacs.core.data import MicroData
# Set your Census API key
pysynthacs.set_api_key("YOUR_CENSUS_API_KEY")
# 1. Pull macro data for a specific geography
gen = SyntheticGenerator(year=2022)
macro = gen.pull_macro(geography={"state": "06", "county": "041"}) # Marin County, CA
# 2. Load your candidate pool (e.g. from PUMS)
pool_df = ... # Your pandas DataFrame with a 'category' column
micro = MicroData(data=pool_df)
# 3. Generate synthetic population
synthetic_pop = gen.generate(macro, micro, max_iter=50000)
print(synthetic_pop.head())
```
Examples
For detailed walkthroughs of pysynthACS functionality, see the examples/ directory:
| Example | Description |
|---|---|
| `01_basic_workflow.py` | Basic end-to-end flow: pulling county data, optimizing, and diagnostics. |
| `02_large_scale_optimization.py` | Large-scale generation across all census tracts in a county. |
| `03_attribute_augmentation.py` | Adding specialized conditional attributes (e.g. commute mode) to a population. |
| `04_demographic_simulation.py` | Stochastic simulation of vital events (births/deaths) over multiple iterations. |
Documentation & Testing
To run the tests:
# Run all tests
PYTHONPATH=src uv run pytest tests/
Citation
If you use pysynthACS in your research, please cite the following paper:
@Article{synthACS,
title = {{synthACS}: Spatial MicroSimulation Modeling with Synthetic {A}merican {C}ommunity {S}urvey Data},
author = {Alex Whitworth},
journal = {Journal of Statistical Software},
year = {2022},
volume = {104},
number = {7},
pages = {1--30},
doi = {10.18637/jss.v104.i07},
}