Synthetic Microdata and Spatial MicroSimulation Modeling for ACS Data in Python

pysynthACS

A Python port of the synthACS R package, modernized with pandas, censusdis, xarray, and Rust.

pysynthACS is a high-performance Python library for generating synthetic populations from U.S. Census Bureau American Community Survey (ACS) data. It automates the process of fetching demographic data, cleaning it, and performing spatial microsimulation using a high-speed Rust-based simulated annealing engine.
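To make the microsimulation step concrete, here is a minimal pure-Python sketch of simulated annealing that picks records from a candidate pool so that category counts approach macro targets. It is not the package's Rust engine. The names `total_absolute_error` and `anneal`, the geometric cooling schedule, and the pool being a plain list of category labels are all assumptions made for illustration.

```python
import math
import random
from collections import Counter


def total_absolute_error(counts, targets):
    """Sum of |synthetic count - target count| over all categories."""
    keys = set(counts) | set(targets)
    return sum(abs(counts.get(k, 0) - targets.get(k, 0)) for k in keys)


def anneal(pool, targets, max_iter=5000, temp=1.0, cooling=0.999, seed=0):
    """Pick sum(targets.values()) records from pool to approximate targets.

    Hypothetical helper: pool is a list of category labels, targets maps
    each label to the count reported in the macro (aggregate) data.
    """
    rng = random.Random(seed)
    n = sum(targets.values())
    # Start from a random sample (with replacement) of the candidate pool.
    selected = [rng.choice(pool) for _ in range(n)]
    error = total_absolute_error(Counter(selected), targets)
    for _ in range(max_iter):
        if error == 0:
            break
        # Propose replacing one selected record with a random pool record.
        i = rng.randrange(n)
        proposal = selected[:i] + [rng.choice(pool)] + selected[i + 1:]
        # The error is recomputed from scratch here purely for clarity.
        new_error = total_absolute_error(Counter(proposal), targets)
        # Accept improvements always, worse states with Boltzmann probability.
        if new_error <= error or rng.random() < math.exp((error - new_error) / temp):
            selected, error = proposal, new_error
        temp *= cooling
    return selected, error
```

Calling `anneal(["a", "b", "c"] * 10, {"a": 5, "b": 3, "c": 2})` returns the selected records together with their remaining error. A production engine would work on multi-attribute records and several constraint tables rather than a single label.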

Key Features

  • Modern ACS Interface: Built on top of censusdis for efficient, reliable data fetching.
  • High Performance (Rust): Core optimization engine implemented in Rust using PyO3. Provides a 100x-1000x speedup over the original R implementation through:
    • Delta-TAE Updates: Constant-time $O(1)$ error calculation during swaps.
    • Zero-cost Memory: In-place state updates without expensive data copying.
    • Hybrid Annealing: Scale-agnostic fractional jumps and periodic temperature spikes for robust convergence.
  • Multi-dimensional Data Cubes: Powered by xarray for clean management of complex demographic data.
    • Semantic Selection: Access data by label (e.g., gender="male") instead of column indices.
    • Automatic Alignment: Seamlessly combine datasets with different dimensions.
    • Vectorized Operations: Perform fast aggregations across specific demographic dimensions.
  • Immutable Data Structures: Utilizes frozen Python dataclasses for robust configuration and result management.
  • Global Configuration: Easy API key management with set_api_key().
  • Comprehensive Testing & CI: Full suite of unit, integration, and performance tests with automated coverage reporting.
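The delta-update idea in the feature list can be sketched in a few lines of Python. This assumes TAE denotes total absolute error between synthetic and target category counts, and it considers a single categorical margin. The helper names `delta_tae` and `apply_swap` are hypothetical and are not taken from the package. A swap touches only two categories, so the change in error follows from those two cells alone, and the running counts are updated in place.

```python
from collections import Counter


def total_absolute_error(counts, targets):
    """Full recomputation: sum of |count - target| over all categories."""
    keys = set(counts) | set(targets)
    return sum(abs(counts.get(k, 0) - targets.get(k, 0)) for k in keys)


def delta_tae(counts, targets, old, new):
    """Change in TAE if one record of category old is replaced by one of new.

    Only the two affected cells are inspected, so the cost does not depend
    on the number of categories or on the population size.
    """
    if old == new:
        return 0
    d_old = counts.get(old, 0) - targets.get(old, 0)
    d_new = counts.get(new, 0) - targets.get(new, 0)
    return (abs(d_old - 1) - abs(d_old)) + (abs(d_new + 1) - abs(d_new))


def apply_swap(counts, old, new):
    """Update the running counts in place; the population is not copied."""
    counts[old] -= 1
    counts[new] += 1


targets = {"a": 5, "b": 3, "c": 2}
counts = Counter({"a": 3, "b": 5, "c": 2})
before = total_absolute_error(counts, targets)         # 2 + 2 + 0 = 4
change = delta_tae(counts, targets, old="b", new="a")  # -1 + -1 = -2
apply_swap(counts, "b", "a")
assert total_absolute_error(counts, targets) == before + change
```

With several constraint tables the same bookkeeping would be repeated once per table, so the cost per swap grows with the number of tables rather than with the population size.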

Installation

pysynthACS requires Python 3.13.

# Install from PyPI
uv pip install pysynthacs

Census API Key

  • Fetching ACS data requires a Census API key. Request one at https://api.census.gov/data/key_signup.html

Quick Start

```python
import pysynthacs
from pysynthacs.core.generator import SyntheticGenerator
from pysynthacs.core.data import MicroData

# Set your Census API key
pysynthacs.set_api_key("YOUR_CENSUS_API_KEY")

# 1. Pull macro data for a specific geography
gen = SyntheticGenerator(year=2022)
macro = gen.pull_macro(geography={"state": "06", "county": "041"}) # Marin County, CA

# 2. Load your candidate pool (e.g. from PUMS)
pool_df = ... # Your pandas DataFrame with a 'category' column
micro = MicroData(data=pool_df)

# 3. Generate synthetic population
synthetic_pop = gen.generate(macro, micro, max_iter=50000)
print(synthetic_pop.head())
```

Examples

For detailed walkthroughs of pysynthACS functionality, see the examples/ directory:

| Example | Description |
| --- | --- |
| 01_basic_workflow.py | Basic end-to-end flow: pulling county data, optimizing, and diagnostics. |
| 02_large_scale_optimization.py | Large-scale generation across all census tracts in a county. |
| 03_attribute_augmentation.py | Adding specialized conditional attributes (e.g. commute mode) to a population. |
| 04_demographic_simulation.py | Stochastic simulation of vital events (births/deaths) over multiple iterations. |

Documentation & Testing

To run the tests:

# Run all tests
PYTHONPATH=src uv run pytest tests/

Citation

If you use pysynthACS in your research, please cite the following paper:

@Article{synthACS,
    title = {{synthACS}: Spatial MicroSimulation Modeling with Synthetic {A}merican {C}ommunity {S}urvey Data},
    author = {Alex Whitworth},
    journal = {Journal of Statistical Software},
    year = {2022},
    volume = {104},
    number = {7},
    pages = {1--30},
    doi = {10.18637/jss.v104.i07},
}
