Policy unit assignment for PolicyEngine's microdata stack
microunit
microunit is PolicyEngine's unit-assignment package for microdata.
It is part of the PolicyEngine microdata stack:
- microimpute: fill missing variables and transfer attributes across data sources.
- microcalibrate (eventually maybe microweight): align microdata to external targets.
- microunit: construct policy units from person and relationship records.
- microplex: synthesize, rebuild, and evaluate full microdata systems.
microunit separates "who belongs with whom" from benefit and tax formulas. The same person table can have several policy units layered on top of it:
- SPM units for poverty measurement.
- Tax units for filing and dependency rules.
- SNAP units for food assistance eligibility.
- Medicaid MAGI households, which are usually focal-person units rather than a single partition of the household.
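The difference between a partition and focal-person units can be sketched with plain dictionaries. This is an illustration only: the person IDs, unit labels, and membership sets are made up, the helper names are hypothetical, and nothing here reflects microunit's actual classes or real MAGI rules.

```python
from collections import defaultdict

# Hypothetical household: persons 1 and 2 are a couple, person 3 is an adult child.
# A partition assigns exactly one unit ID to each person.
tax_partition = {1: "tax-A", 2: "tax-A", 3: "tax-B"}

# Focal-person ("ego") units: one membership set per person. Sets may overlap
# and need not be symmetric. These memberships are invented for illustration.
ego_units = {
    1: {1, 2, 3},
    2: {1, 2, 3},
    3: {3},
}


def members_by_unit(partition):
    """Group person IDs by their unit ID."""
    grouped = defaultdict(set)
    for person_id, unit_id in partition.items():
        grouped[unit_id].add(person_id)
    return dict(grouped)


def partition_to_ego(partition):
    """Express a partition as ego units: each person's set is their whole unit."""
    units = members_by_unit(partition)
    return {person_id: units[unit_id] for person_id, unit_id in partition.items()}


def is_partition(ego):
    """True when every member of a focal person's unit has the same membership set."""
    return all(ego[m] == members for members in ego.values() for m in members)


print(members_by_unit(tax_partition))
print(is_partition(ego_units))
print(is_partition(partition_to_ego(tax_partition)))
```

Any partition can be written as ego units, but the reverse does not hold: here person 3 appears in the parents' sets while person 3's own set contains only themselves.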
The package starts with the common primitives those systems need:
- UnitPartition: one unit ID per person, useful for SPM, tax, and many SNAP assignment outputs.
- EgoUnitMembership: one membership set per focal person, useful for MAGI-like rules where units can overlap.
- SPM simplification adapters for programs whose true unit rules are not yet implemented.
- Diagnostics for comparing partitions within households.
- Conservative adapters that preserve existing unit IDs from source data.
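As a rough idea of what comparing partitions within households could involve, the sketch below flags households where two partitions group people differently. The function names and the dictionary-based inputs are assumptions made for this example; they are not microunit's diagnostics API.

```python
from collections import defaultdict


def blocks(partition, people):
    """Return the groups a partition induces on the given people, ignoring unit labels."""
    grouped = defaultdict(set)
    for person_id in people:
        grouped[partition[person_id]].add(person_id)
    return {frozenset(members) for members in grouped.values()}


def households_where_partitions_differ(household_of, partition_a, partition_b):
    """List household IDs in which the two partitions group people differently."""
    people_by_household = defaultdict(list)
    for person_id, household_id in household_of.items():
        people_by_household[household_id].append(person_id)
    return sorted(
        household_id
        for household_id, people in people_by_household.items()
        if blocks(partition_a, people) != blocks(partition_b, people)
    )


# Hypothetical data: household 10 has three people, household 20 has two.
household_of = {1: 10, 2: 10, 3: 10, 4: 20, 5: 20}
spm_units = {1: "s1", 2: "s1", 3: "s1", 4: "s2", 5: "s2"}
tax_units = {1: "t1", 2: "t1", 3: "t2", 4: "t3", 5: "t3"}

print(households_where_partitions_differ(household_of, spm_units, tax_units))
```

Comparing the induced groups rather than the raw IDs means that two partitions with different labeling schemes still count as equal when they group people the same way.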
Install
uv pip install -e ".[dev]"
Example
import pandas as pd
from microunit.units import assign_spm_partition
persons = pd.DataFrame(
    {
        "person_id": [1, 2, 3],
        "household_id": [10, 10, 10],
        "family_id": [100, 100, 101],
    }
)
partition = assign_spm_partition(persons)
print(partition.to_frame())
Rules-based tax-unit construction
microunit includes the rules-based tax-unit / filing-status construction
engine extracted from policyengine-us-data.
It applies federal filing and dependency rules to assign people into tax
units, infer each person's role (head / spouse / dependent), and infer a
filing status per unit. It is the same engine reused across the CPS and ACS
pipelines there, and is source-agnostic: it operates on
already-normalized, CPS-like person frames. It is consumed by
policyengine-us-data and microplex-us.
import pandas as pd
from microunit import construct_tax_units
# person uses CPS-like column names (see "Input contract" below).
person_assignments, tax_unit = construct_tax_units(person, year=2024)
construct_tax_units(person, year, mode="policyengine") returns:
- person_assignments (indexed like the input): TAX_ID (int64, dense 1-based id), tax_unit_role_input (bytes: HEAD / SPOUSE / DEPENDENT), is_related_to_head_or_spouse (bool).
- tax_unit (one row per TAX_ID): filing_status_input (bytes: JOINT / HEAD_OF_HOUSEHOLD / SURVIVING_SPOUSE / SEPARATE / SINGLE).
The string columns are byte strings (the HDF5-friendly encoding used by the
source pipeline); decode with .decode().
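A minimal sketch of the decoding step, using a plain list in place of a real output column. The helper name decode_labels is hypothetical, and the values are taken from the role labels listed above.

```python
def decode_labels(values, encoding="utf-8"):
    """Decode byte-string labels to str, leaving non-bytes values untouched."""
    return [v.decode(encoding) if isinstance(v, bytes) else v for v in values]


# Stand-in for a tax_unit_role_input column.
roles = [b"HEAD", b"SPOUSE", b"DEPENDENT"]
print(decode_labels(roles))  # ['HEAD', 'SPOUSE', 'DEPENDENT']
```

On a pandas Series of byte strings, Series.str.decode("utf-8") performs the same conversion in one call.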
A UnitPartition adapter is also provided:
from microunit.units import construct_tax_partition
partition = construct_tax_partition(person, year=2024) # UnitPartition(unit_type="tax")
Modes
- "policyengine" (default, microunit.POLICYENGINE_MODE): PolicyEngine's dependency/filing-rule flow.
- "census_documented" (microunit.CENSUS_DOCUMENTED_MODE): the publicly documented Census tax-model flow.
Input contract
Required CPS columns (raises KeyError if missing): PH_SEQ, A_LINENO,
A_AGE, A_MARITL, A_SPOUSE, PEPAR1, PEPAR2, A_EXPRRP.
Optional evidence columns (used when present, safely defaulted otherwise):
income components (WSAL_VAL, SEMP_VAL, FRSE_VAL, INT_VAL, DIV_VAL,
RNT_VAL, CAP_VAL, UC_VAL, OI_VAL, ANN_VAL, PNSN_VAL, SS_VAL),
total money income (PTOTVAL), enrollment (A_ENRLW, A_FTPT, A_HSCOL),
and disability flags (PEDISDRS, PEDISEAR, PEDISEYE, PEDISOUT,
PEDISPHY, PEDISREM). Relationship codes follow the CPS ASEC A_EXPRRP
recode, exposed as microunit.CPSRelationshipCode.
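Callers who want to fail early with a clearer message can check the required columns themselves before calling the engine. The sketch below is one way to do that; the helper names are hypothetical and not part of microunit, and only the column names come from the contract above.

```python
REQUIRED_COLUMNS = (
    "PH_SEQ", "A_LINENO", "A_AGE", "A_MARITL",
    "A_SPOUSE", "PEPAR1", "PEPAR2", "A_EXPRRP",
)


def missing_required_columns(columns):
    """Return the required CPS columns absent from `columns`, in contract order."""
    present = set(columns)
    return [name for name in REQUIRED_COLUMNS if name not in present]


def check_required_columns(columns):
    """Raise KeyError naming every missing required column."""
    missing = missing_required_columns(columns)
    if missing:
        raise KeyError(f"Missing required CPS columns: {missing}")


# With a DataFrame, pass `person.columns`; a plain list works the same way.
print(missing_required_columns(["PH_SEQ", "A_LINENO", "A_AGE"]))
```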
ACS column mapping is the consumer's responsibility
The ACS PUMS -> CPS column mapping (acs_to_cps_columns.py in
policyengine-us-data) is not part of microunit. That ~500-line module
is ACS-PUMS-specific (RELSHIPP/RELP translation, marital-status recoding,
and heuristic spouse/parent-pointer inference, since ACS provides no universal
spouse or parent pointers) and belongs with the ACS reader. Consumers reading
ACS should map their PUMS columns onto the CPS-like contract above and then
call construct_tax_units. Accordingly, the ACS-specific tests from
policyengine-us-data remain there; the full CPS construction test suite is
ported here.
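To give a sense of the shape of such a mapping, the sketch below renames three PUMS fields that plausibly carry over without recoding. This is an assumption for illustration, not the mapping used by acs_to_cps_columns.py: the choice of fields, the helper name, and the sample record are all hypothetical, and the hard parts (relationship and marital-status recodes, spouse and parent pointers) are deliberately left out.

```python
# Assumption: these PUMS fields map directly onto the CPS-like contract.
# This is NOT the mapping shipped in policyengine-us-data.
DIRECT_RENAMES = {
    "SERIALNO": "PH_SEQ",   # housing-unit serial number -> household identifier
    "SPORDER": "A_LINENO",  # person number within the unit -> line number
    "AGEP": "A_AGE",        # age
}


def rename_direct_fields(record):
    """Rename directly mappable PUMS fields; leave all other fields untouched."""
    return {DIRECT_RENAMES.get(name, name): value for name, value in record.items()}


# Hypothetical PUMS person record. RELSHIPP still needs a real recode to A_EXPRRP.
pums_person = {"SERIALNO": "H1", "SPORDER": 1, "AGEP": 34, "RELSHIPP": 20}
cps_like = rename_direct_fields(pums_person)
print(cps_like)
```

Note that a rename alone does not satisfy the contract: A_MARITL, A_SPOUSE, PEPAR1, PEPAR2, and A_EXPRRP still have to be derived by the ACS reader.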
Packaged data
The qualifying-relative gross income limit (the personal/dependent exemption
amount under IRC 151(d), used by the IRC 152(d)(1)(B) gross income test) ships
as package data at microunit/data/dependent_gross_income_limit.yaml and is
loaded via importlib.resources, so the engine does not depend on
policyengine-us being installed.
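The general pattern of reading a bundled data file through importlib.resources looks roughly like the sketch below. To stay runnable without microunit installed, it builds a throwaway package named demo_pkg with a placeholder file; the package name, file name, file contents, and the read_package_text helper are all invented for the example and say nothing about the real YAML file or how microunit loads it.

```python
import importlib
import importlib.resources
import sys
import tempfile
from pathlib import Path


def read_package_text(package, relative_path):
    """Read a text resource bundled inside an importable package."""
    resource = importlib.resources.files(package).joinpath(relative_path)
    return resource.read_text(encoding="utf-8")


# Build a temporary package on disk so the example is self-contained.
tmp_dir = tempfile.mkdtemp()
package_dir = Path(tmp_dir) / "demo_pkg"
(package_dir / "data").mkdir(parents=True)
(package_dir / "__init__.py").write_text("", encoding="utf-8")
(package_dir / "data" / "limit.yaml").write_text("placeholder: true\n", encoding="utf-8")

# Make the temporary package importable.
sys.path.insert(0, tmp_dir)
importlib.invalidate_caches()

text = read_package_text("demo_pkg", "data/limit.yaml")
print(text)
```

Resolving the file relative to the package rather than the working directory is what lets the data travel with the installed wheel.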
Scope
This package should construct unit assignments and explain them. It should not calculate benefits, taxes, or eligibility amounts. Policy engines remain responsible for program formulas.
Near-term roadmap:
- Move reusable SPM unit assignment out of spm-calculator.
- Move reusable tax-unit construction out of policyengine-us-data/policyengine-us. (Done -- see "Rules-based tax-unit construction" above.)
- Add CPS and ACS source adapters for Microplex.
- Use SPM units as the temporary simplification for SNAP, Medicaid/MAGI, and other program units.
- Replace those simplifications with real program rules once Microplex has a stable end-to-end unit pipeline.