A package to work with IPUMS microdata with Polars.
Project description
polars_ipums
A package to work with IPUMS microdata in Python. Used in-house at Opportunity Insights.
Example
# convert an IPUMS microdata export to a hive-partitioned Parquet dataset
import polars as pl
from polars_ipums import create_parquet_dataset
input_path = "~/Downloads/ipums_export"
output_path = "~/Desktop/parquet_ipums"
labels = {
    # use the default IPUMS labels for the sex column
    "sex": {},
    # use custom labels for the race/hispanic origin column
    "rachsing": {
        "White": "White",
        "Black/African American": "Black",
        "American Indian/Alaska Native": "AIAN",
        "Asian/Pacific Islander": "Asian",
        "Hispanic/Latino": "Hispanic",
    },
}
# give a few columns more human-readable names
renames = {
    "rachsing": "race",
    "countyfip": "county",
    "ftotinc": "family_income",
    "hhincome": "household_income",
}
# custom education column!
educd = pl.col("educd")
my_education = (
    pl.when(educd.le(61))
    .then(0)
    .when(educd.is_between(62, 64))
    .then(1)
    .when(educd.is_between(65, 100))
    .then(2)
    .when(educd.is_between(101, 116))
    .then(3)
    .alias("my_education")
)
create_parquet_dataset(
    input_path,
    output_path,
    labels=labels,
    partition_by=["year"],
    renames=renames,
    additional_columns=[my_education],
    override_output=True,
    verbose=True,
)
# load a few rows back into memory
# output_path is a plain str, so build the glob pattern with an f-string
ipums_microdata = (
    pl.scan_parquet(f"{output_path}/**/*.parquet", hive_partitioning=True)
    .head()
    .collect()
)
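The `my_education` expression above has no `.otherwise(...)` branch, and `is_between` includes both endpoints by default, so each `educd` code falls into exactly one bin or becomes null. As a rough illustration only, the same binning can be written in plain Python; `recode_education` is a hypothetical helper name and is not part of `polars_ipums`.

```python
from typing import Optional


def recode_education(educd: Optional[int]) -> Optional[int]:
    """Plain-Python mirror of the `my_education` when/then chain.

    Hypothetical helper for illustration; not part of polars_ipums.
    """
    if educd is None:
        return None  # a null input matches no branch
    if educd <= 61:
        return 0
    if 62 <= educd <= 64:  # is_between is inclusive on both ends by default
        return 1
    if 65 <= educd <= 100:
        return 2
    if 101 <= educd <= 116:
        return 3
    return None  # no .otherwise() branch, so unmatched codes become null
```

Codes above 116 and missing values come out as null rather than raising an error, which is worth checking for when validating the output column.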
Download files
Source Distribution
polars_ipums-0.0.1.tar.gz (5.6 kB)
Built Distribution
polars_ipums-0.0.1-py3-none-any.whl (6.7 kB)
File details
Details for the file polars_ipums-0.0.1.tar.gz.
File metadata
- Download URL: polars_ipums-0.0.1.tar.gz
- Upload date:
- Size: 5.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.5.23
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 023364e25929a440fa88eb91855000b44bf5ae9e8a80c893c94f6dd3a0e892d2 |
| MD5 | cb622617f4b3de3502627e6c2634f29a |
| BLAKE2b-256 | 0dfbcff977372cc836f69cb20de96a0e2cf5a6e46a05ce7da3e96900bdc01704 |
File details
Details for the file polars_ipums-0.0.1-py3-none-any.whl.
File metadata
- Download URL: polars_ipums-0.0.1-py3-none-any.whl
- Upload date:
- Size: 6.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.5.23
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 78c3dd52d36bfa91be37771fb29103133e2a386a52d31e397ce1eecfa8a0c838 |
| MD5 | 010ec8b9a6e84321f5af5b3c0339531f |
| BLAKE2b-256 | 42f7da44da173749206e205e19fb1bb01717f5ad67afca8e1ca76aa09a220ab7 |
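The digests listed for each file can be checked locally after downloading. The following is a minimal sketch using only the standard library; `file_digests` and `matches_sha256` are hypothetical helper names, and the sketch assumes that BLAKE2b-256 means BLAKE2b with a 32-byte digest.

```python
import hashlib


def file_digests(path: str, chunk_size: int = 1 << 20) -> dict:
    """Return hex digests of a file, reading it in chunks.

    Hypothetical helper for illustration; not part of polars_ipums.
    """
    hashers = {
        "sha256": hashlib.sha256(),
        # assumption: BLAKE2b-256 is BLAKE2b with a 32-byte (256-bit) digest
        "blake2b-256": hashlib.blake2b(digest_size=32),
    }
    with open(path, "rb") as fh:
        # iter() with a sentinel stops once read() returns empty bytes
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}


def matches_sha256(path: str, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with an expected hex string."""
    return file_digests(path)["sha256"] == expected_hex.strip().lower()
```

For example, `matches_sha256("polars_ipums-0.0.1.tar.gz", "<SHA256 from the table above>")` should return `True` for an intact download of the source distribution, assuming the file was saved under that name in the working directory.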