A Python package to explore IPUMS data.
PyIPUMS
PyIPUMS is a library for working with data from IPUMS.
Example
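The following example parses an IPUMS DDI codebook into a dictionary, prints its file metadata, and loads the first 100 rows of the matching fixed-width data file.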
import json

import pandas as pd

from src.pyipums.parse_xml import read_ipums_ddi


def read_ipums_micro(ddi, data_file_path, n_max=None):
    """Load a gzipped fixed-width IPUMS extract into a DataFrame."""
    # Read the fixed-width data file using the extracted column information
    df = pd.read_fwf(
        data_file_path,
        dtype=ddi["column_dtypes"],
        colspecs=ddi["column_specs"],
        header=None,
        names=ddi["columns"],
        nrows=n_max,  # None reads the whole file
        compression="gzip",
    )
    return df


def main():
    ddi_file_path = "./usa_00003.xml"
    data_file_path = "./usa_00003.dat.gz"

    # Parse the DDI codebook and show the file-level metadata
    cps_ddi = read_ipums_ddi(ddi_file_path)
    print(json.dumps(cps_ddi["file_metadata"], indent=2))

    # Load only the first 100 records to keep the example fast
    cps_data = read_ipums_micro(cps_ddi, data_file_path, n_max=100)
    print(cps_data.head())


if __name__ == "__main__":
    main()
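To make the `columns` and `column_specs` entries used above more concrete, here is a minimal sketch of how column positions might be derived from a DDI codebook. It is not the implementation of `read_ipums_ddi`. The XML fragment is a simplified, made-up sample modeled on the IPUMS DDI layout, the helper name `column_specs_from_ddi` is hypothetical, and the sketch assumes the column specs are zero-based, half-open `(start, end)` pairs of the kind `pd.read_fwf` accepts.

```python
import xml.etree.ElementTree as ET

# Simplified, made-up DDI fragment; real IPUMS codebooks carry far more detail.
SAMPLE_DDI = """\
<codeBook xmlns="ddi:codebook:2_5">
  <dataDscr>
    <var ID="YEAR" name="YEAR"><location StartPos="1" EndPos="4" width="4"/></var>
    <var ID="SERIAL" name="SERIAL"><location StartPos="5" EndPos="12" width="8"/></var>
    <var ID="AGE" name="AGE"><location StartPos="13" EndPos="15" width="3"/></var>
  </dataDscr>
</codeBook>
"""

NS = {"ddi": "ddi:codebook:2_5"}


def column_specs_from_ddi(xml_text):
    """Hypothetical helper: return (names, colspecs) from DDI <var> elements."""
    root = ET.fromstring(xml_text)
    names, colspecs = [], []
    for var in root.findall("./ddi:dataDscr/ddi:var", NS):
        loc = var.find("ddi:location", NS)
        # DDI positions are 1-based and inclusive; pandas colspecs are
        # 0-based and half-open, so only the start needs shifting.
        start = int(loc.attrib["StartPos"]) - 1
        end = int(loc.attrib["EndPos"])
        names.append(var.attrib["name"])
        colspecs.append((start, end))
    return names, colspecs


names, colspecs = column_specs_from_ddi(SAMPLE_DDI)

# Slice one made-up record by hand to show what the offsets mean.
record = "202000000001035"
row = {name: record[start:end] for name, (start, end) in zip(names, colspecs)}
print(colspecs)
print(row)
```

Slicing by hand keeps the leading zeros as text. Passing the same `names` and `colspecs` to `pd.read_fwf` would let pandas convert each field to a numeric type instead.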
Modifying
If you want to make changes to the library, I recommend using Poetry to manage the development environment.
poetry env use 3.8
pyenv shell 3.8
poetry shell
License
MIT