
US Census utilities for a variety of data loading and mapping purposes.


censusdis

License: Hippocratic License HL3-CL-ECO-EXTR-FFD-LAW-MIL-SV

Example plots, each generated by one of the demonstration notebooks: Integration in SoMa Tracts; Diversity in New Jersey; 2020 Median Income by County in Georgia; Nationwide Integration at the Census Tract over Block Group Level; White Alone Population as a Percent of County Population; Average Age by Public Use Microdata Area in Massachusetts.

Tutorial (A Great Place to Start!)

If you are interested in a tutorial, please see the censusdis-tutorial repository. This tutorial was presented at PyData Seattle 2023. If you want to try it out for yourself, the README.md contains links that let you run the tutorial notebooks live on mybinder.org in your browser without needing to set up a local development environment or download or install any code.

Introduction

censusdis is a package for discovering, loading, and analyzing U.S. Census demographic data, and for computing diversity, integration, and segregation metrics from it. It is designed to be intuitive and Pythonic while giving users access to the full collection of data and maps the U.S. Census publishes via its APIs. It also avoids hard-coding metadata about U.S. Census variables, such as their names, types, and hierarchies in groups. Instead, it queries this metadata from the U.S. Census API. This allows it to operate over a large set of datasets and years, likely including many that did not exist at the time of this writing. It also integrates downloading and merging the geometry of geographic areas to make plotting data and derived metrics simple. Finally, it interacts with the divintseg package to compute diversity and integration metrics.

The design goals of censusdis are discussed in more detail in design-goals.md.
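To make the idea of querying metadata rather than hard-coding it more concrete, here is a minimal sketch of where such metadata lives on the U.S. Census API. It only builds URLs and makes no network calls. The helper names `variable_metadata_url` and `group_metadata_url` are hypothetical and are not part of censusdis; the sketch does not describe how censusdis itself issues these requests.

```python
# Hypothetical helpers (not part of censusdis): they only build URLs for
# the metadata endpoints of the U.S. Census API and make no network calls.
CENSUS_API_BASE = "https://api.census.gov/data"


def variable_metadata_url(dataset: str, vintage: int, variable: str) -> str:
    """Return the JSON metadata URL for a single variable."""
    return f"{CENSUS_API_BASE}/{vintage}/{dataset}/variables/{variable}.json"


def group_metadata_url(dataset: str, vintage: int, group: str) -> str:
    """Return the JSON metadata URL listing the variables in a group."""
    return f"{CENSUS_API_BASE}/{vintage}/{dataset}/groups/{group}.json"


# Median household income in the ACS 5-year data, and the group it belongs to.
print(variable_metadata_url("acs/acs5", 2020, "B19013_001E"))
print(group_metadata_url("acs/acs5", 2020, "B19013"))
```

Fetching either URL returns a JSON document describing the variable or group, which is the kind of information a library can read at run time instead of shipping it as constants.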

I'm not sure I get it. Show me what it can do.

The Nationwide Diversity and Integration notebook demonstrates how we can download, process, and plot a large amount of US Census demographic data quickly and easily to produce compelling results with just a few lines of code.

I'm sold! I want to dive right in!

To get straight to installing and trying out code, hop over to our Getting Started guide.

censusdis lets you quickly and easily load US Census data and make plots like this one:

Median income by block group in GA

We downloaded the data behind this plot, including the geometry of all the block groups, with a single call:

import censusdis.data as ced
from censusdis.states import STATE_GA

# This is a census variable for median household income.
# See https://api.census.gov/data/2020/acs/acs5/variables/B19013_001E.html
MEDIAN_HOUSEHOLD_INCOME_VARIABLE = "B19013_001E"

gdf_bg = ced.download(
    "acs/acs5",  # The American Community Survey 5-Year Data
    2020,
    ["NAME", MEDIAN_HOUSEHOLD_INCOME_VARIABLE],
    state=STATE_GA,
    block_group="*",
    with_geometry=True
)
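For readers curious about what a call like this stands in for, the sketch below builds a raw U.S. Census API query URL of roughly the same shape using only the standard library. It is an illustration under assumptions, not a description of censusdis internals: the helper `data_query_url` is hypothetical, the value `13` is the FIPS code for Georgia, and whether the API accepts wildcards at every geographic level can depend on the dataset and year.

```python
from urllib.parse import quote, urlencode


def data_query_url(dataset, vintage, variables, for_clause, in_clauses):
    """Build a U.S. Census API data query URL (hypothetical helper)."""
    # Repeated "in" parameters narrow the geography from the outside in.
    params = [("get", ",".join(variables)), ("for", for_clause)]
    params += [("in", clause) for clause in in_clauses]
    # Keep ':', '*', and ',' readable; encode spaces as %20.
    query = urlencode(params, safe=":*,", quote_via=quote)
    return f"https://api.census.gov/data/{vintage}/{dataset}?{query}"


url = data_query_url(
    "acs/acs5",
    2020,
    ["NAME", "B19013_001E"],
    "block group:*",
    ["state:13", "county:*", "tract:*"],  # 13 is the FIPS code for Georgia
)
print(url)
```

Compared with this, the `ced.download` call above also attaches the block group geometry when `with_geometry=True` is passed, which a raw data query like this one does not provide.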

Similarly, we can download data and geographies, do a little analysis on our own using familiar Pandas data frame operations, and plot graphs like these:

Percent of population identifying as white by county; Integration in SoMa

Modules

The public modules that make up the censusdis package are:

| Module | Description |
|---|---|
| censusdis.geography | Code for managing geography hierarchies in which census data is organized. |
| censusdis.data | Code for fetching data from the US Census API, including managing datasets, groups, and variable hierarchies. |
| censusdis.maps | Code for downloading map data from the US Census, caching it locally, and using it to render maps. |
| censusdis.states | Constants defining the US States. Used by the three other modules. |

Demonstration Notebooks

There are several demonstration notebooks available to illustrate how censusdis can be used. They are found in the notebook directory of the source code.

The demo notebooks include:

| Notebook Name | Description |
|---|---|
| ACS Comparison Profile.ipynb | Load and plot American Community Survey (ACS) Comparison Profile data at the state level. |
| ACS Data Profile.ipynb | Load and plot American Community Survey (ACS) Data Profile data at the state level. |
| ACS Demo.ipynb | Load American Community Survey (ACS) Detail Table data for New Jersey and plot diversity statewide at the census block group level. |
| ACS Subject Table.ipynb | Load and plot American Community Survey (ACS) Subject Table data at the state level. |
| Block Groups in CBSAs.ipynb | Load and spatially join on-spine and off-spine geographies and plot the results on a map. |
| Data With Geometry.ipynb | Load American Community Survey (ACS) data for New Jersey and plot diversity statewide at the census block group level. |
| Exploring Variables.ipynb | Load metadata on a group of variables, visualize the tree hierarchy of variables in the group, and load data from the leaves of the tree. |
| Getting Started Examples.ipynb | Sample code from the Getting Started guide. |
| Nationwide Diversity and Integration.ipynb | Load nationwide demographic data, compute diversity and integration, and plot. |
| Map Demo.ipynb | Demonstrate loading and plotting maps of New Jersey at different geographic granularities. |
| Map Geographies.ipynb | Illustrate a large number of different map geographies and how to load them. |
| Population Change 2020-2021.ipynb | Track the change in state population from 2020 to 2021 using ACS5 data. |
| PUMS Demo.ipynb | Load Public-Use Microdata Samples (PUMS) data for Massachusetts and plot it. |
| Querying Available Data Sets.ipynb | Query all available data sets. A starting point for moving beyond ACS. |
| Seeing White.ipynb | Load nationwide demographic data at the county level and plot a map of the US showing the percent of the population who identify as white only (no other race) at the county level. |
| SoMa DIS Demo.ipynb | Load race and ethnicity data for two towns in Essex County, NJ and compute diversity and integration metrics. |
| Time Series School District Poverty.ipynb | Demonstrate how to work with time series datasets, which are a little different from vintaged data sets. |

Diversity and Integration Metrics

Diversity and integration metrics from the divintseg package are demonstrated in some notebooks.

For more information on these metrics see the divintseg project.
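As a rough illustration of what such metrics measure, the sketch below computes them in plain Python. It assumes two definitions: diversity is the probability that two people chosen at random from an area belong to different groups, and integration is the population-weighted average of the diversity of the smaller areas that make up a larger one. These definitions and the function names are assumptions made for this sketch; the divintseg project is the authoritative source and its API is not shown here.

```python
def diversity(counts):
    """Probability that two randomly chosen people are in different groups.

    `counts` holds the population of each group in one area.
    """
    total = sum(counts)
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts)


def integration(areas):
    """Population-weighted mean diversity of the smaller areas.

    `areas` is a list of per-area group counts, e.g. one entry per tract.
    """
    total = sum(sum(area) for area in areas)
    if total == 0:
        return 0.0
    return sum(sum(area) * diversity(area) for area in areas) / total


# Two hypothetical tracts with two groups each.
tracts = [[50, 50], [100, 0]]
# Group totals across both tracts: [150, 50].
overall = [sum(group) for group in zip(*tracts)]

print(diversity(overall))   # diversity of the combined area
print(integration(tracts))  # average diversity within the tracts
```

In this made-up example the combined area is more diverse than the typical tract within it, which is the kind of gap between diversity and integration these metrics are meant to expose.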

