Tools for generating Parquet files from US Census 2020

Python tools for creating and maintaining Parquet files from US 2020 Census Data.


The data download shell scripts depend on wget, so install wget before using them.
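
A quick way to confirm that wget is available before starting a download is to look it up on the PATH. The sketch below uses only the Python standard library; the helper name `wget_path` is hypothetical and is not part of census-parquet.

```python
import shutil


def wget_path():
    """Return the full path to the wget executable, or None if it is not on PATH."""
    # Hypothetical helper for illustration; not provided by census-parquet.
    return shutil.which("wget")


if __name__ == "__main__":
    path = wget_path()
    if path is None:
        print("wget not found; install it before running the download scripts")
    else:
        print(f"wget found at {path}")
```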

To install the census-parquet package, run:

pip install census-parquet

This also installs the required Python dependencies:

  1. click
  2. dask
  3. dask_geopandas
  4. geopandas
  5. numpy
  6. openpyxl
  7. pandas
  8. pyarrow
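
After installation, it can be useful to verify that each dependency is importable in the current environment. The following is a minimal sketch that assumes the import names match the names listed above; `missing_modules` is a hypothetical helper, not something shipped with census-parquet.

```python
from importlib.util import find_spec

# Assumption: each dependency is imported under the same name as listed above.
REQUIRED_MODULES = [
    "click",
    "dask",
    "dask_geopandas",
    "geopandas",
    "numpy",
    "openpyxl",
    "pandas",
    "pyarrow",
]


def missing_modules(names=REQUIRED_MODULES):
    """Return the subset of top-level module names that cannot be found."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable")
```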


To run the census-parquet code simply use


This runs the following scripts in order:

  1. - This script downloads the Census Boundary data needed to run
  2. - This script downloads population stat data needed for
  3. - This script downloads the Census Block data needed to run
  4. - This script processes the Census Boundary data and creates parquet files. The parquet files will be output into a boundary_outputs folder.
  5. - This script processes Census Block data and creates parquet files. The final combined parquet file will have the name tl_2020_FULL_tabblock20.parquet.
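
Once the scripts have finished, the outputs described above can be located and inspected from Python. The sketch below assumes the boundary_outputs folder and tl_2020_FULL_tabblock20.parquet are written relative to the directory the scripts were run from, and that the files can be read with geopandas; neither is confirmed by this description. The helper names `find_outputs` and `load_blocks` are hypothetical.

```python
from pathlib import Path

BLOCK_FILE_NAME = "tl_2020_FULL_tabblock20.parquet"
BOUNDARY_DIR_NAME = "boundary_outputs"


def find_outputs(root="."):
    """Return (boundary parquet paths, block parquet path or None) under root.

    Assumption: outputs are written relative to the working directory.
    """
    root = Path(root)
    boundary_dir = root / BOUNDARY_DIR_NAME
    boundary_files = (
        sorted(boundary_dir.glob("*.parquet")) if boundary_dir.is_dir() else []
    )
    block_file = root / BLOCK_FILE_NAME
    return boundary_files, (block_file if block_file.exists() else None)


def load_blocks(path):
    """Read the combined block file into a GeoDataFrame.

    Assumption: the output is readable by geopandas.read_parquet.
    """
    import geopandas as gpd  # imported lazily so find_outputs works without it

    return gpd.read_parquet(path)


if __name__ == "__main__":
    boundaries, blocks = find_outputs()
    print(f"{len(boundaries)} boundary parquet file(s) found")
    print("block file:", blocks if blocks is not None else "not found")
```

The full block file covers the whole country, so loading it at once may need a lot of memory; dask_geopandas, which is installed as a dependency, is an alternative for reading it in partitions.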
