Tools for generating Parquet files from US Census 2020

Project description


Python tools for creating and maintaining Parquet files from US 2020 Census Data.


To use the data download shell scripts, first install wget.

To install the census-parquet package, use:

pip install census-parquet

This also installs the required Python dependencies:

  1. click
  2. dask
  3. dask_geopandas
  4. geopandas
  5. numpy
  6. openpyxl
  7. pandas
  8. pyarrow
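
Before running anything, it can help to confirm that the dependencies listed above are importable and that wget is on the PATH. The following is a minimal sketch, not part of census-parquet itself: the helper names missing_modules and has_wget are hypothetical, and the module list simply assumes that each dependency is imported under the name shown in the list.

```python
import importlib.util
import shutil

# Import names assumed to match the dependency list above.
REQUIRED_MODULES = [
    "click",
    "dask",
    "dask_geopandas",
    "geopandas",
    "numpy",
    "openpyxl",
    "pandas",
    "pyarrow",
]


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def has_wget():
    """Return True if a wget executable is available on the PATH."""
    return shutil.which("wget") is not None


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing Python modules:", ", ".join(missing))
    if not has_wget():
        print("wget was not found; the download scripts need it.")
    if not missing and has_wget():
        print("All prerequisites found.")
```

The check only looks the modules up and does not import them, so it stays fast even though geopandas and dask are heavy imports.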


To run the census-parquet code simply use


This runs the following scripts in order:

  1. - This script downloads the Census Boundary data needed to run
  2. - This script downloads population stat data needed for
  3. - This script downloads the Census Block data needed to run
  4. - This script processes the Census Boundary data and creates Parquet files. The Parquet files are written to a boundary_outputs folder.
  5. - This script processes the Census Block data and creates Parquet files. The final combined Parquet file is named tl_2020_FULL_tabblock20.parquet.
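
Once the scripts have finished, the outputs named above can be located and opened from Python. The sketch below is an illustration under stated assumptions rather than part of the package: find_outputs and load_geodataframe are hypothetical helpers, and the code assumes that both the boundary_outputs folder and tl_2020_FULL_tabblock20.parquet sit in the directory passed as base_dir (the source does not say where the block file is written).

```python
from pathlib import Path

# Names taken from the script descriptions above.
BOUNDARY_DIR = "boundary_outputs"
BLOCK_FILE = "tl_2020_FULL_tabblock20.parquet"


def find_outputs(base_dir="."):
    """Locate the Parquet outputs under base_dir (assumed location).

    Returns a dict with the sorted boundary Parquet paths and the
    combined block path, or None if the block output is absent.
    """
    base = Path(base_dir)
    boundary_dir = base / BOUNDARY_DIR
    boundary = sorted(boundary_dir.glob("*.parquet")) if boundary_dir.is_dir() else []
    block = base / BLOCK_FILE
    return {"boundary": boundary, "block": block if block.exists() else None}


def load_geodataframe(path):
    """Read one Parquet output into a GeoDataFrame with geopandas."""
    # Imported here so that find_outputs works without geopandas installed.
    import geopandas as gpd

    return gpd.read_parquet(path)


if __name__ == "__main__":
    outputs = find_outputs(".")
    print("Boundary files found:", len(outputs["boundary"]))
    print("Block file:", outputs["block"])
```

The combined block output covers the whole country, so loading it in one call with geopandas may need a lot of memory; dask_geopandas.read_parquet is an alternative that reads it lazily in partitions.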

Files for census-parquet, version 0.0.7
  1. census_parquet-0.0.7-py3-none-any.whl (7.6 kB): wheel, Python version py3
  2. census-parquet-0.0.7.tar.gz (5.6 kB): source distribution