Tools for generating Parquet files from US Census 2020

census-parquet

Python tools for creating and maintaining Parquet files from US 2020 Census Data.

Installation

To use the data download shell scripts, first install wget.
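As a quick sanity check before running the download scripts, the presence of wget can be tested from Python. This is only a minimal sketch using the standard library; the helper name `wget_available` is hypothetical and not part of census-parquet.

```python
import shutil


def wget_available() -> bool:
    """Return True if a wget executable is found on the PATH."""
    return shutil.which("wget") is not None


if __name__ == "__main__":
    if wget_available():
        print("wget found; the download scripts should be able to run.")
    else:
        print("wget not found; install it before running the download scripts.")
```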

To install the census-parquet package, run:

pip install census-parquet

This also installs the required Python dependencies:

  1. click
  2. dask
  3. dask_geopandas
  4. geopandas
  5. numpy
  6. openpyxl
  7. pandas
  8. pyarrow
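To confirm that the dependencies listed above are importable in the current environment, a check along the following lines may help. It is a sketch under the assumption that each listed name is also the module's import name; `missing_modules` is a hypothetical helper, not part of census-parquet.

```python
from importlib.util import find_spec

# Dependency names as listed above, assumed to match their import names.
REQUIRED_MODULES = [
    "click",
    "dask",
    "dask_geopandas",
    "geopandas",
    "numpy",
    "openpyxl",
    "pandas",
    "pyarrow",
]


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All listed dependencies are importable.")
```

`find_spec` only locates a module without importing it, so the check stays fast even for heavy packages such as geopandas.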

Usage

To run the census-parquet code, use:

run_census_parquet

This runs the following scripts in order:

  1. download_boundaries.sh - Downloads the Census Boundary data needed to run boundary_processing.py.
  2. download_population_stats.sh - Downloads the population statistics data needed to run process_blocks.py.
  3. download_blocks.sh - Downloads the Census Block data needed to run process_blocks.py.
  4. boundary_processing.py - Processes the Census Boundary data and creates Parquet files, which are written to a boundary_outputs folder.
  5. process_blocks.py - Processes the Census Block data and creates Parquet files. The final combined Parquet file is named tl_2020_FULL_tabblock20.parquet.
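The ordering above can be sketched as a small sequential runner. This is an illustration only, not how run_census_parquet is actually implemented: the helper names, the `script_dir` parameter, and the choice to run shell scripts with bash and Python scripts with the current interpreter are all assumptions.

```python
import subprocess
import sys
from pathlib import Path

# Script names and order as listed above.
PIPELINE_STEPS = [
    "download_boundaries.sh",
    "download_population_stats.sh",
    "download_blocks.sh",
    "boundary_processing.py",
    "process_blocks.py",
]


def build_commands(script_dir):
    """Return one command (argument list) per step, in pipeline order."""
    commands = []
    for name in PIPELINE_STEPS:
        script = str(Path(script_dir) / name)
        # Assumption: shell scripts run under bash, Python scripts under
        # the current interpreter.
        runner = "bash" if name.endswith(".sh") else sys.executable
        commands.append([runner, script])
    return commands


def run_pipeline(script_dir):
    """Run each step in order; check=True stops at the first failure."""
    for command in build_commands(script_dir):
        subprocess.run(command, check=True)
```

Running the steps strictly in sequence matters because each processing script depends on data fetched by an earlier download script.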

Download files

Files for census-parquet, version 0.0.7
Filename                                Size     File type  Python version
census_parquet-0.0.7-py3-none-any.whl   7.6 kB   Wheel      py3
census-parquet-0.0.7.tar.gz             5.6 kB   Source     None
