
Tools for generating Parquet files from US Census 2020


census-parquet

Python tools for creating and maintaining Parquet files from US 2020 Census Data.

Installation

The data download shell scripts depend on wget, so install wget first.
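Before running the download scripts, it can help to confirm that wget is actually reachable. The following is a minimal sketch using only the standard library; the helper name `wget_available` is hypothetical and is not part of census-parquet.

```python
import shutil


def wget_available() -> bool:
    """Return True if a wget executable can be found on the PATH."""
    # shutil.which returns the full path to the executable, or None if absent.
    return shutil.which("wget") is not None


if __name__ == "__main__":
    if wget_available():
        print("wget found; the download scripts should be able to run.")
    else:
        print("wget not found; install it before running the download scripts.")
```

If the check fails, install wget with your platform's package manager and try again.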

To install the census-parquet package, use:

pip install census-parquet

This also installs the required Python dependencies:

  1. click
  2. dask
  3. dask_geopandas
  4. geopandas
  5. numpy
  6. openpyxl
  7. pandas
  8. pyarrow
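To verify that the dependencies listed above are present in the current environment, something like the sketch below may be useful. It assumes each package is importable under the name shown in the list; the helper `missing_dependencies` is hypothetical and not part of census-parquet.

```python
from importlib.util import find_spec

# Import names taken from the dependency list above (assumed to match
# the importable module names).
DEPENDENCIES = [
    "click",
    "dask",
    "dask_geopandas",
    "geopandas",
    "numpy",
    "openpyxl",
    "pandas",
    "pyarrow",
]


def missing_dependencies(names=DEPENDENCIES):
    """Return the subset of names that cannot be found in this environment."""
    # find_spec returns None for a top-level module that is not installed.
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_dependencies()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed dependencies are installed.")
```

An empty result suggests the installation completed as expected.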

Usage

To run the full census-parquet pipeline, use:

run_census_parquet

This runs the following scripts in order:

  1. download_boundaries.sh - Downloads the Census Boundary data needed to run process_boundaries.py.
  2. download_population_stats.sh - Downloads the population statistics data needed to run process_blocks.py.
  3. download_blocks.sh - Downloads the Census Block data needed to run process_blocks.py.
  4. process_boundaries.py - Processes the Census Boundary data and creates parquet files, which are written to a boundary_outputs folder.
  5. process_blocks.py - Processes the Census Block data and creates parquet files. The final combined parquet file is named tl_2020_FULL_tabblock20.parquet.
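Once the pipeline has finished, the combined block file can be loaded for analysis. The sketch below makes two assumptions that the description above does not confirm: that the file sits in the directory you pass in (the current directory by default), and that it was written with geometry metadata so `geopandas.read_parquet` can open it. If the second assumption does not hold, `pandas.read_parquet` is a fallback. The helper `load_blocks` is hypothetical and not part of census-parquet.

```python
from pathlib import Path

# File name taken from the pipeline description above.
COMBINED_BLOCKS_FILE = "tl_2020_FULL_tabblock20.parquet"


def load_blocks(directory="."):
    """Load the combined Census Block parquet file as a GeoDataFrame.

    The output location is an assumption; pass the directory where
    process_blocks.py actually wrote the file.
    """
    path = Path(directory) / COMBINED_BLOCKS_FILE
    if not path.exists():
        raise FileNotFoundError(f"Expected pipeline output at {path}")

    # Imported lazily so the path check works even without geopandas.
    import geopandas as gpd

    return gpd.read_parquet(path)
```

A call such as `blocks = load_blocks()` followed by `blocks.head()` gives a quick look at the result. For data too large to fit in memory, `dask_geopandas.read_parquet` is an alternative worth considering.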
