Tools for generating Parquet files from US Census 2020
census-parquet
Python tools for creating and maintaining Parquet files from US 2020 Census Data.
Installation
To use the data download shell scripts, first install wget.
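As a quick check before running anything, the snippet below tests whether wget is available on the PATH. It is only a convenience sketch using the Python standard library; the helper name find_wget is made up for illustration and is not part of census-parquet.

```python
import shutil


def find_wget():
    """Return the full path to the wget executable, or None if it is not on PATH.

    Hypothetical helper for illustration; not part of census-parquet.
    """
    return shutil.which("wget")


if __name__ == "__main__":
    path = find_wget()
    if path is None:
        print("wget not found; install it before running the download scripts")
    else:
        print(f"wget found at {path}")
```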
To install the census-parquet package use
pip install census-parquet
This will also install the required Python dependencies.
Usage
To run the census-parquet code simply use
run_census_parquet
This runs the following scripts in order:
download_boundaries.sh
- This script downloads the Census Boundary data needed to run process_boundaries.py.
download_population_stats.sh
- This script downloads population stat data needed for process_blocks.py.
download_blocks.sh
- This script downloads the Census Block data needed to run process_blocks.py.
process_boundaries.py
- This script processes the Census Boundary data and creates parquet files. The parquet files will be output into a boundary_outputs folder.
process_blocks.py
- This script processes Census Block data and creates parquet files. The final combined parquet file will have the name tl_2020_FULL_tabblock20.parquet.
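The ordering above can be sketched in a few lines of Python. This is not the package's actual implementation of run_census_parquet: it only illustrates the sequence, and it assumes the scripts sit in the current directory, that the shell scripts can be run with sh, and that the Python scripts are run with the current interpreter. The names STEPS, command_for, and run_pipeline are hypothetical.

```python
import subprocess
import sys

# Script order as listed above. How run_census_parquet invokes each
# step internally is an assumption in this sketch.
STEPS = [
    "download_boundaries.sh",
    "download_population_stats.sh",
    "download_blocks.sh",
    "process_boundaries.py",
    "process_blocks.py",
]


def command_for(step):
    """Build the command line for one step (assumed invocation)."""
    if step.endswith(".sh"):
        return ["sh", step]
    return [sys.executable, step]


def run_pipeline(steps=STEPS, dry_run=True):
    """Run each step in order and stop at the first failure.

    With dry_run=True the commands are only printed, not executed.
    Returns the list of commands in the order they were (or would be) run.
    """
    commands = [command_for(step) for step in steps]
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True raises CalledProcessError if a step fails, so a
            # processing script never runs on an incomplete download.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    run_pipeline(dry_run=True)
```

Running the downloads before the processing scripts matters because each processing script depends on data fetched by an earlier step, which is why the sketch stops at the first failing command.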