Fast Area-weighted Spatial ReAggregation Tool - Compute area-weighted intersection weights between shapefile geometries and raster pixels
FASRAT
Fast Area-weighted Spatial ReAggregation Tool
FASRAT is a Python command-line tool for computing area-weighted intersection weights between shapefile geometries (e.g., census tracts, counties, or other polygons) and raster pixels. This is particularly useful for spatially aggregating raster data (such as climate data, satellite imagery, or other gridded datasets) to polygon boundaries.
Features
- 🗺️ Compute precise area-weighted intersections between vector polygons and raster grids
- 🚀 Fast processing with progress bars for large datasets
- 🎯 Automatically filters to contiguous US states (excludes Alaska, Hawaii, Puerto Rico) when the shapefile has a state FIPS column
- 💾 Outputs weight matrices in Parquet format for efficient storage and reuse
- 🔧 Simple command-line interface with clear parameter validation
Installation
Option 1: Install from PyPI (Recommended)
Install the latest release from PyPI using pip:
pip install fasrat
Option 2: Install from Source
Using pip
# Clone or download the repository
cd /path/to/FASRAT
# Install in development mode
pip install -e .
# Or install normally
pip install .
Using uv (Recommended for development)
FASRAT supports uv for Python environment management:
# Install uv if you haven't already
curl -LsSf https://astral.sh/uv/install.sh | sh
# Navigate to the FASRAT directory
cd /path/to/FASRAT
# Install the package with uv
uv pip install -e .
This will install FASRAT and all its dependencies, and make the fasrat command available in your environment.
Option 3: Install from GitHub
pip install git+https://github.com/njw0709/fasrat.git
Usage
Command-Line Interface
FASRAT provides two main commands:
1. Computing Weights
First, compute the area-weighted intersection weights between your shapefile and raster grid:
fasrat weights --shapefile <SHAPEFILE_PATH> --raster <RASTER_FILE> --output <OUTPUT_FILE>
Parameters:
- --shapefile or -s: Path to your shapefile (.shp file)
- --raster or -r: Path to a sample raster file (any format supported by rasterio)
- --output or -o: Full path for the output parquet file (e.g., /path/to/weights.parquet)
- --crs or -c: Optional CRS string (e.g., 'EPSG:4326') to project the shapefile to
Example:
fasrat weights --shapefile ../shapefiles/us_tract_2010/US_tract_2010.shp \
--raster /data/climate/tmmx_2010.nc \
--output ./output/tract_weights.parquet
2. Converting Raster Data
Apply the pre-computed weights to raster data for spatial averaging:
fasrat convert --weights <WEIGHTS_FILE> --raster <RASTER_FILE> --output <OUTPUT_FILE>
Parameters:
- --weights or -w: Path to the weights parquet file (from the weights command)
- --raster or -r: Path to the raster file to process (any format supported by rasterio)
- --output or -o: Path for the output file (CSV or parquet)
- --geoid-col or -g: Geometry ID column name (auto-detects if not specified)
- --format or -f: Output format ('csv' or 'parquet', default is 'csv')
- --long or -l: Output time-series data in long format (default is wide format)
Example:
# Convert raster data to tract-level averages
fasrat convert --weights ./output/tract_weights.parquet \
--raster ./data/pm25_2010.nc \
--output ./output/pm25_tract_2010.csv
# With long format for time-series data
fasrat convert --weights ./output/tract_weights.parquet \
--raster ./data/pm25_2010.nc \
--output ./output/pm25_tract_2010.csv \
--long
# Output as parquet
fasrat convert --weights ./output/tract_weights.parquet \
--raster ./data/pm25_2010.nc \
--output ./output/pm25_tract_2010.parquet \
--format parquet
Getting Help
fasrat --help
fasrat weights --help
fasrat convert --help
Using FASRAT Programmatically
In addition to the command-line interface, you can use FASRAT as a Python library in your own scripts:
from fasrat import compute_raster_weights, apply_raster_weights
# Step 1: Compute weights
compute_raster_weights(
shapefile_path="./shapefiles/us_tract_2010/US_tract_2010.shp",
raster_path="./data/tmmx_2010.nc",
output_path="./output/tract_weights.parquet"
)
# Step 2: Apply weights to raster data
apply_raster_weights(
weights_path="./output/tract_weights.parquet",
raster_path="./data/pm25_2010.nc",
output_path="./output/pm25_tract_2010.csv",
output_format="csv",
long_format=False
)
This is useful when you want to integrate FASRAT into a larger data processing pipeline or automate batch processing.
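As a sketch of the batch case, one weights file can be reused across several rasters. The plan_conversions and run_batch helpers, the directory layout, and the file names below are hypothetical; only the keyword arguments are taken from the apply_raster_weights call shown above. The function to call is passed in as an argument, so the planning logic can be tried without FASRAT installed.

```python
from pathlib import Path
from typing import Callable, Iterable


def plan_conversions(weights_path: str, raster_paths: Iterable[str], output_dir: str) -> list[dict]:
    """Build one keyword-argument dict per raster (hypothetical helper).

    The keys mirror the apply_raster_weights call documented above.
    """
    out_dir = Path(output_dir)
    jobs = []
    for raster in sorted(raster_paths):
        jobs.append(
            {
                "weights_path": weights_path,
                "raster_path": raster,
                # e.g. pm25_2010.nc -> <output_dir>/pm25_2010.csv
                "output_path": str(out_dir / (Path(raster).stem + ".csv")),
                "output_format": "csv",
                "long_format": False,
            }
        )
    return jobs


def run_batch(jobs: list[dict], apply_fn: Callable[..., object]) -> int:
    """Call apply_fn(**job) for every job and return how many were run."""
    for job in jobs:
        apply_fn(**job)
    return len(jobs)


# With FASRAT installed, the batch would be started like this:
#   from fasrat import apply_raster_weights
#   rasters = [str(p) for p in Path("./data").glob("pm25_*.nc")]
#   jobs = plan_conversions("./output/tract_weights.parquet", rasters, "./output")
#   run_batch(jobs, apply_raster_weights)
```

Because the weights depend only on the shapefile and the raster grid, this pattern assumes every raster in the batch shares the grid of the raster used to compute the weights.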
Input File Formats
Shapefile
Provide the path to the .shp file. The shapefile should be in standard ESRI Shapefile format, with its associated files in the same directory:
- .shp - the main geometry file (this is what you provide to the CLI)
- .shx - shape index file
- .dbf - attribute database file
- .prj - projection information (recommended)
If your shapefile includes a state FIPS code column (e.g., STATEFP10, STATEFP, STATE_FIPS), the tool will automatically filter to contiguous US states.
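The filtering rule can be illustrated with a small sketch. It is not FASRAT's own code: plain dicts stand in for shapefile attribute rows, the helper names are hypothetical, and the column list contains only the names given as examples above. The excluded codes are the standard state FIPS codes for Alaska (02), Hawaii (15), and Puerto Rico (72).

```python
# FIPS state codes for the areas the README says are excluded.
EXCLUDED_STATE_FIPS = {"02", "15", "72"}  # Alaska, Hawaii, Puerto Rico

# Column names taken from the examples above; the real tool may check others.
CANDIDATE_FIPS_COLUMNS = ("STATEFP10", "STATEFP", "STATE_FIPS")


def find_fips_column(columns):
    """Return the first recognised state FIPS column name, or None."""
    for name in CANDIDATE_FIPS_COLUMNS:
        if name in columns:
            return name
    return None


def filter_contiguous(records):
    """Keep records whose state FIPS code is not in the excluded set.

    `records` is a list of dicts standing in for shapefile attribute rows.
    If no FIPS column is present, the records are returned unchanged.
    """
    if not records:
        return records
    column = find_fips_column(records[0].keys())
    if column is None:
        return records
    # zfill(2) tolerates codes stored without the leading zero, e.g. "2" or 2.
    return [r for r in records if str(r[column]).zfill(2) not in EXCLUDED_STATE_FIPS]
```

When no FIPS column is found, the sketch leaves the data untouched, which matches the conditional wording above.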
Raster File
The raster file can be in any format supported by rasterio (NetCDF .nc, GeoTIFF .tif, etc.). The tool uses this file to:
- Determine the coordinate reference system (CRS) for spatial alignment
- Extract pixel resolution and dimensions
- Compute the intersection weights between polygons and pixels
- Read and aggregate raster data values
For multi-band rasters, each band is treated as a time step. Single-band rasters are treated as single-time data.
Output Format
Weights File
FASRAT outputs a parquet file containing a pandas DataFrame with the following columns:
- raster_bbox_coords: Bounding box coordinates in raster index space for each polygon
- weight: A NumPy array (weight matrix) representing the area-weighted intersection between the polygon and each overlapping raster pixel. Weights sum to 1.0 for each polygon.
- area: The total area of each polygon (in the raster's CRS units)
- bounds: The geographic bounding box of each polygon
- GEOID10 (or similar): The identifier from the original shapefile (if available)
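To show how raster_bbox_coords and weight might be combined, here is a minimal sketch in plain Python. It assumes the bounding box is stored as (row_start, row_stop, col_start, col_stop) with exclusive stops, which this README does not confirm, and it skips missing pixels and renormalises the remaining weights, which is also an assumption. Nested lists stand in for NumPy arrays.

```python
def slice_window(raster, bbox):
    """Cut the (row_start, row_stop, col_start, col_stop) window out of a 2-D list."""
    row_start, row_stop, col_start, col_stop = bbox
    return [row[col_start:col_stop] for row in raster[row_start:row_stop]]


def weighted_mean(window, weights):
    """Area-weighted mean of a raster window.

    `window` and `weights` are equally shaped 2-D lists (rows x columns).
    Pixels whose value is None are skipped and the remaining weights are
    renormalised; that missing-data rule is an assumption of this sketch.
    """
    total = 0.0
    weight_sum = 0.0
    for value_row, weight_row in zip(window, weights):
        for value, weight in zip(value_row, weight_row):
            if value is None or weight == 0.0:
                continue
            total += value * weight
            weight_sum += weight
    if weight_sum == 0.0:
        return None  # the polygon overlaps no valid pixel
    return total / weight_sum


raster = [
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0],
    [7.0, 8.0, 9.0],
]
window = slice_window(raster, (0, 2, 1, 3))  # [[2.0, 3.0], [5.0, 6.0]]
polygon_value = weighted_mean(window, [[0.5, 0.0], [0.25, 0.25]])
```

Since the weights for a polygon sum to 1.0, the weighted sum is already the polygon's average whenever every pixel in the window has data.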
Converted Data File
The convert command outputs aggregated data in CSV or parquet format:
Single-band rasters:
- Rows = geometry IDs
- Columns = geometry ID column and 'value'
Multi-band rasters (wide format, default):
- Rows = time steps (band indices)
- Columns = geometry IDs
Multi-band rasters (long format, with --long flag):
- Rows = geometry ID × time combinations
- Columns = 'time', geometry ID column, 'value'
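The difference between the two multi-band layouts can be sketched with plain dicts standing in for table rows. The helper names are hypothetical, the "time" key stands in for the row index of the wide table, and numbering the bands from 1 is an assumption rather than something this README specifies.

```python
def to_wide(values_by_band, geometry_ids):
    """Wide layout: one row per time step, one column per geometry ID.

    `values_by_band[i][j]` is the aggregated value of geometry j in band i.
    """
    return [
        {"time": band_index, **dict(zip(geometry_ids, band_values))}
        for band_index, band_values in enumerate(values_by_band, start=1)
    ]


def to_long(wide_rows, id_column="GEOID10"):
    """Long layout: one row per geometry ID x time combination."""
    long_rows = []
    for row in wide_rows:
        for geometry_id, value in row.items():
            if geometry_id == "time":
                continue  # the time key is the row label, not a geometry
            long_rows.append({"time": row["time"], id_column: geometry_id, "value": value})
    return long_rows


wide = to_wide([[10.0, 20.0], [11.0, 21.0]], ["06037", "06059"])
long_rows = to_long(wide)
```

Wide output keeps the file compact when there are many time steps per geometry, while long output is usually easier to join with other tables keyed by geometry ID and time.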
How It Works
- Load Shapefile: Reads the vector polygon data
- Filter Geometries: Filters to contiguous US states (if state FIPS column exists)
- CRS Alignment: Reprojects polygons to match the raster's coordinate system
- Bounding Box Computation: Calculates the raster pixel indices that overlap each polygon
- Weight Matrix Calculation: For each polygon, computes the area-weighted intersection with each overlapping raster pixel
- Normalization: Ensures weights sum to 1.0 for each polygon
- Output: Saves the weight matrices to a parquet file for efficient reuse
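The bounding box, weight matrix, and normalization steps can be sketched for the simplest possible case: an axis-aligned rectangular polygon on a north-up grid with square pixels. This is an illustration under those assumptions, not FASRAT's implementation, which handles arbitrary polygons; the function names and the (row_start, row_stop, col_start, col_stop) window layout are hypothetical.

```python
import math


def pixel_window(bounds, origin_x, origin_y, pixel_size, n_rows, n_cols):
    """Pixel index range (row_start, row_stop, col_start, col_stop) covering `bounds`.

    `bounds` is (min_x, min_y, max_x, max_y). The grid is north-up with square
    pixels and (origin_x, origin_y) is its top-left corner.
    """
    min_x, min_y, max_x, max_y = bounds
    col_start = max(0, math.floor((min_x - origin_x) / pixel_size))
    col_stop = min(n_cols, math.ceil((max_x - origin_x) / pixel_size))
    row_start = max(0, math.floor((origin_y - max_y) / pixel_size))
    row_stop = min(n_rows, math.ceil((origin_y - min_y) / pixel_size))
    return row_start, row_stop, col_start, col_stop


def rectangle_weights(bounds, origin_x, origin_y, pixel_size, window):
    """Normalised overlap areas between a rectangle and each pixel in `window`."""
    min_x, min_y, max_x, max_y = bounds
    row_start, row_stop, col_start, col_stop = window
    areas = []
    for row in range(row_start, row_stop):
        pixel_top = origin_y - row * pixel_size
        pixel_bottom = pixel_top - pixel_size
        overlap_y = max(0.0, min(max_y, pixel_top) - max(min_y, pixel_bottom))
        area_row = []
        for col in range(col_start, col_stop):
            pixel_left = origin_x + col * pixel_size
            pixel_right = pixel_left + pixel_size
            overlap_x = max(0.0, min(max_x, pixel_right) - max(min_x, pixel_left))
            area_row.append(overlap_x * overlap_y)
        areas.append(area_row)
    total = sum(sum(area_row) for area_row in areas)
    if total == 0.0:
        return areas  # no overlap, nothing to normalise
    # Normalisation: divide by the total so the weights sum to 1.0.
    return [[area / total for area in area_row] for area_row in areas]


# A 1.5 x 1.5 rectangle on a 4 x 4 grid of unit pixels with top-left corner (0, 4).
bounds = (0.5, 2.0, 2.0, 3.5)
window = pixel_window(bounds, 0.0, 4.0, 1.0, n_rows=4, n_cols=4)
weights = rectangle_weights(bounds, 0.0, 4.0, 1.0, window)
```

For real polygons the per-pixel overlap would come from a geometric intersection (for example with shapely, which is listed in the requirements) instead of the rectangle arithmetic used here.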
Requirements
- Python >= 3.9
- geopandas >= 1.0.1
- rasterio >= 1.4.3
- pandas >= 2.3.3
- numpy >= 2.0.2
- shapely >= 2.0.7
- tqdm >= 4.67.1
- click >= 8.1.7
- pyarrow (for parquet support)
License
See the LICENSE file for details.
Building and Publishing
Building the Package
To build the package for distribution:
# Install build tools
pip install build
# Build the package
python -m build
This will create both .tar.gz (source distribution) and .whl (wheel) files in the dist/ directory.
Publishing to PyPI
# Install twine
pip install twine
# Upload to TestPyPI first (recommended)
python -m twine upload --repository testpypi dist/*
# Upload to PyPI
python -m twine upload dist/*
Contributing
Contributions are welcome! Please feel free to submit issues or pull requests.
Citation
If you use FASRAT in your research, please cite it appropriately.