High-performance Python library for analyzing Forest Inventory and Analysis (FIA) data
pyFIA
A high-performance Python library for analyzing USDA Forest Inventory and Analysis (FIA) data using modern data science tools.
Overview
pyFIA provides a programmatic API for working with Forest Inventory and Analysis (FIA) data. It leverages modern Python data science tools like Polars and DuckDB for efficient processing of large-scale national forest inventory datasets with statistically valid estimation methods.
Features
Core Estimation Functions
- ✅ Trees per acre (tpa()) - Live and dead tree abundance
- ✅ Biomass (biomass()) - Above/belowground biomass and carbon
- ✅ Volume (volume()) - Merchantable volume (cubic feet)
- ✅ Forest area (area()) - Forest land area by category
- ✅ Mortality (mortality()) - Annual mortality rates
- ✅ Growth (growth()) - Net growth estimation
Statistical Methods
- Design-based estimation following Bechtold & Patterson (2005)
- Post-stratified estimation with proper variance calculation
- Temporally indifferent (TI) estimation matching EVALIDator default
- EVALID-based filtering for statistically valid estimates
- Ratio-of-means estimators for per-acre values
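To make the ratio-of-means idea concrete, here is a minimal sketch in plain Python. It is not pyFIA's implementation: the function names (`stratified_mean`, `ratio_of_means`), the plot tuples, and the stratum weights are all hypothetical, and the sketch leaves out adjustment factors and variance estimation. Each plot carries a numerator value (for example, trees observed on the plot) and a denominator value (for example, the forested proportion of the plot). Both are averaged within strata, combined using the stratum weights, and only then divided.

```python
from collections import defaultdict


def stratified_mean(values_by_stratum, weights):
    """Weighted sum of per-stratum plot means (weights are assumed to sum to 1)."""
    return sum(
        weights[stratum] * sum(values) / len(values)
        for stratum, values in values_by_stratum.items()
    )


def ratio_of_means(plots, weights):
    """Ratio of the post-stratified mean of y to the post-stratified mean of x.

    plots: iterable of (stratum, y, x) tuples, one per plot (hypothetical layout).
    """
    y_by_stratum = defaultdict(list)
    x_by_stratum = defaultdict(list)
    for stratum, y, x in plots:
        y_by_stratum[stratum].append(y)
        x_by_stratum[stratum].append(x)
    return stratified_mean(y_by_stratum, weights) / stratified_mean(x_by_stratum, weights)


# Made-up data: y = trees on the plot, x = forested proportion of the plot.
plots = [
    ("A", 120.0, 1.0),
    ("A", 80.0, 1.0),
    ("B", 30.0, 0.5),
    ("B", 0.0, 0.0),
]
weights = {"A": 0.6, "B": 0.4}

estimate = ratio_of_means(plots, weights)
print(f"per-acre estimate: {estimate:.2f}")
```

Note that the ratio is taken after averaging, not plot by plot. A mean of per-plot ratios would be undefined for the nonforest plot in stratum B, whose denominator is zero.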
Performance Features
- DuckDB backend for efficient large-scale data processing
- Polars DataFrames for fast in-memory operations
- Lazy evaluation for memory-efficient workflows
- Parallel processing support
Installation
# Basic installation
pip install pyfia
# With spatial analysis support
pip install pyfia[spatial]
# For development
pip install -e .[dev]
Quick Start
from pyfia import FIA, biomass, tpa, volume, area

# Load FIA data and filter to a state
with FIA("path/to/FIA_database.duckdb") as db:
    # Filter to state (required before estimation)
    db.clip_by_state(37)  # North Carolina
    db.clip_most_recent(eval_type="EXPVOL")

    # Get trees per acre (live trees on forestland)
    tpa_results = tpa(db, tree_domain="STATUSCD == 1")

    # Get biomass estimates
    biomass_results = biomass(db, land_type="forest")

    # Get forest area
    area_results = area(db, land_type="forest")

    # Get volume estimates
    volume_results = volume(db, land_type="forest")
Domain Filtering and Grouping
pyFIA supports flexible domain filtering and grouping:
# Tree-level filtering (snake_case parameters)
tpa_live = tpa(db, tree_domain="STATUSCD == 1")
# Group by species
biomass_by_species = biomass(db, by_species=True)
# Area domain filtering
area_timberland = area(db, land_type="timber")
# Group by custom column
volume_by_owner = volume(db, grp_by="OWNGRPCD")
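Conceptually, a domain filter acts as an indicator: records outside the domain contribute zero to the total, and grouping simply accumulates those contributions per group value. The sketch below illustrates that idea in plain Python. It is an assumption-laden illustration, not pyFIA's internals: `grouped_total` is a hypothetical helper, the tree records are made up, and only the key names mirror FIADB columns.

```python
from collections import defaultdict

# Hypothetical tree records; keys mirror FIADB column names, values are made up.
trees = [
    {"STATUSCD": 1, "SPCD": 131, "TPA_UNADJ": 6.0},
    {"STATUSCD": 1, "SPCD": 131, "TPA_UNADJ": 6.0},
    {"STATUSCD": 2, "SPCD": 131, "TPA_UNADJ": 6.0},  # not live, so outside the domain
    {"STATUSCD": 1, "SPCD": 316, "TPA_UNADJ": 6.0},
]


def grouped_total(records, in_domain, group_key, value_key):
    """Sum value_key per group, counting only records where in_domain(record) is true."""
    totals = defaultdict(float)
    for record in records:
        indicator = 1.0 if in_domain(record) else 0.0  # domain indicator
        totals[record[group_key]] += indicator * record[value_key]
    return dict(totals)


# Rough analogue of tree_domain="STATUSCD == 1" combined with grouping by species.
live_tpa_by_species = grouped_total(
    trees,
    in_domain=lambda tree: tree["STATUSCD"] == 1,
    group_key="SPCD",
    value_key="TPA_UNADJ",
)
print(live_tpa_by_species)
```

Multiplying by the indicator, rather than dropping rows, keeps every record in the sample, which is what a design-based estimator needs for its variance calculation.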
Data Organization
pyFIA follows FIA's evaluation-based data structure:
- EVALID: 6-digit codes identifying statistically valid plot groupings
- Evaluation types: EXPALL (area), EXPVOL (volume), EXPMORT (mortality), EXPGROW (growth)
- EVALID management: Use db.clip_most_recent(eval_type="EXPVOL") to select the latest evaluations
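As an illustration of what a 6-digit EVALID encodes, the sketch below splits one into its parts. It assumes the layout commonly described for FIADB (2-digit state FIPS code, 2-digit inventory year, 2-digit evaluation type code). The `Evalid` tuple and `parse_evalid` function are hypothetical helpers written for this example, not part of pyFIA, and the sample value 372301 is made up.

```python
from typing import NamedTuple


class Evalid(NamedTuple):
    state_fips: int   # e.g. 37 for North Carolina
    year_2digit: int  # last two digits of the inventory year
    type_code: int    # evaluation type code


def parse_evalid(evalid: int) -> Evalid:
    """Split an EVALID into (state, year, type), assuming a SSYYTT layout."""
    text = f"{evalid:06d}"  # zero-pad so single-digit state codes still parse
    if evalid < 0 or len(text) != 6:
        raise ValueError(f"expected a non-negative EVALID of at most 6 digits, got {evalid}")
    return Evalid(int(text[:2]), int(text[2:4]), int(text[4:]))


parsed = parse_evalid(372301)
print(parsed)
```

In practice the evaluation tables in the database, not the digits of the code, are the authoritative source for which plots and evaluation types an EVALID covers.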
Advanced Usage
# Context manager for automatic connection handling
with FIA("path/to/FIA_database.duckdb") as db:
    # Filter to state and most recent evaluation
    db.clip_by_state(37)  # North Carolina
    db.clip_most_recent(eval_type="EXPVOL")

    # Biomass by species
    results = biomass(db, by_species=True)

    # Multiple estimations with same connection
    tpa_results = tpa(db, tree_domain="STATUSCD == 1")
    volume_results = volume(db, tree_domain="DIA >= 10.0")
    area_results = area(db, land_type="timber")
Documentation
Full documentation is available at https://mihiarc.github.io/pyfia/
Performance
pyFIA achieves excellent performance through modern database technologies:
- 10-100x faster for large-scale queries using DuckDB columnar storage
- 2-5x faster for in-memory operations using Polars DataFrames
- Statistically valid estimates following FIA methodology
Citation
If you use pyFIA in your research, please cite:
@software{pyfia2024,
title = {pyFIA: A Python Library for Forest Inventory Analysis},
author = {Mihiar, Chris},
year = {2024},
url = {https://github.com/mihiarc/pyfia}
}
License
MIT License - see LICENSE file for details.
Acknowledgments
- Uses USDA Forest Service FIA data
- Statistical methods from Bechtold & Patterson (2005):
- Bechtold, W.A.; Patterson, P.L., eds. 2005. The Enhanced Forest Inventory and Analysis Program - National Sampling Design and Estimation Procedures. Gen. Tech. Rep. SRS-80. Asheville, NC: U.S. Department of Agriculture, Forest Service, Southern Research Station. 85 p. https://doi.org/10.2737/SRS-GTR-80
- Key equations: Chapter 4 (pp. 53-77) - see Eq. 4.1 (domain indicator), Eq. 4.2 (adjustment factor), Eq. 4.8 (tree attributes), Section 4.2 (variance estimation)
- Inspired by various FIA analysis tools and methodologies in the forestry community