A package for Adaptive Spatio-Temporal Model (AdaSTEM) in python

Project description

stemflow


Installation

pip install stemflow

Brief introduction

stemflow is a toolkit for the Adaptive Spatio-Temporal Model (AdaSTEM) in Python. A typical use is daily abundance estimation from eBird citizen science data. It leverages the "adjacency" information of surrounding target values in space and time to predict the class or continuous value of a target spatio-temporal point. In the demo, we use a two-step hurdle model as the "base model", with XGBClassifier for occurrence modeling and XGBRegressor for abundance modeling.
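To make the two-step hurdle idea concrete, here is a minimal pure-Python sketch. It is not stemflow's Hurdle implementation: ThresholdClassifier, MeanRegressor and SimpleHurdle are hypothetical stand-ins invented for this illustration, and the toy models replace XGBoost so the example runs with the standard library alone.

```python
from statistics import mean


class ThresholdClassifier:
    """Toy occurrence model: predicts presence when the feature exceeds a cut."""

    def fit(self, X, y):
        present = [x for x, label in zip(X, y) if label == 1]
        absent = [x for x, label in zip(X, y) if label == 0]
        # Cut halfway between the class means (toy rule, for illustration only).
        self.cut_ = (mean(present) + mean(absent)) / 2
        return self

    def predict(self, X):
        return [1 if x > self.cut_ else 0 for x in X]


class MeanRegressor:
    """Toy abundance model: always predicts the mean of the training targets."""

    def fit(self, X, y):
        self.mean_ = mean(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


class SimpleHurdle:
    """Step 1: classify occurrence. Step 2: regress abundance on positives only."""

    def __init__(self, classifier, regressor):
        self.classifier = classifier
        self.regressor = regressor

    def fit(self, X, y):
        occurrence = [1 if value > 0 else 0 for value in y]
        self.classifier.fit(X, occurrence)
        # The regressor only sees samples where the target is non-zero.
        X_pos = [x for x, value in zip(X, y) if value > 0]
        y_pos = [value for value in y if value > 0]
        self.regressor.fit(X_pos, y_pos)
        return self

    def predict(self, X):
        occ = self.classifier.predict(X)
        abundance = self.regressor.predict(X)
        # Predicted absence is reported as zero abundance.
        return [a if o == 1 else 0.0 for o, a in zip(occ, abundance)]


X = [0.1, 0.2, 0.3, 0.8, 0.9, 1.0]   # one toy feature per sample
y = [0, 0, 0, 4, 6, 8]               # zero-inflated counts
hurdle = SimpleHurdle(ThresholdClassifier(), MeanRegressor()).fit(X, y)
preds = hurdle.predict([0.15, 0.95])
print(preds)
```

Separating occurrence from abundance keeps the many zero counts in checklist data from dragging the abundance estimates toward zero.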

Users can define the size of a stixel (spatio-temporal pixel) in terms of space and time. Larger stixels generalize better but lose precision at fine resolution; smaller stixels may predict better within their own area but extrapolate less well to points outside the stixel.

In the demo, we first split the training data using temporal sliding windows with a size of 50 days of year (DOY) and a step of 20 DOY (temporal_start=1, temporal_end=366, temporal_step=20, temporal_bin_interval=50). For each temporal slice, a spatial gridding is applied: a stixel is forced to split into four smaller quarters if its edge is longer than 25 units (measured in degrees of longitude and latitude; grid_len_lon_upper_threshold=25, grid_len_lat_upper_threshold=25), and splitting stops when it would shrink the edge length below 5 units (grid_len_lon_lower_threshold=5, grid_len_lat_lower_threshold=5) or leave a stixel with fewer than 50 checklists (points_lower_threshold=50).
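The temporal slicing can be sketched as follows. This is only an illustration of a sliding window with the demo's values. The helper temporal_windows is hypothetical, and how stemflow itself treats window edges (inclusive or exclusive ends, truncation at temporal_end) is an assumption here.

```python
def temporal_windows(temporal_start, temporal_end, temporal_step, temporal_bin_interval):
    """Return (start, end) pairs of sliding windows; end is treated as exclusive."""
    windows = []
    start = temporal_start
    while start <= temporal_end:
        windows.append((start, start + temporal_bin_interval))
        start += temporal_step
    return windows


windows = temporal_windows(
    temporal_start=1, temporal_end=366, temporal_step=20, temporal_bin_interval=50
)

# Because the step (20) is smaller than the window size (50), windows overlap,
# so a single day of year is covered by more than one temporal slice.
doy = 60
covering = [(start, end) for start, end in windows if start <= doy < end]
print(len(windows), covering)
```

With these settings, most days fall into two or three overlapping windows, so each observation contributes to several temporal slices.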

This process is executed 10 times (ensemble_fold=10), each time with random jitter and random rotation of the gridding, generating 10 ensembles. In the prediction phase, only spatio-temporal points with at least 7 usable ensembles (min_ensemble_required=7) are predicted; all others are set to np.nan.
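A single gridding pass, of which the ensemble repeats ten, can be sketched with a small recursive quadtree. This is a rough illustration under stated assumptions, not stemflow's QuadTree code. The function split_stixel is hypothetical, the stopping rule checks the would-be children (an assumption about how the thresholds are applied), and random jitter and rotation are omitted for brevity.

```python
import random


def split_stixel(points, x0, y0, width, height, upper=25, lower=5, min_points=50):
    """Recursively split a cell; return leaves as (x0, y0, width, height, n_points)."""
    half_w, half_h = width / 2, height / 2
    children = []
    for cx, cy in ((x0, y0), (x0 + half_w, y0), (x0, y0 + half_h), (x0 + half_w, y0 + half_h)):
        # Half-open cells, so every point belongs to exactly one child.
        inside = [(x, y) for x, y in points if cx <= x < cx + half_w and cy <= y < cy + half_h]
        children.append((inside, cx, cy))

    too_large = width > upper or height > upper
    too_small_after_split = half_w < lower or half_h < lower
    too_sparse_after_split = any(len(inside) < min_points for inside, _, _ in children)

    # Oversized cells are always split; otherwise stop when the children
    # would be too small or would hold too few points.
    if not too_large and (too_small_after_split or too_sparse_after_split):
        return [(x0, y0, width, height, len(points))]

    leaves = []
    for inside, cx, cy in children:
        leaves.extend(split_stixel(inside, cx, cy, half_w, half_h, upper, lower, min_points))
    return leaves


random.seed(0)
# Synthetic data: a uniform background plus a dense cluster in one corner.
uniform = [(random.random() * 100, random.random() * 100) for _ in range(1000)]
cluster = [(random.random() * 25, random.random() * 25) for _ in range(1000)]
points = uniform + cluster

leaves = split_stixel(points, 0.0, 0.0, 100.0, 100.0)
print(len(leaves), sorted({leaf[2] for leaf in leaves}))
```

Dense regions end up with smaller stixels and sparse regions with larger ones, which is the adaptive part of the gridding.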

Fitting and prediction follow the conventions of scikit-learn estimators:

## fit
model.fit(X_train.reset_index(drop=True), y_train)

## predict
pred = model.predict(X_test)
pred = np.where(pred<0, 0, pred)

Here, pred is the mean of the predicted values across ensembles.
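The aggregation across ensembles can be pictured with the sketch below. The helper aggregate_ensembles is hypothetical and written for one prediction point. It assumes that min_ensemble_required is an inclusive minimum and that an unusable ensemble is represented by NaN.

```python
import math


def aggregate_ensembles(per_ensemble_preds, min_ensemble_required=7):
    """Average the usable predictions for one point, or return NaN if too few exist."""
    usable = [p for p in per_ensemble_preds if not math.isnan(p)]
    if len(usable) < min_ensemble_required:
        return math.nan
    return sum(usable) / len(usable)


nan = math.nan
enough = aggregate_ensembles([2.0, 4.0, 2.0, 4.0, 2.0, 4.0, 3.0, nan, nan, nan])
too_few = aggregate_ensembles([2.0, 4.0, 2.0, 4.0, 2.0, 4.0, nan, nan, nan, nan])

# Negative means are clipped to zero, mirroring the np.where step above.
clipped = max(0.0, aggregate_ensembles([-1.0] * 10))
print(enough, too_few, clipped)
```

Requiring a minimum number of usable ensembles avoids reporting a mean that rests on only a handful of stixels.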

Usage

import numpy as np
from stemflow.model.AdaSTEM import AdaSTEM, AdaSTEMClassifier, AdaSTEMRegressor
from stemflow.model.Hurdle import Hurdle_for_AdaSTEM
from xgboost import XGBClassifier, XGBRegressor

SAVE_DIR = './'


model = Hurdle_for_AdaSTEM(
    classifier=AdaSTEMClassifier(
        base_model=XGBClassifier(tree_method='hist', random_state=42, verbosity=0, n_jobs=1),
        save_gridding_plot=True,
        ensemble_fold=10,
        min_ensemble_required=7,
        grid_len_lon_upper_threshold=25,
        grid_len_lon_lower_threshold=5,
        grid_len_lat_upper_threshold=25,
        grid_len_lat_lower_threshold=5,
        points_lower_threshold=50,
        Spatio1='longitude',
        Spatio2='latitude',
        Temporal1='DOY',
        use_temporal_to_train=True,
        njobs=4,
    ),
    regressor=AdaSTEMRegressor(
        base_model=XGBRegressor(tree_method='hist', random_state=42, verbosity=0, n_jobs=1),
        save_gridding_plot=True,
        ensemble_fold=10,
        min_ensemble_required=7,
        grid_len_lon_upper_threshold=25,
        grid_len_lon_lower_threshold=5,
        grid_len_lat_upper_threshold=25,
        grid_len_lat_lower_threshold=5,
        points_lower_threshold=50,
        Spatio1='longitude',
        Spatio2='latitude',
        Temporal1='DOY',
        use_temporal_to_train=True,
        njobs=4,
    ),
)

## fit
model.fit(X_train.reset_index(drop=True), y_train)

## predict
pred = model.predict(X_test)
pred = np.where(pred < 0, 0, pred)  # clip negative abundance predictions to zero
eval_metrics = AdaSTEM.eval_STEM_res('hurdle', y_test, pred)
print(eval_metrics)

Plot QuadTree ensembles

model.classifier.gridding_plot
# or model.regressor.gridding_plot

[Figure: QuadTree example]

Example of visualization

[Figure: GIF visualization]

Documentation

stemflow Documentation



Release history

Version	Release date
1.1	Feb 22, 2024
1.0.9.7	Jan 31, 2024
1.0.9.6	Jan 29, 2024
1.0.9.5	Jan 26, 2024
1.0.9.4	Jan 11, 2024
1.0.9.3	Jan 10, 2024
1.0.9.2	Dec 29, 2023
1.0.9.1	Nov 5, 2023
1.0.9	Nov 5, 2023
1.0.8	Nov 4, 2023
1.0.7	Nov 1, 2023
1.0.6	Oct 19, 2023
1.0.5	Oct 18, 2023
1.0.4	Oct 18, 2023
1.0.3	Oct 18, 2023
1.0.2	Oct 1, 2023
1.0.1	Sep 21, 2023
1.0.0	Sep 21, 2023
0.0.28	Sep 20, 2023
0.0.27	Sep 20, 2023
0.0.26	Sep 20, 2023
0.0.25	Sep 20, 2023
0.0.24	Sep 18, 2023
0.0.23	Sep 18, 2023
0.0.22	Sep 14, 2023
0.0.20	Sep 14, 2023
0.0.19	Sep 14, 2023
0.0.18	Sep 13, 2023
0.0.17	Sep 13, 2023
0.0.16	Sep 13, 2023
0.0.15	Sep 13, 2023
0.0.14	Sep 13, 2023
0.0.13	Sep 13, 2023
0.0.12	Sep 13, 2023
0.0.11	Sep 13, 2023
0.0.10	Sep 12, 2023
0.0.9	Sep 12, 2023
0.0.8	Sep 12, 2023 (this version)
0.0.7	Sep 12, 2023
0.0.6	Sep 12, 2023
0.0.5	Sep 12, 2023
0.0.4	Sep 12, 2023
0.0.3	Sep 7, 2023
0.0.2	Sep 7, 2023
0.0.1	Sep 7, 2023

Download files


Source Distribution

stemflow-0.0.8.tar.gz (57.1 MB)

Uploaded Sep 12, 2023 Source

Built Distribution

stemflow-0.0.8-py3-none-any.whl (34.2 kB)

Uploaded Sep 12, 2023 Python 3

Hashes for stemflow-0.0.8.tar.gz

Algorithm	Hash digest
SHA256	`194af6f1247875eb7e0eadbec327f29e29d569484337cc45e003b7a565315814`
MD5	`e7b6673cb76f1653dc2a66475066b855`
BLAKE2b-256	`f43e848f64b2e488feab1a7f4da2582579ec62865b288eeeadb17f331cb47433`

Hashes for stemflow-0.0.8-py3-none-any.whl

Algorithm	Hash digest
SHA256	`cd95862ec49fd222dd0c52a518013ca892509cb6bf316b5fdd6e183a2bde0124`
MD5	`6d049901fe7a731ce256e4a24a55eb1c`
BLAKE2b-256	`e0161c1c582a3f99992d0ae1af37ec1e6668e79938793d8aa808f4047c7b2d02`