A package for the Adaptive Spatio-Temporal Exploratory Model (AdaSTEM) in Python

stemflow :bird:

Documentation :book:

stemflow Documentation

Installation :wrench:

pip install stemflow

Mini Test :test_tube:

To run an automated mini test, call:

from stemflow.mini_test import run_mini_test

run_mini_test(delet_tmp_files=True)

Or, if you cloned the package from the GitHub repository, you can run the Python script:

git clone https://github.com/chenyangkang/stemflow.git
cd stemflow

pip install -r requirements.txt  # install dependencies

pip install .  # install stemflow from the cloned source

chmod 755 mini_test.py
python mini_test.py # run the test

See the Mini Test section of the documentation for further illustration of the mini test.

Brief introduction :information_source:

Stemflow is a toolkit for the Adaptive Spatio-Temporal Exploratory Model (AdaSTEM [1,2]) in Python. A typical use is daily abundance estimation from eBird citizen science data. It leverages the "adjacency" information of surrounding target values in space and time to predict the classes or continuous values of target spatio-temporal points. In the demo, we use a two-step hurdle model as the "base model", with XGBClassifier for occurrence modeling and XGBRegressor for abundance modeling.

Users can define the size of the stixels (spatio-temporal pixels) in terms of space and time. Larger stixels promote generalizability but lose precision at fine resolution; smaller stixels may predict better within their exact area but extrapolate less well to points outside the stixel.

In the demo, we first split the training data using temporal sliding windows with a size of 50 days of year (DOY) and a step of 20 DOY (temporal_start=1, temporal_end=366, temporal_step=20, temporal_bin_interval=50). For each temporal slice, a spatial gridding is applied: a stixel is forced to split into four smaller quadrants if its edge is longer than 25 units (measured in longitude and latitude; grid_len_lon_upper_threshold=25, grid_len_lat_upper_threshold=25), and splitting stops before the edge length would shrink below 5 units (grid_len_lon_lower_threshold=5, grid_len_lat_lower_threshold=5) or the stixel would contain fewer than 50 checklists (points_lower_threshold=50). Model fitting runs on 4 cores (njobs=4).
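The temporal slicing described above can be sketched in a few lines. This is a minimal illustration, not stemflow's implementation: the function name `sliding_windows` is hypothetical, and treating each window as the half-open interval [start, start + interval) is an assumption.

```python
def sliding_windows(temporal_start=1, temporal_end=366,
                    temporal_step=20, temporal_bin_interval=50):
    """Return (window_start, window_end) pairs covering the DOY range.

    Hypothetical helper: windows are assumed to be half-open,
    [window_start, window_end), and to start every `temporal_step` days.
    """
    windows = []
    start = temporal_start
    while start < temporal_end:
        windows.append((start, start + temporal_bin_interval))
        start += temporal_step
    return windows


windows = sliding_windows()

# A sample on day 45 falls into every window whose interval contains it,
# so neighbouring windows overlap and share training data.
doy = 45
covering = [w for w in windows if w[0] <= doy < w[1]]
print(len(windows), covering)
```

Because the step (20) is smaller than the window size (50), each day of year is covered by more than one window, which is what lets neighbouring temporal slices share training data.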

This process is executed 10 times (ensemble_fold=10), each time with a random jitter and a random rotation of the gridding, generating 10 ensembles. In the prediction phase, only spatio-temporal points covered by more than 7 usable ensembles (min_ensemble_required=7) are predicted; all other points are set to np.nan.
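The aggregation rule can be illustrated with a small, dependency-free sketch. The function `aggregate_ensembles` is hypothetical, and the strict "more than" comparison simply follows the wording above; whether the library itself compares with `>` or `>=` is not confirmed here.

```python
import math


def aggregate_ensembles(predictions, min_ensemble_required=7):
    """Average per-point predictions across ensembles.

    `predictions[i][j]` is the prediction of ensemble j for point i, or
    NaN when that ensemble has no usable stixel for the point.
    Assumption: a point is kept only when MORE THAN
    `min_ensemble_required` ensembles are usable.
    """
    means = []
    for row in predictions:
        usable = [p for p in row if not math.isnan(p)]
        if len(usable) > min_ensemble_required:
            means.append(sum(usable) / len(usable))
        else:
            means.append(math.nan)
    return means


nan = math.nan
preds = [
    [1.0] * 10,             # covered by all 10 ensembles
    [2.0] * 8 + [nan] * 2,  # covered by 8 ensembles
    [3.0] * 7 + [nan] * 3,  # covered by only 7 ensembles
]
result = aggregate_ensembles(preds)
print(result)
```

A higher threshold yields predictions backed by more ensembles, at the cost of leaving more points unpredicted.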

Usage :star:

from stemflow.model.AdaSTEM import AdaSTEM, AdaSTEMClassifier, AdaSTEMRegressor
from stemflow.model.Hurdle import Hurdle_for_AdaSTEM
from xgboost import XGBClassifier, XGBRegressor

SAVE_DIR = './'

# With a hurdle model, we first run classification on presence/absence information,
# then run regression on the positive samples only.

model = Hurdle_for_AdaSTEM(
    classifier=AdaSTEMClassifier(base_model=XGBClassifier(tree_method='hist',random_state=42, verbosity = 0, n_jobs=1),
                                save_gridding_plot = True,
                                ensemble_fold=10, 
                                min_ensemble_required=7,
                                grid_len_lon_upper_threshold=25,
                                grid_len_lon_lower_threshold=5,
                                grid_len_lat_upper_threshold=25,
                                grid_len_lat_lower_threshold=5,
                                points_lower_threshold=50,
                                Spatio1='longitude',
                                Spatio2 = 'latitude', 
                                Temporal1 = 'DOY',
                                use_temporal_to_train=True,
                                njobs=4),
    regressor=AdaSTEMRegressor(base_model=XGBRegressor(tree_method='hist',random_state=42, verbosity = 0, n_jobs=1),
                                save_gridding_plot = True,
                                ensemble_fold=10, 
                                min_ensemble_required=7,
                                grid_len_lon_upper_threshold=25,
                                grid_len_lon_lower_threshold=5,
                                grid_len_lat_upper_threshold=25,
                                grid_len_lat_lower_threshold=5,
                                points_lower_threshold=50,
                                Spatio1='longitude',
                                Spatio2 = 'latitude', 
                                Temporal1 = 'DOY',
                                use_temporal_to_train=True,
                                njobs=4)
)
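The two-step hurdle logic described in the code comments can be illustrated with a small, dependency-free sketch. It is not stemflow's `Hurdle_for_AdaSTEM`: the classes `Hurdle`, `ThresholdClassifier`, and `MeanRegressor` are hypothetical stand-ins, and reporting predicted absences as zero abundance is an assumption about how the two steps are combined.

```python
class Hurdle:
    """Toy two-step hurdle: classify presence, then regress abundance.

    Works with any classifier/regressor exposing fit(X, y) and predict(X).
    """

    def __init__(self, classifier, regressor):
        self.classifier = classifier
        self.regressor = regressor

    def fit(self, X, y):
        # Step 1: presence/absence is learned from all samples.
        self.classifier.fit(X, [1 if v > 0 else 0 for v in y])
        # Step 2: abundance is learned from positive samples only.
        positive = [(x, v) for x, v in zip(X, y) if v > 0]
        self.regressor.fit([x for x, _ in positive],
                           [v for _, v in positive])
        return self

    def predict(self, X):
        present = self.classifier.predict(X)
        abundance = self.regressor.predict(X)
        # Assumption: predicted absences are reported as zero abundance.
        return [a if p == 1 else 0.0 for p, a in zip(present, abundance)]


class ThresholdClassifier:
    """Stand-in classifier: present when the single feature exceeds the
    midpoint between the two class means."""

    def fit(self, X, y):
        pos = [x[0] for x, label in zip(X, y) if label == 1]
        neg = [x[0] for x, label in zip(X, y) if label == 0]
        self.cut = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
        return self

    def predict(self, X):
        return [1 if x[0] > self.cut else 0 for x in X]


class MeanRegressor:
    """Stand-in regressor: always predicts the training mean."""

    def fit(self, X, y):
        self.mean = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean for _ in X]


X_toy = [[0.0], [1.0], [2.0], [8.0], [9.0], [10.0]]
y_toy = [0, 0, 0, 4, 6, 8]
toy_model = Hurdle(ThresholdClassifier(), MeanRegressor()).fit(X_toy, y_toy)
toy_pred = toy_model.predict([[1.5], [9.5]])
print(toy_pred)
```

Training the regressor on positive samples only keeps the many zero counts typical of checklist data from pulling abundance estimates toward zero.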

Fitting and prediction follow the scikit-learn estimator interface:

import numpy as np

## fit
model.fit(X_train.reset_index(drop=True), y_train)

## predict
pred = model.predict(X_test)
pred = np.where(pred<0, 0, pred)  # clip negative predictions to zero
eval_metrics = AdaSTEM.eval_STEM_res('hurdle', y_test, pred)
print(eval_metrics)

Here, pred is the mean of the predicted values across ensembles.

See AdaSTEM demo for further functionality.

Plot QuadTree ensembles :evergreen_tree:

model.classifier.gridding_plot
# or model.regressor.gridding_plot

QuadTree example

Here, each color shows an ensemble generated during model fitting. In each of the 10 ensembles, regions (in terms of space and time) with more training samples were gridded at finer resolution, while sparse regions remained coarse. Prediction results were aggregated across the ensembles (that is, in this example, the data were passed through 10 times).
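How denser regions end up with finer cells can be sketched with a recursive split on synthetic points. This is an illustration under stated assumptions, not the library's QuadTree: the function `split_stixel` is hypothetical, cells are square and axis-aligned, and the random jitter and rotation applied per ensemble are omitted.

```python
import random


def split_stixel(points, x0, y0, size,
                 upper=25, lower=5, points_lower_threshold=50):
    """Recursively split a square cell; return leaf cells as
    (x0, y0, size, n_points) tuples.

    Assumed rule: a cell wider than `upper` is always split; otherwise it
    is split only while each child edge stays >= `lower` and the cell
    holds at least `points_lower_threshold` points.
    """
    inside = [(x, y) for x, y in points
              if x0 <= x < x0 + size and y0 <= y < y0 + size]
    half = size / 2
    must_split = size > upper
    may_split = half >= lower and len(inside) >= points_lower_threshold
    if not (must_split or may_split):
        return [(x0, y0, size, len(inside))]
    leaves = []
    for dx in (0, half):
        for dy in (0, half):
            leaves += split_stixel(inside, x0 + dx, y0 + dy, half,
                                   upper, lower, points_lower_threshold)
    return leaves


random.seed(0)
# 400 points clustered in one corner, 20 scattered over the whole domain.
dense = [(random.random() * 10, random.random() * 10) for _ in range(400)]
sparse = [(random.random() * 40, random.random() * 40) for _ in range(20)]
leaves = split_stixel(dense + sparse, 0, 0, 40)
for x0, y0, size, n in sorted(leaves):
    print(f"cell at ({x0}, {y0}) size {size}: {n} points")
```

In this toy run the corner holding the dense cluster is divided down to the lower edge limit, while the sparsely sampled remainder of the domain stays in large cells.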

Example of visualization :world_map:

GIF visualization

See section Prediction and Visualization for how to generate this GIF.

Contribute to stemflow :purple_heart:

Pull requests are welcome! Open an issue so that we can discuss the implementation details.

Application-level collaboration is also welcome! My domain knowledge is in avian ecology and evolution.

You can contact me at chenyangkang24@outlook.com

References:

stemflow 0.0.19
