A general purpose stepshifting algorithm for tabular data, based on BaseEstimator.
StepShifter3 🛠️
A general purpose Python package for time series analysis of tabular data
StepShifter3 is a Python package designed to facilitate time series analysis of tabular data. It is developed and maintained by the Peace Research Institute Oslo (PRIO) as part of the VIEWS project.
🛠 Installation
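StepShifter3 can be installed in two ways: with pip or from a GitHub clone. Both options are listed at the end of this section, after the note on which branch to use.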
🚨 Recommended Branch: stable
For a more stable experience, we recommend using the stable branch rather than the main branch. The stable branch contains well-tested and production-ready code, while the main branch may contain work-in-progress or experimental features that could be unstable.
How to Switch to the stable branch:
Using Git CLI:
- For pip installation, clone the stable branch directly: git clone -b stable https://github.com/YourUsername/StepShifter3.git
- If you've already cloned the repository and are on the main branch, switch to stable with: git checkout stable
Using GitHub Web Interface:
- If you're downloading the code from the GitHub web interface, make sure to switch to the stable branch using the branch dropdown before downloading.
- Using pip: 📦
  pip install StepShifter3
- From GitHub: 🐱💻
  git clone https://github.com/YourUsername/StepShifter3.git
  cd StepShifter3
  python setup.py install
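After either install route, one quick way to confirm that the package is visible to the current interpreter is to query its distribution metadata. The sketch below uses only the standard library. The helper name installed_version is hypothetical, and the distribution name StepShifter3 is taken from the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "StepShifter3" is the name used in the pip command above.
    found = installed_version("StepShifter3")
    if found is None:
        print("StepShifter3 is not installed in this environment")
    else:
        print(f"StepShifter3 {found} is installed")
```

If the check reports that the package is missing after a GitHub install, the install step was most likely run with a different interpreter or virtual environment than the one running the check.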
📝 Usage
The StepShifter class is the main class of the package. It works with any model that inherits from the scikit-learn BaseEstimator class.
Basic Usage with XGBRegressor and dummy data from synthetic data generator
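from StepShifter3 import StepShifter, SyntheticDataGenerator
# Assumption: DaskClientManager is exported at package level; adjust the import path if it lives in a submodule
from StepShifter3 import DaskClientManager
import xgboost.dask  # provides xgboost.dask.DaskXGBRegressor, used below
from xgboost import XGBRegressor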
# Generate a pandas MultiIndex DataFrame with dummy data. Indexes: month_id, country_id
df_synthetic_small = SyntheticDataGenerator(
    "loa", n_time=516, n_prio_grid_size=50, n_country_size=242, n_features=15, use_dask=True
).generate_dataframe()
# Parameters for the XGBoost regressor that StepShifter will wrap
params_xgb_reg = {
    'objective': 'reg:squarederror',
    'n_estimators': 80,
    'max_depth': 3,
    'learning_rate': 0.1,
    'gamma': 0,
    'min_child_weight': 1,
    'subsample': 1,
    'eval_metric': 'rmse',
}
# Establish a connection through DaskClientManager
dask_client = DaskClientManager(
    is_local=True, n_workers=8, threads_per_worker=1,
    memory_limit="4.5GB", remote_addresses=None, asynchronous=False,
)
stepshifter_config_regression = {
    "target_column": "ln_ged_sb_dep",  # The target column in your training dataset
    "ID_columns": ["month_id", "priogrid_id"],  # The ID columns in your training dataset
    "time_column": "month_id",  # The time column in your training dataset
    "run_name": 'my_first_run',  # The name of the run in MLflow; should be changed every time a new model type is run
    "experiment_name": 'ensemble_models',  # The name of the experiment in MLflow
    "mlflow_tracking_uri": 'http://127.0.0.1:5000',  # The URI of the MLflow server; if not set, the default is localhost:5000 or 127.0.0.1:5000
    "S": 36,  # Number of steps ahead to predict
    "metrics_report": True,  # Not used at the moment
    "fit_params": {},  # Parameters to be passed to the fit method of the model
    "dask_client": dask_client,  # Dask client
    "is_dask": True,  # Set to True if using Dask dataframes
}
# Initialize the StepShifter class
stepshifter = StepShifter(xgboost.dask.DaskXGBRegressor(**params_xgb_reg), stepshifter_config_regression)
# Which part of the data should be validated
validation_range = [1, 516]
# Use the dataframe generated above (df_synthetic_small)
X, y, is_dask = stepshifter.validate_and_filter_data(df_synthetic_small, validation_range)
# Fit the model
tau_e_0 = 121
tau_e_t = 316
stepshifter.fit(X, y, tau_e_0, tau_e_t)
# Get predictions
X_pred = ...
tau_start = ...
tau_end = ...
stepshifter.predict(X_pred, tau_start, tau_end)
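To make the role of S more concrete, the following is a minimal, library-free sketch of the general step-shifting idea: for each step s, features observed at time t are paired with the target observed at time t + s, and one model would be trained per step. This is an illustration under stated assumptions, not StepShifter3's actual implementation. The function name make_step_pairs and the toy data are hypothetical.

```python
from typing import Dict, List, Tuple

# Toy series for a single unit: time index -> (feature, target).
Series = Dict[int, Tuple[float, float]]


def make_step_pairs(series: Series, step: int) -> List[Tuple[float, float]]:
    """Pair the feature at time t with the target at time t + step."""
    pairs = []
    for t in sorted(series):
        if t + step in series:
            feature_t = series[t][0]
            target_future = series[t + step][1]
            pairs.append((feature_t, target_future))
    return pairs


toy: Series = {1: (0.1, 10.0), 2: (0.2, 20.0), 3: (0.3, 30.0), 4: (0.4, 40.0)}

# One training set per step s = 1..S; a real pipeline would fit one model on each.
S = 2
training_sets = {s: make_step_pairs(toy, s) for s in range(1, S + 1)}
```

Each additional step loses one usable row at the end of the series, which is why the training window has to leave room for the largest step.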
🤝 Contributing
Contributions are welcome! To contribute:
- Make an issue describing the feature you want to add or the bug you want to fix.
- Create your Feature Branch (git checkout -b <issuenumber>-<your-feature-name>)
- Commit your Changes (git commit -m 'Add some AmazingFeature')
- Push to the Branch (git push origin <issuenumber>-<your-feature-name>)
- Open a Pull Request
🐞 Common bugs
Using the wrong predict() function
An easy-to-make mistake is to use the wrong predict() function. Make sure to use the StepShifter predict() function by running predict() on the StepShifter object and not on the trained models.
Correct use of the StepShifter predict():
stepshifter.predict(X, tau_start, tau_end)
Incorrect use of the StepShifter predict():
stepshifter.models[<some_number_between_1_and_S>].predict(X, tau_start, tau_end)
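One plausible way to see why the two calls differ is with a toy wrapper. In the sketch below, ToyStepShifter and ToyModel are hypothetical stand-ins, not StepShifter3 classes. The per-step models follow the scikit-learn convention of predict(X), so only the wrapper understands the extra tau_start and tau_end arguments. How StepShifter3 uses those arguments internally is not described here; the toy wrapper simply filters rows by time as an assumption.

```python
from typing import Dict, List, Tuple

Row = Tuple[int, float]  # (time index, feature value)


class ToyModel:
    """Stand-in for a fitted per-step model with a scikit-learn style predict(X)."""

    def __init__(self, step: int) -> None:
        self.step = step

    def predict(self, X: List[float]) -> List[float]:
        # Dummy rule so the output is easy to follow: feature plus step.
        return [x + self.step for x in X]


class ToyStepShifter:
    """Stand-in wrapper that owns one model per step and a time-aware predict."""

    def __init__(self, n_steps: int) -> None:
        self.models: Dict[int, ToyModel] = {s: ToyModel(s) for s in range(1, n_steps + 1)}

    def predict(self, X: List[Row], tau_start: int, tau_end: int) -> Dict[int, List[float]]:
        # Keep only rows inside the requested window, then query every step model.
        window = [value for t, value in X if tau_start <= t <= tau_end]
        return {s: m.predict(window) for s, m in self.models.items()}


rows = [(1, 0.0), (2, 1.0), (3, 2.0)]
shifter = ToyStepShifter(n_steps=2)
wrapper_output = shifter.predict(rows, tau_start=2, tau_end=3)

# Calling a per-step model with the wrapper's signature raises a TypeError here.
try:
    shifter.models[1].predict(rows, 2, 3)
    direct_call_failed = False
except TypeError:
    direct_call_failed = True
```

In this toy setup the direct call fails loudly. With a real model the failure mode may differ, so the safest habit is to always call predict() on the StepShifter object.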
🔖 License
Distributed under the MIT License. See LICENSE for more information.
File details
Details for the file stepshifter3-0.2.0b0.tar.gz.
File metadata
- Download URL: stepshifter3-0.2.0b0.tar.gz
- Upload date:
- Size: 20.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.9.18
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | c8a976760fd8a5c4438bec45c0048377234b4a491779567ed4ef3b0f47413731 |
| MD5 | ba2c3d46f0d74be1ecc6105c23dba87a |
| BLAKE2b-256 | 3a12eca48faae276fba5ef1013681877e0b50a5711edb41cefd5aebe35793887 |
File details
Details for the file stepshifter3-0.2.0b0-py3-none-any.whl.
File metadata
- Download URL: stepshifter3-0.2.0b0-py3-none-any.whl
- Upload date:
- Size: 17.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.9.18
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 69b433d35eb418b56054c1185a2034b6a5ea1f837abde0fa62ca22783726dafd |
| MD5 | e35f28b147564a5caa752be6edab5609 |
| BLAKE2b-256 | 2f13ebd8b57647022c6c8aefde3b597627e473f2310756f38afbf39df71b38ee |
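A downloaded file can be checked against the SHA256 digests listed above with the standard library. The sketch below is one way to do it; the helper names sha256_of_file and matches_published_digest are hypothetical, and the local file path is whatever location the archive was saved to.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path: Path, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with a published one, ignoring case."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, matches_published_digest(Path("stepshifter3-0.2.0b0.tar.gz"), "<SHA256 from the table>") should return True for an intact download of the source distribution.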