Explores time information to train a robust random forest

time-robust-forest

A proof-of-concept model that uses timestamp information to train a random forest with better out-of-distribution generalization.

Installation

pip install -U time-robust-forest

How to use it

A classifier and a regressor are available under time_robust_forest.models. Both follow the sklearn interface, so you can quickly fit and use a model:

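from time_robust_forest.models import TimeForestClassifier

features = ["x_1", "x_2"]
time_column = "periods"
target = "y"

# Tell the model which column holds the time periods (the default is "period")
model = TimeForestClassifier(time_column=time_column)

model.fit(training_data[features + [time_column]], training_data[target])
predictions = model.predict_proba(test_data[features])[:, 1]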

Only a few arguments differ from a traditional random forest:

  • time_column: the column from the input dataframe containing the time periods the model will iterate over to find the best splits (default: "period")
  • min_sample_periods: the number of examples in every period the model needs to keep while it splits.
  • period_criterion: how the performance in each period is aggregated. Options: "avg" (average) or "max" (maximum, i.e. the worst case). (default: "avg")
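To make the last two arguments more concrete, here is a minimal sketch of one way they could be interpreted. It is not the library's implementation. The helper names (split_keeps_enough_samples, aggregate_period_scores) are hypothetical, and it assumes that each child of a split must retain min_sample_periods examples from every period and that per-period scores are impurities, where lower is better.

```python
from collections import Counter
from statistics import mean


def split_keeps_enough_samples(periods_left, periods_right, all_periods, min_sample_periods):
    """Return True if both children keep at least min_sample_periods rows of every period.

    Assumption: this is one plausible reading of min_sample_periods,
    not the package's actual code.
    """
    left_counts = Counter(periods_left)
    right_counts = Counter(periods_right)
    return all(
        left_counts[p] >= min_sample_periods and right_counts[p] >= min_sample_periods
        for p in all_periods
    )


def aggregate_period_scores(scores_by_period, period_criterion="avg"):
    """Combine per-period impurity scores (lower is better) into a single number."""
    values = list(scores_by_period.values())
    if period_criterion == "avg":
        return mean(values)
    if period_criterion == "max":
        # The worst case: the period where the split performs worst
        return max(values)
    raise ValueError(f"Unknown period_criterion: {period_criterion!r}")


# Hypothetical impurity of one candidate split, measured separately in each period
scores = {2019: 0.25, 2020: 0.5, 2021: 0.75}
avg_score = aggregate_period_scores(scores, "avg")
max_score = aggregate_period_scores(scores, "max")

# Period labels of the rows that would land in each child of the candidate split
left = [2019, 2019, 2020, 2020, 2021, 2021]
right = [2019, 2019, 2020, 2020, 2021]
valid_split = split_keeps_enough_samples(left, right, [2019, 2020, 2021], min_sample_periods=2)
```

Under these assumptions, "avg" rewards splits that do well on average, while "max" judges a split only by its worst period. The candidate above is rejected because its right child keeps a single example from 2021.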

Make sure you have a good choice for the time column

Don't simply use a raw timestamp column from the dataset. Discretize it first and make sure every period contains a reasonable number of data points. Example: use the year if you have 3+ years of data. Note that how you discretize time becomes a modeling choice you can optimize.
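As an illustration, a timestamp can be turned into a yearly period column with pandas and checked for sparse periods before fitting. This is a sketch only: the toy data, the "timestamp" column name, and the min_rows_per_period threshold are assumptions made for the example, not part of the package.

```python
import pandas as pd

# Toy data; in practice this would be your training dataframe
df = pd.DataFrame(
    {
        "timestamp": pd.to_datetime(
            [
                "2019-03-01",
                "2019-11-15",
                "2020-01-20",
                "2020-07-04",
                "2021-02-10",
                "2021-09-30",
            ]
        ),
        "x_1": [0.1, 0.4, 0.3, 0.9, 0.5, 0.7],
        "y": [0, 1, 0, 1, 0, 1],
    }
)

# Discretize the raw timestamp into one period per year
df["period"] = df["timestamp"].dt.year

# Count rows per period to spot periods with too little data
period_counts = df["period"].value_counts().sort_index()

min_rows_per_period = 2  # hypothetical threshold, tune it for your data
sparse_periods = period_counts[period_counts < min_rows_per_period].index.tolist()
```

If sparse_periods is not empty, consider a coarser granularity, or merge the sparse periods into their neighbors, before passing the column to the model.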

License

This project is licensed under the terms of the BSD-3 license. See LICENSE for more details.

Citation

@misc{time-robust-forest,
  author = {Moneda, Luis},
  title = {Time Robust Forest model},
  year = {2021},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/lgmoneda/time-robust-forest}}
}

