Automated Time Series Forecasting

Project description

AutoTS

Forecasting Model Selection for Multiple Time Series

AutoML for forecasting with open-source time series implementations.

For other time series needs, check out the list here.

Features

Finds optimal time series forecasting model and data transformations by genetic programming optimization
Handles univariate and multivariate/parallel time series
Point and probabilistic upper/lower bound forecasts for all models
Over twenty available model classes, with tens of thousands of possible hyperparameter configurations
- Includes naive, statistical, machine learning, and deep learning models
- Multiprocessing for univariate models for scalability on multivariate datasets
- Ability to add external regressors
Over thirty time series specific data transformations
- Ability to handle messy data by learning optimal NaN imputation and outlier removal
Allows automatic ensembling of best models
- 'horizontal' ensembling on multivariate series - learning the best model for each series
Multiple cross validation options
- 'seasonal' validation allows forecasts to be optimized for the seasonality of the data
Subsetting and weighting to improve speed and relevance of search on large datasets
- 'constraint' parameter can be used to ensure forecasts don't drift beyond historical boundaries
Option to use one or a combination of metrics for model selection
Import and export of model templates for deployment and greater user customization
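
To make the 'horizontal' ensembling idea above concrete, here is a minimal sketch of picking the best model for each series from a table of validation errors. This is an illustration of the concept only, not AutoTS's internal implementation; the error values, the series names, and the helper best_model_per_series are all hypothetical.

```python
# Hypothetical validation errors (lower is better), keyed as model -> series -> error.
validation_errors = {
    "SeasonalNaive": {"sales": 12.0, "visits": 30.5, "returns": 4.1},
    "ETS": {"sales": 10.2, "visits": 33.0, "returns": 4.8},
    "GLM": {"sales": 11.7, "visits": 28.9, "returns": 5.3},
}


def best_model_per_series(errors):
    """Return {series: model}, choosing the lowest-error model for each series."""
    series_names = next(iter(errors.values())).keys()
    return {
        series: min(errors, key=lambda model: errors[model][series])
        for series in series_names
    }


chosen = best_model_per_series(validation_errors)
print(chosen)
```

In this toy table no single model wins everywhere, which is the situation where a per-series choice can beat selecting one model for the whole dataset.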

Installation

pip install autots

This includes dependencies for basic models, but additional packages are required for some models and methods.

Basic Use

Input data is expected to come in either a long or a wide format:

The wide format is a pandas.DataFrame with a pandas.DatetimeIndex and each column a distinct series.
The long format has three columns:
- Date (ideally already in pandas datetime format, for example converted with pd.to_datetime)
- Series ID. For a single time series, id_col can be None.
- Value
For long data, the column name for each of these is passed to .fit() as date_col, id_col, and value_col. No parameters are needed for wide data.
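
As a rough illustration of how the two shapes relate, the sketch below builds a small long-format frame and pivots it into the wide format using plain pandas. The column names follow the ones used for the sample data in the example that follows ('datetime', 'series_id', 'value'); the values themselves are made up.

```python
import pandas as pd

# Hypothetical long-format data: one row per (date, series, value) observation.
long_df = pd.DataFrame(
    {
        "datetime": pd.to_datetime(
            ["2021-01-01", "2021-01-01", "2021-01-02", "2021-01-02"]
        ),
        "series_id": ["a", "b", "a", "b"],
        "value": [1.0, 10.0, 2.0, 20.0],
    }
)

# Pivot so each series becomes a column and the dates become a DatetimeIndex.
wide_df = long_df.pivot(index="datetime", columns="series_id", values="value")

print(wide_df)
```

Either frame describes the same data; the long one needs the three column names passed to .fit(), while the wide one does not.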

# other sample loaders: load_hourly, load_monthly, load_weekly, load_yearly, load_live_daily
from autots import AutoTS, load_daily

# sample datasets can be used in either of the long or wide import shapes
long = False
df = load_daily(long=long)

model = AutoTS(
    forecast_length=21,
    frequency='infer',
    prediction_interval=0.9,
    ensemble=None,
    model_list="default",
    transformer_list="fast",
    drop_most_recent=1,
    max_generations=4,
    num_validations=2,
    validation_method="backwards"
)
model = model.fit(
    df,
    date_col='datetime' if long else None,
    value_col='value' if long else None,
    id_col='series_id' if long else None,
)

prediction = model.predict()
# plot a sample
prediction.plot(model.df_wide_numeric,
                series=model.df_wide_numeric.columns[0],
                start_date="2019-01-01")
# Print the details of the best model
print(model)

# point forecasts dataframe
forecasts_df = prediction.forecast
# upper and lower forecasts
forecasts_up, forecasts_low = prediction.upper_forecast, prediction.lower_forecast

# accuracy of all tried model results
model_results = model.results()
# and aggregated from cross validation
validation_results = model.results("validation")

The lower-level API, in particular the large section of time series transformers in the scikit-learn style, can also be utilized independently from the AutoML framework.

Check out extended_tutorial.md for a more detailed guide to features!

Also take a look at the production_example.py

Tips for Speed and Large Data:

Use appropriate model lists, especially the predefined lists:
- superfast (simple naive models) and fast (more complex but still faster models)
- fast_parallel (a combination of fast and parallel) or parallel, given many CPU cores are available
  - n_jobs='auto' usually chooses a reasonable value, but adjust as necessary for the environment
- see a dict of predefined lists (some defined for internal use) with from autots.models.model_list import model_lists
Use the subset parameter when there are many similar series; subset=100 will often generalize well for tens of thousands of similar series.
- if using subset, passing weights for series will weight subset selection towards higher priority series.
- if limited by RAM, it can be easily distributed by running multiple instances of AutoTS on different batches of data, having first imported a template pretrained as a starting point for all.
Set model_interrupt=True, which skips the current model when a KeyboardInterrupt (i.e. Ctrl+C) is raised (although if the interrupt falls between generations it will stop the entire training).
Use the result_file parameter of .fit(), which saves progress after each generation. This is helpful when a long training run is being done. Use import_results to recover.
While Transformations are pretty fast, setting transformer_max_depth to a lower number (say, 2) will increase speed. Also utilize transformer_list.
Ensembles are slower to predict because they run several models: 'distance' ensembles are about 2x slower and 'simple' ensembles about 3x-5x slower.
- ensemble='horizontal-max' with model_list='no_shared_fast' can scale relatively well given many cpu cores because each model is only run on the series it is needed for.
Reducing num_validations and models_to_validate will decrease runtime but may lead to poorer model selections.
For datasets with many records, downsampling (for example, from daily to monthly frequency) can reduce training time if appropriate.
- this can be done by adjusting frequency and aggfunc but is probably best done before passing data into AutoTS.
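
The last tip, aggregating to a coarser frequency before passing data into AutoTS, might look like the following in plain pandas. This is a sketch under assumptions: the data is synthetic, the series names are hypothetical, and sum is only one possible aggregation.

```python
import numpy as np
import pandas as pd

# Hypothetical wide-format daily data: 90 days, two series.
index = pd.date_range("2021-01-01", periods=90, freq="D")
daily = pd.DataFrame(
    {"series_a": np.arange(90, dtype=float), "series_b": np.ones(90)},
    index=index,
)

# Aggregate to month-start frequency. Sum suits flow-like data (e.g. sales),
# while mean or last may suit levels (e.g. prices or inventory).
monthly = daily.resample("MS").sum()

print(monthly)
```

Here 90 daily rows collapse to 3 monthly rows. Keep in mind that forecast_length then counts months rather than days.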

How to Contribute:

Give feedback on where you find the documentation confusing
Use AutoTS and...
- Report errors and request features by adding Issues on GitHub
- Post the top model templates for your data (to help improve the starting templates)
- Feel free to recommend different search grid parameters for your favorite models
And, of course, contributing to the codebase directly on GitHub!

Also known as Project CATS (Catlin's Automated Time Series), hence the logo.

Release history

Version	Release date	Notes
0.6.16b1	Oct 15, 2024	pre-release
0.6.15	Jul 22, 2024
0.6.14	May 17, 2024
0.6.13	May 14, 2024
0.6.12	May 6, 2024
0.6.11	Apr 8, 2024
0.6.10	Jan 30, 2024
0.6.9	Jan 22, 2024
0.6.8	Jan 18, 2024
0.6.7	Jan 3, 2024
0.6.6	Dec 20, 2023
0.6.5	Dec 19, 2023
0.6.4	Dec 11, 2023
0.6.3	Nov 28, 2023
0.6.2	Nov 3, 2023
0.6.1	Oct 4, 2023
0.6.0	Aug 7, 2023
0.5.8	Jul 6, 2023
0.5.7	May 23, 2023
0.5.6	Apr 10, 2023
0.5.5	Apr 3, 2023
0.5.4	Feb 2, 2023
0.5.3	Dec 23, 2022
0.5.2	Dec 13, 2022
0.5.1	Nov 15, 2022
0.5.0	Aug 24, 2022
0.4.2	Jun 20, 2022
0.4.1	May 16, 2022
0.4.0	Feb 7, 2022
0.3.13a9	Dec 27, 2021	pre-release
0.3.12	Dec 5, 2021
0.3.11	Nov 29, 2021
0.3.10	Nov 4, 2021
0.3.9	Oct 31, 2021
0.3.8	Oct 17, 2021
0.3.7	Oct 4, 2021
0.3.6	Sep 14, 2021
0.3.5	Aug 30, 2021
0.3.4	Aug 22, 2021	this version
0.3.3	Aug 5, 2021
0.3.2	Jul 1, 2021
0.3.1	Mar 24, 2021
0.3.0	Jan 24, 2021
0.2.8	Dec 13, 2020
0.2.7	Nov 30, 2020
0.2.6	Oct 25, 2020
0.2.5	Oct 7, 2020
0.2.4	Sep 30, 2020
0.2.3	Sep 23, 2020
0.2.3a1	Sep 22, 2020	pre-release
0.2.2	Jun 28, 2020
0.2.2a1	Jun 12, 2020	pre-release
0.2.1	Jun 5, 2020
0.2.0	May 31, 2020
0.2.0a4	May 29, 2020	pre-release
0.2.0a3	May 24, 2020	pre-release
0.2.0a1	May 16, 2020	pre-release
0.1.5	Mar 9, 2020
0.1.2	Feb 20, 2020
0.1.1	Feb 12, 2020
0.1.0	Feb 4, 2020
0.0.3	Jan 11, 2020
0.0.2	Jan 11, 2020

Download files

Source Distribution

AutoTS-0.3.4.tar.gz (393.6 kB)

Uploaded Aug 22, 2021 Source

Built Distribution

AutoTS-0.3.4-py3-none-any.whl (408.8 kB)

Uploaded Aug 22, 2021 Python 3

File details

Details for the file AutoTS-0.3.4.tar.gz.

File metadata

Download URL: AutoTS-0.3.4.tar.gz
Upload date: Aug 22, 2021
Size: 393.6 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.2.0 pkginfo/1.7.0 requests/2.25.1 setuptools/57.4.0 requests-toolbelt/0.9.1 tqdm/4.61.1 CPython/3.8.3

File hashes

Hashes for AutoTS-0.3.4.tar.gz
Algorithm	Hash digest
SHA256	`978b5a15e1d77cb65272b36708bbdc728d08be8fba282e5bc4a8ec4321b02614`
MD5	`623442137385034eeae85a802aec32a9`
BLAKE2b-256	`cf28d1f6be29c007752c0352157cf475a05bb44dae0f30d9a7b7efc986777651`

File details

Details for the file AutoTS-0.3.4-py3-none-any.whl.

File metadata

Download URL: AutoTS-0.3.4-py3-none-any.whl
Upload date: Aug 22, 2021
Size: 408.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.2.0 pkginfo/1.7.0 requests/2.25.1 setuptools/57.4.0 requests-toolbelt/0.9.1 tqdm/4.61.1 CPython/3.8.3

File hashes

Hashes for AutoTS-0.3.4-py3-none-any.whl
Algorithm	Hash digest
SHA256	`ea82abaf5eb4a9b72a40bdd754e04f9e7e52655eda9f87cd1c875c3c04449dce`
MD5	`9d9ea8ead797f16b0f3aa6c425a21085`
BLAKE2b-256	`cf2301903fc22290bfd87f2d291d0ca80b23200c9023f78150b0282040befd17`