fev: Forecast evaluation library

fev

A lightweight library that makes it easy to benchmark time series forecasting models.

  • Extensible: Easy to define your own forecasting tasks and benchmarks.
  • Reproducible: Ensures that the results obtained by different users are comparable.
  • Easy to use: Compatible with most popular forecasting libraries.
  • Minimal dependencies: Just a thin wrapper on top of 🤗datasets.

How is fev different from other benchmarking tools?

Existing forecasting benchmarks usually fall into one of two categories:

  • Standalone datasets without any supporting infrastructure. These provide no guarantees that the results obtained by different users are comparable. For example, changing the start date or the length of the forecast horizon changes what the scores measure, so results computed under different settings cannot be compared.
  • Bespoke end-to-end systems that combine models, datasets and forecasting tasks. Such packages usually come with lots of dependencies and assumptions, which makes extending or integrating these libraries into existing systems difficult.
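
To illustrate the comparability problem from the first bullet, here is a minimal sketch in plain Python (it does not use fev). The helper `naive_mae` and the toy series are hypothetical and exist only for this illustration: the same last-value forecast, scored on the same series, gets very different errors depending on where the forecast window starts.

```python
def naive_mae(series: list[float], cutoff: int, horizon: int) -> float:
    """Mean absolute error of a last-value forecast made at index `cutoff`.

    Hypothetical helper for illustration only; it is not part of fev.
    """
    last_value = series[cutoff - 1]              # last observation before the cutoff
    actual = series[cutoff : cutoff + horizon]   # values inside the forecast window
    return sum(abs(a - last_value) for a in actual) / len(actual)


series = [1.0, 2.0, 3.0, 4.0, 5.0, 5.0, 5.0, 5.0]

# Same series, same model, same horizon; only the start of the window differs.
early = naive_mae(series, cutoff=2, horizon=2)  # window falls in the trending part
late = naive_mae(series, cutoff=6, horizon=2)   # window falls in the flat part
print(early, late)
```

Unless the cutoff and horizon are fixed as part of the task definition, two users reporting "the error on this dataset" may be reporting numbers that measure different things.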

fev aims for the middle ground: it provides the core benchmarking functionality without introducing unnecessary constraints or bloated dependencies. The library supports point and probabilistic forecasting, different types of covariates, and all popular forecasting metrics.

Installation

pip install fev

Quickstart

Create a task from a dataset stored on Hugging Face Hub

import fev

task = fev.Task(
    dataset_path="autogluon/chronos_datasets",
    dataset_config="monash_kdd_cup_2018",
    horizon=12,
)

Load data available as input to the forecasting model

past_data, future_data = task.get_input_data()

  • past_data contains the data observed before the forecast horizon (item ID, past timestamps, target, all covariates).
  • future_data contains the data that is known at prediction time (item ID, future timestamps, and known covariates).
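
The past/future split can be pictured with a small plain-Python sketch. The helper `split_at_horizon` and the field names below are assumptions made for this illustration and are not fev's implementation; the point is only that the target values inside the forecast horizon are withheld from the model.

```python
def split_at_horizon(timestamps: list, target: list, horizon: int) -> tuple[dict, dict]:
    """Split one series into model inputs, holding out the last `horizon` steps.

    Hypothetical helper for illustration only; it is not part of fev.
    """
    if not 0 < horizon < len(target):
        raise ValueError("horizon must be positive and shorter than the series")
    # Past part: everything observed before the forecast horizon.
    past = {"timestamp": timestamps[:-horizon], "target": target[:-horizon]}
    # Future part: timestamps only; the future target stays hidden from the model.
    future = {"timestamp": timestamps[-horizon:]}
    return past, future


past, future = split_at_horizon(
    timestamps=["2024-01-01", "2024-01-02", "2024-01-03", "2024-01-04", "2024-01-05"],
    target=[10.0, 11.0, 12.0, 13.0, 14.0],
    horizon=2,
)
print(past)
print(future)
```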

Make predictions

def naive_forecast(y: list, horizon: int) -> list:
    """Repeat the last observed value for every step of the forecast horizon."""
    return [y[-1] for _ in range(horizon)]

predictions = []
for ts in past_data:
    predictions.append(
        {"predictions": naive_forecast(y=ts[task.target_column], horizon=task.horizon)}
    )
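
Other simple baselines can be written in the same style. As a sketch, a seasonal naive forecast repeats the last observed season instead of the last value. The function name and the `seasonality` argument are choices made for this example; the toy series stands in for one entry of past_data.

```python
def seasonal_naive_forecast(y: list, horizon: int, seasonality: int) -> list:
    """Repeat the last `seasonality` observations until `horizon` steps are filled."""
    if seasonality < 1 or len(y) < seasonality:
        raise ValueError("need at least one full season of history")
    last_season = y[-seasonality:]
    return [last_season[step % seasonality] for step in range(horizon)]


history = [1, 2, 3, 4, 5, 6]
seasonal = seasonal_naive_forecast(history, horizon=5, seasonality=3)
# With seasonality=1 this reduces to repeating the last value.
flat = seasonal_naive_forecast(history, horizon=3, seasonality=1)
print(seasonal, flat)
```

Each forecast can then be wrapped in a {"predictions": ...} dictionary, exactly as in the loop above.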

Get an evaluation summary

task.evaluation_summary(predictions, model_name="naive")
# {'model_name': 'naive',
#  'dataset_name': 'chronos_datasets_monash_kdd_cup_2018',
#  'dataset_path': 'autogluon/chronos_datasets',
#  'dataset_config': 'monash_kdd_cup_2018',
#  'horizon': 12,
#  'cutoff': -12,
#  'lead_time': 1,
#  'min_context_length': 1,
#  'max_context_length': None,
#  'seasonality': 1,
#  'eval_metric': 'MASE',
#  'extra_metrics': [],
#  'quantile_levels': None,
#  'id_column': 'id',
#  'timestamp_column': 'timestamp',
#  'target_column': 'target',
#  'generate_univariate_targets_from': None,
#  'past_dynamic_columns': [],
#  'excluded_columns': [],
#  'test_error': 3.3784518866750513,
#  'training_time_s': None,
#  'inference_time_s': None,
#  'dataset_fingerprint': 'a22d13d4c1e8641c',
#  'trained_on_this_dataset': False,
#  'fev_version': '0.5.0',
#  'MASE': 3.3784518866750513}

The evaluation summary contains all information necessary to uniquely identify the forecasting task.
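
For reference, the default metric in the summary above is MASE (mean absolute scaled error). The sketch below follows the standard textbook definition for a single series: the forecast error is divided by the in-sample error of a seasonal naive forecast. The function `mase` is a hypothetical helper, and how fev aggregates the metric across series is not shown here, so this should not be read as the library's implementation.

```python
def mase(y_past: list[float], y_true: list[float], y_pred: list[float], seasonality: int = 1) -> float:
    """Mean absolute scaled error for one series (textbook definition).

    Hypothetical helper for illustration only; it is not part of fev.
    """
    # Mean absolute error of the forecast over the horizon.
    forecast_error = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)
    # Scale: in-sample mean absolute error of the seasonal naive forecast.
    naive_errors = [
        abs(y_past[i] - y_past[i - seasonality]) for i in range(seasonality, len(y_past))
    ]
    scale = sum(naive_errors) / len(naive_errors)
    return forecast_error / scale


score = mase(y_past=[1.0, 3.0, 5.0, 7.0], y_true=[9.0, 11.0], y_pred=[7.0, 7.0])
print(score)
```

A value above 1 means the forecast was less accurate over the horizon than the one-step naive forecast was on the history.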

Multiple evaluation summaries produced by different models on different tasks can be aggregated into a single table.

# Summaries can be provided as DataFrames, dicts, or JSON/CSV files
summaries = "https://raw.githubusercontent.com/autogluon/fev/refs/heads/main/benchmarks/example/results/results.csv"
fev.leaderboard(summaries)
# | model_name     |   gmean_relative_error |   avg_rank |   avg_inference_time_s |   ... |
# |:---------------|-----------------------:|-----------:|-----------------------:|------:|
# | auto_theta     |                  0.874 |      2     |                  5.501 |   ... |
# | auto_arima     |                  0.887 |      2     |                 21.799 |   ... |
# | auto_ets       |                  0.951 |      2.667 |                  0.737 |   ... |
# | seasonal_naive |                  1     |      3.333 |                  0.004 |   ... |
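
The two aggregate columns can be read as follows. This is a sketch under assumptions, not fev's implementation: gmean_relative_error is taken to be the geometric mean, across tasks, of a model's error divided by a baseline's error (seasonal_naive scores exactly 1 in the table, which suggests it is the baseline), and avg_rank is taken to be the model's mean rank across tasks. The helper names and the toy numbers are hypothetical, and ties are not handled.

```python
import math


def gmean_relative_error(errors: list[float], baseline_errors: list[float]) -> float:
    """Geometric mean of per-task error ratios relative to a baseline."""
    ratios = [e / b for e, b in zip(errors, baseline_errors)]
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


def average_ranks(results: dict[str, list[float]]) -> dict[str, float]:
    """Mean rank of each model across tasks (1 = lowest error; ties not handled)."""
    models = list(results)
    n_tasks = len(next(iter(results.values())))
    totals = {model: 0.0 for model in models}
    for task in range(n_tasks):
        ordered = sorted(models, key=lambda m: results[m][task])
        for rank, model in enumerate(ordered, start=1):
            totals[model] += rank
    return {model: totals[model] / n_tasks for model in models}


# Toy errors for two models on two tasks (made-up numbers).
results = {"model_a": [1.0, 4.0], "baseline": [2.0, 2.0]}
relative = gmean_relative_error(results["model_a"], results["baseline"])
ranks = average_ranks(results)
print(relative, ranks)
```

The geometric mean treats a halving and a doubling of the error as cancelling out, which is why model_a ends up level with the baseline in this toy example.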

Tutorials

  • Quickstart: Define a task and evaluate a model.
  • Datasets: Use fev with your own datasets.
  • Tasks & benchmarks: Advanced features for defining tasks and benchmarks.
  • Models: Evaluate your models and submit results to the leaderboard.

Examples of model implementations compatible with fev are available in examples/.

Leaderboards

We host leaderboards obtained using fev at https://huggingface.co/spaces/autogluon/fev-leaderboard.

Currently, the leaderboard includes the results for Benchmark II, introduced in Chronos: Learning the Language of Time Series. We expect to add more benchmarks in the future.
