
Forecasting Competitions Datasets (M1, M3, Tourism) for Python


fcompdata

Forecasting Competitions Datasets: a Python library for loading time series from the M and Tourism forecasting competitions (M1, M3, M4, Tourism), with an interface similar to R's Mcomp and Tcomp packages.

Installation

pip install fcompdata

Or install directly from GitHub:

pip install git+https://github.com/config-i1/fcompdata

Usage

from fcompdata import M1, M3, Tourism

# Access series by 1-based index (R-style)
series = M3[1]
print(series['x'])    # Training data (numpy array)
print(series['xx'])   # Test data (numpy array)
print(series['h'])    # Forecast horizon
print(series['n'])    # Training data length
print(series['type']) # Series type (yearly, quarterly, monthly, other)

# Attribute access also works
print(series.sn)          # Series name
print(series.description) # Series description

# Filter by frequency type
yearly = M3.subset('yearly')
monthly = M1.subset('monthly')

# Iterate over all series
for series in M3:
    print(series.sn, len(series.x))

# Get series count
print(len(M3))  # 3003
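
As a hedged illustration of how the x, xx, and h fields fit together, the sketch below scores a naive (last-value) forecast against the hold-out data. It uses a hand-made stand-in dictionary with invented numbers rather than a real fcompdata series, and plain lists instead of numpy arrays, so it runs without the package installed. The helpers naive_forecast and mean_absolute_error are illustrative names, not part of fcompdata.

```python
# Stand-in for a loaded series; the values are invented for illustration.
series = {
    "x": [10.0, 12.0, 11.0, 13.0],  # training data
    "xx": [14.0, 12.0],             # test data
    "h": 2,                         # forecast horizon
}


def naive_forecast(train, horizon):
    """Repeat the last training observation `horizon` times."""
    return [train[-1]] * horizon


def mean_absolute_error(actual, forecast):
    """Average absolute difference between actuals and forecasts."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


forecast = naive_forecast(series["x"], series["h"])
mae = mean_absolute_error(series["xx"], forecast)
print(forecast, mae)
```

With a real series, the same two helpers should accept series['x'], series['xx'], and series['h'] directly, since numpy arrays support negative indexing and iteration in the same way.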

M4 Dataset

The M4 competition dataset contains 100,000 time series and is too large to bundle with the package. It must be downloaded separately before use. The data is sourced from the Monash Time Series Forecasting Repository hosted on Zenodo.

Downloading M4 Data

from fcompdata.download import download_m4

# Download all M4 frequencies (~50 MB total, saved to ~/.fcompdata/m4/)
download_m4()

# Or download specific frequencies
download_m4('yearly')     # 23,000 series
download_m4('quarterly')  # 24,000 series
download_m4('monthly')    # 48,000 series
download_m4('weekly')     # 359 series
download_m4('daily')      # 4,227 series
download_m4('hourly')     # 414 series

The data is downloaded once and cached locally in ~/.fcompdata/m4/. Subsequent calls will use the cached files.
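
The download-once behaviour described above follows a common caching pattern. The sketch below is a minimal, generic version of that pattern and is not the package's actual implementation: fetch_once, fake_download, and the file name yearly.bin are hypothetical, and a temporary directory stands in for ~/.fcompdata/m4/.

```python
import tempfile
from pathlib import Path


def fetch_once(target, fetch):
    """Return `target`, calling `fetch()` to create it only when it is missing."""
    target = Path(target)
    if not target.exists():
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(fetch())
    return target


calls = []


def fake_download():
    # Hypothetical stand-in for a network request.
    calls.append(1)
    return b"series data"


with tempfile.TemporaryDirectory() as tmp:
    cache_file = Path(tmp) / "m4" / "yearly.bin"
    first = fetch_once(cache_file, fake_download)   # creates the file
    second = fetch_once(cache_file, fake_download)  # reuses the cached file
    cached_bytes = second.read_bytes()

download_count = len(calls)
print(download_count)
```

The second call finds the file on disk and skips the download, which is the behaviour the cache directory is meant to provide.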

Using M4 Data

from fcompdata import M4, load_m4

# Access the full M4 collection (requires all frequencies to be downloaded)
series = M4[1]  # First series (1-based index)

# Load a specific frequency
yearly = load_m4('yearly')
monthly = load_m4('monthly')

# Same interface as other datasets
print(series.x)       # Training data
print(series.xx)      # Test data
print(series.h)       # Forecast horizon
print(series.type)    # 'yearly', 'quarterly', etc.

# Iterate over the series of a single frequency
for s in yearly:
    print(s.sn, len(s.x))

M4 Download Sources

The M4 data files are downloaded from the Monash Time Series Forecasting Repository on Zenodo:

Frequency   Zenodo Record               Horizon
Yearly      zenodo.org/record/4656379   6
Quarterly   zenodo.org/record/4656410   8
Monthly     zenodo.org/record/4656480   18
Weekly      zenodo.org/record/4656522   13
Daily       zenodo.org/record/4656548   14
Hourly      zenodo.org/record/4656589   48
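
The horizons in the table determine how many observations are held out for testing. As a rough sketch, assuming you want to apply the same convention to a series of your own, the code below holds out the last h observations by frequency. M4_HORIZONS and split_by_horizon are hypothetical names for illustration and are not part of fcompdata; the data is invented.

```python
# Forecast horizons per M4 frequency, copied from the table above.
M4_HORIZONS = {
    "yearly": 6,
    "quarterly": 8,
    "monthly": 18,
    "weekly": 13,
    "daily": 14,
    "hourly": 48,
}


def split_by_horizon(values, frequency):
    """Hold out the last h observations, where h depends on the frequency."""
    h = M4_HORIZONS[frequency]
    if len(values) <= h:
        raise ValueError("series must be longer than the forecast horizon")
    return values[:-h], values[-h:]


full_series = list(range(20))  # invented data
train, test = split_by_horizon(full_series, "yearly")
print(len(train), len(test))
```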

Cache Management

from fcompdata.download import clear_cache, get_m4_path

# Check if a frequency is downloaded
path = get_m4_path('yearly')  # Returns Path or None

# Clear all downloaded data
clear_cache()

# Clear only M4 data
clear_cache('m4')

Datasets

Bundled Datasets

These datasets are included with the package and available immediately:

Dataset   Series   Yearly   Quarterly   Monthly   Other
M1        1,001    181      203         617       -
M3        3,003    645      756         1,428     174
Tourism   1,311    518      427         366       -

Downloadable Datasets

These datasets require downloading before use:

Dataset   Series    Yearly   Quarterly   Monthly   Weekly   Daily   Hourly
M4        100,000   23,000   24,000      48,000    359      4,227   414
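
The per-frequency counts in both tables add up to the stated totals. The short check below, using a hypothetical COUNTS dictionary filled in from the tables, can serve as a reference when comparing against len(M3) or a similar call after loading a dataset.

```python
# Series counts per frequency, copied from the two tables above.
COUNTS = {
    "M1": {"yearly": 181, "quarterly": 203, "monthly": 617},
    "M3": {"yearly": 645, "quarterly": 756, "monthly": 1428, "other": 174},
    "Tourism": {"yearly": 518, "quarterly": 427, "monthly": 366},
    "M4": {
        "yearly": 23000,
        "quarterly": 24000,
        "monthly": 48000,
        "weekly": 359,
        "daily": 4227,
        "hourly": 414,
    },
}

# Sum the per-frequency counts for each dataset.
totals = {name: sum(by_freq.values()) for name, by_freq in COUNTS.items()}
print(totals)
```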

Series Attributes

Each MCompSeries object has the following attributes:

Attribute     Type            Description
sn            str             Series name/identifier
x             numpy.ndarray   Training data (in-sample)
xx            numpy.ndarray   Test data (out-of-sample)
h             int             Forecast horizon
n             int             Length of training data
period        int             Seasonal period (1, 4, or 12)
type          str             Series type (yearly/quarterly/monthly/other)
description   str             Series description
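
To make the dual access style concrete (series['x'] and series.x), here is a minimal sketch of a container that exposes the attributes from the table both ways. SeriesSketch is a hypothetical stand-in written for illustration; it is not the library's MCompSeries class, and it uses plain lists where the real object holds numpy arrays.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SeriesSketch:
    """Hypothetical stand-in mirroring the attribute table; not the library's class."""

    sn: str
    x: List[float]
    xx: List[float]
    h: int
    period: int
    type: str
    description: str = ""

    @property
    def n(self) -> int:
        # Length of the training data.
        return len(self.x)

    def __getitem__(self, key: str):
        # Allow series['x'] in addition to series.x.
        try:
            return getattr(self, key)
        except AttributeError:
            raise KeyError(key) from None


demo = SeriesSketch(sn="Y1", x=[1.0, 2.0, 3.0], xx=[4.0], h=1, period=1, type="yearly")
print(demo["n"], demo["x"] is demo.x)
```

Routing item access through getattr keeps a single source of truth for each field, so the two access styles cannot drift apart.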

Data Sources

The time series data in this package was imported from the following sources:

  • Mcomp (M1 and M3 data): Hyndman, R.J. (2024). Mcomp: Data from the M-Competitions. R package, available on CRAN and GitHub.
  • Tcomp (Tourism data): Ellis, P. (2016). Tcomp: Data from the 2010 Tourism Forecasting Competition. R package, available on CRAN and GitHub.
  • Monash Time Series Forecasting Repository (M4 data): forecastingdata.org, hosted on Zenodo

References

The datasets originate from the following forecasting competitions and archives:

M1 Competition:

Makridakis, S., Andersen, A., Carbone, R., Fildes, R., Hibon, M., Lewandowski, R., Newton, J., Parzen, E., & Winkler, R. (1982). The accuracy of extrapolation (time series) methods: Results of a forecasting competition. Journal of Forecasting, 1(2), 111–153. doi:10.1002/for.3980010202

M3 Competition:

Makridakis, S., & Hibon, M. (2000). The M3-Competition: Results, conclusions and implications. International Journal of Forecasting, 16(4), 451–476. doi:10.1016/S0169-2070(00)00057-1

M4 Competition:

Makridakis, S., Spiliotis, E., & Assimakopoulos, V. (2020). The M4 Competition: 100,000 time series and 61 forecasting methods. International Journal of Forecasting, 36(1), 54–74. doi:10.1016/j.ijforecast.2019.04.014

Tourism Forecasting Competition:

Athanasopoulos, G., Hyndman, R.J., Song, H., & Wu, D.C. (2011). The tourism forecasting competition. International Journal of Forecasting, 27(3), 822–844. doi:10.1016/j.ijforecast.2010.11.005

Monash Time Series Forecasting Archive:

Godahewa, R., Bergmeir, C., Webb, G.I., Hyndman, R.J., & Montero-Manso, P. (2021). Monash Time Series Forecasting Archive. Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks (NeurIPS Datasets and Benchmarks 2021). arXiv:2105.06643

License

LGPL-3.0-or-later

