
CrossFormer for multivariate time series forecasting


Crossformer

What is it?

crossformer is a Python package for multivariate time series forecasting. The original idea comes from the paper Crossformer: A Transformer Model for Multivariate Time Series Forecasting. The package is designed to be easy to use and modular, so you can extend it to suit your needs. It is built on the Lightning framework to reduce boilerplate code.

Key Features

  • Transformer based model
  • Data Processing
  • Model Training & Evaluation
  • Model Inference

Installation

The source code can be found on GitHub at https://github.com/Sedimark/Surrey_AI.

Binary installers for the latest released version are available on PyPI at https://pypi.org/project/crossformer/.

To install the package, you can use pip:

pip install crossformer

Getting Started

This package is designed to be easy to use and modular, and it builds on the Lightning framework to reduce boilerplate code. There are three key parts to be aware of when using it: configurations, data preparation, and model training and evaluation. The following sections cover each in turn.

Configurations

Configuration files set the parameters for the model, the data and the experiment. The package receives them as a dict, so you can store the configuration in whichever file format you prefer. The basic structure of the configuration dict is as follows:

cfg = {
    "data_dim": 8,              # number of features
    "in_len": 24,               # input time length
    "out_len": 24,              # output time length
    "seg_len": 2,               # segment length
    "window_size": 4,           # window size for segment merge
    "factor": 10,               # scaling factor (reduce the computation)
    "model_dim": 256,           # the hidden model dimension
    "feedforward_dim": 512,     # feedforward dimension
    "head_num": 4,              # number of attention heads
    "layer_num": 6,             # number of layers
    "dropout": 0.2,             # dropout rate
    "baseline": False,          # whether to use baseline
    "learning_rate": 0.1,       # learning rate
    "batch_size": 8,            # batch size
    "split": [
        0.7,
        0.2,
        0.1
    ],                          # split ratio for train, validation and test
    "seed": 2024,               # random seed
    "accelerator": "auto",      # accelerator for training (e.g. "gpu", "cpu", "tpu")
    "min_epochs": 1,            # minimum number of epochs
    "max_epochs": 200,          # maximum number of epochs
    "precision": 32,            # precision for training (e.g. 16, 32)
    "patience": 5,              # patience for early stopping
    "num_workers": 31,          # number of workers for data loading
}

You can modify and add the parameters according to your needs.
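
Because the package only needs a dict, the configuration can live in any file format that parses into one. The sketch below assumes a JSON file and uses only the standard library. The helper names load_config and check_config are hypothetical and not part of crossformer, and the checks reflect only what the comments in the sample configuration imply (three split ratios that sum to 1, positive integer lengths).

```python
import json
import tempfile
from pathlib import Path


def check_config(cfg):
    """Hypothetical sanity checks; these are not part of the crossformer package."""
    split = cfg["split"]
    # The sample configuration lists three ratios: train, validation and test.
    if len(split) != 3 or abs(sum(split) - 1.0) > 1e-6:
        raise ValueError("split must hold three ratios that sum to 1")
    for key in ("data_dim", "in_len", "out_len", "seg_len"):
        if not isinstance(cfg[key], int) or cfg[key] <= 0:
            raise ValueError(f"{key} must be a positive integer")


def load_config(path):
    """Read a JSON configuration file and return it as a plain dict."""
    with open(path, "r", encoding="utf-8") as fh:
        cfg = json.load(fh)
    check_config(cfg)
    return cfg


# Round trip: write a reduced config to a temporary file, then load it back.
example = {
    "data_dim": 8,
    "in_len": 24,
    "out_len": 24,
    "seg_len": 2,
    "split": [0.7, 0.2, 0.1],
}
with tempfile.TemporaryDirectory() as tmp:
    cfg_path = Path(tmp) / "config.json"
    cfg_path.write_text(json.dumps(example), encoding="utf-8")
    loaded_cfg = load_config(cfg_path)
```

The same pattern should carry over to YAML or TOML, as long as the parsed result is a dict with the keys shown above.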

Data Preparation

The package provides a data interface for loading data into the model (trainer), so you can pass your data to it directly as a pandas.DataFrame. Currently, only 2D data (pandas.DataFrame) is supported. The data should be values-only, excluding timestamps, column names and other metadata.

Here is an example of randomly generated data:

import pandas as pd
import numpy as np

sample_df = pd.DataFrame(np.random.rand(400, 8)) # randomly generated data 400 time steps and 8 features
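
Before passing data to the package, it can help to check that its shape fits the configuration. The sketch below is plain arithmetic, not the package's own logic. It assumes that each forecasting sample is a window of in_len input steps followed by out_len target steps, and that windows slide one step at a time; the package may window or split the data differently. The function names are hypothetical.

```python
def window_count(n_rows, in_len, out_len):
    """Number of stride-1 windows of in_len inputs followed by out_len targets."""
    return max(n_rows - (in_len + out_len) + 1, 0)


def check_data_against_config(n_rows, n_cols, cfg):
    """Raise if the data shape cannot work with cfg; return the window count."""
    if n_cols != cfg["data_dim"]:
        raise ValueError(
            f"data has {n_cols} features but data_dim is {cfg['data_dim']}"
        )
    n_windows = window_count(n_rows, cfg["in_len"], cfg["out_len"])
    if n_windows == 0:
        raise ValueError("data is shorter than in_len + out_len")
    return n_windows


# The random sample above has 400 time steps and 8 features.
cfg_subset = {"data_dim": 8, "in_len": 24, "out_len": 24}
n_samples = check_data_against_config(400, 8, cfg_subset)
print(n_samples)  # 353, i.e. 400 - (24 + 24) + 1
```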

The data is assumed to be a values-only pandas.DataFrame. If you are unsure about your data format, follow the generated sample above. Your configuration must also be compatible with your data; for example, data_dim must equal the number of features (columns) in your data.

Here is an example of how to load your data:

from crossformer.data_tools import DataInterface

dm = DataInterface(df, **cfg) # df is the data (pandas.DataFrame) and cfg is the configuration dict
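
Real datasets usually arrive with a timestamp column and named features, so they need a small conversion to reach the values-only form described above. The following is a minimal sketch with pandas. The raw table, its column names and the helper to_values_only are made up for illustration and are not part of crossformer. Rejecting missing values is an extra precaution chosen here, not a documented requirement of the package.

```python
import numpy as np
import pandas as pd

# Hypothetical raw table: one timestamp column plus 8 named feature columns.
rng = np.random.default_rng(0)
raw_df = pd.DataFrame(
    rng.random((400, 8)),
    columns=[f"sensor_{i}" for i in range(8)],
)
raw_df.insert(0, "timestamp", pd.date_range("2024-01-01", periods=400, freq="D"))


def to_values_only(df, data_dim):
    """Keep numeric columns only and drop the column names."""
    numeric = df.select_dtypes(include="number")
    if numeric.shape[1] != data_dim:
        raise ValueError(
            f"expected {data_dim} numeric columns, found {numeric.shape[1]}"
        )
    if numeric.isna().any().any():
        raise ValueError("data contains missing values")
    # Rebuilding from the raw array resets the column labels to 0..data_dim-1.
    return pd.DataFrame(numeric.to_numpy())


values_df = to_values_only(raw_df, data_dim=8)
print(values_df.shape)  # (400, 8)
```

The resulting values_df has the same layout as the randomly generated sample and can be passed to DataInterface in place of df.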

Model Training & Evaluation

The package is built on the Lightning framework to reduce boilerplate code. If you are not familiar with Lightning, please check the Lightning documentation for more information. Below is a very simple example of how to use the package for training and evaluation.

We assume that you have already installed the package and gone through the above sections.

import numpy as np
import pandas as pd
from crossformer.data_tools import DataInterface
from crossformer.model.crossformer import CrossFormer
from lightning.pytorch import Trainer

# load the configuration file or use the sample cfg dict above

# generate random data and initialize the data interface
sample_df = pd.DataFrame(np.random.rand(400, 8))
dm = DataInterface(sample_df, **cfg)

# initialize the model
model = CrossFormer(**cfg)

# fit the model
trainer = Trainer(
    accelerator=cfg["accelerator"],
    precision=cfg["precision"],
    min_epochs=cfg["min_epochs"],
    max_epochs=cfg["max_epochs"],
    check_val_every_n_epoch=1,
    fast_dev_run=False,
)
trainer.fit(model, datamodule=dm)

# evaluate the model
trainer.test(model, datamodule=dm)

Additional Information

We also provide some wrapper scripts for using the package. If you are interested in these, please refer to the GitHub repository for more information.

Acknowledgement

This software has been developed by the University of Surrey under the SEDIMARK (SEcure Decentralised Intelligent Data MARKetplace) project. SEDIMARK is funded by the European Union under the Horizon Europe framework programme [grant no. 101070074]. This project is also partly funded by UK Research and Innovation (UKRI) under the UK government’s Horizon Europe funding guarantee [grant no. 10043699].
