A transformer-based model for time series forecasting inspired by modern attention mechanisms

Temporal: Transformer-Based Time Series Forecasting

A PyTorch implementation of a transformer-based model for time series forecasting, inspired by modern attention-based approaches.

Overview

Temporal is a transformer model for time series forecasting built on the self-attention mechanism introduced in "Attention Is All You Need" (Vaswani et al., 2017). Unlike language models, it is designed and trained specifically to minimize forecasting error on time series data.

Key Features

  • Self-Attention Mechanism: Captures complex temporal dependencies and patterns
  • Encoder-Decoder Architecture: Multi-layer transformer with residual connections and layer normalization
  • Flexible: Supports both univariate and multivariate time series
  • Scalable: Can handle various sequence lengths and forecasting horizons
  • Autoregressive Generation: Inference mode for multi-step ahead forecasting

Architecture

The Temporal model consists of:

  1. Input Embedding: Projects time series data to model dimension
  2. Positional Encoding: Captures temporal order (sinusoidal or learnable)
  3. Encoder Stack: Multiple layers of self-attention and feed-forward networks
  4. Decoder Stack: Multiple layers with self-attention, cross-attention, and feed-forward networks
  5. Output Projection: Maps decoder output to forecasting window dimension

Architecture Diagram

graph TD
    A[Input Time Series<br/>batch, lookback, features] --> B[Input Embedding<br/>Linear: features → d_model]
    B --> C[Positional Encoding<br/>Add temporal position info]
    C --> D[Encoder Stack<br/>6 layers]
    D --> E[Encoder Output<br/>batch, lookback, d_model]

    F[Decoder Input<br/>Previous predictions] --> G[Input Embedding<br/>Linear: features → d_model]
    G --> H[Positional Encoding]
    H --> I[Decoder Stack<br/>6 layers]
    E --> I
    I --> J[Decoder Output<br/>batch, horizon, d_model]
    J --> K[Output Projection<br/>Linear: d_model → features]
    K --> L[Forecast<br/>batch, horizon, features]

    style A fill:#e1f5ff
    style L fill:#e1ffe1
    style D fill:#fff4e1
    style I fill:#ffe1f5

Each layer includes (a minimal sketch follows this list):

  • Multi-head self-attention
  • Residual connections
  • Layer normalization
  • Feed-forward networks with GELU activation
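
For intuition, here is a minimal sketch of one such layer in plain PyTorch. This is illustrative only; Temporal's actual layer implementation may differ in details such as normalization placement.

import torch
import torch.nn as nn

class EncoderLayerSketch(nn.Module):
    """One transformer encoder layer: self-attention and a feed-forward
    network, each wrapped in a residual connection plus layer norm."""

    def __init__(self, d_model=256, num_heads=8, d_ff=1024, dropout=0.1):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, num_heads,
                                          dropout=dropout, batch_first=True)
        self.ff = nn.Sequential(
            nn.Linear(d_model, d_ff),
            nn.GELU(),                 # GELU activation, as described above
            nn.Dropout(dropout),
            nn.Linear(d_ff, d_model),
        )
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.drop = nn.Dropout(dropout)

    def forward(self, x):              # x: (batch, seq_len, d_model)
        attn_out, _ = self.attn(x, x, x)
        x = self.norm1(x + self.drop(attn_out))    # residual + layer norm
        x = self.norm2(x + self.drop(self.ff(x)))  # residual + layer norm
        return x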

For more diagrams, see DIAGRAMS.md, which provides complete visual documentation of:

  • Encoder/Decoder architecture
  • Multi-head attention mechanism
  • Training and inference flows
  • Data pipeline
  • Component interactions

Installation

From PyPI

pip install temporal-forecasting

With HuggingFace Support

pip install temporal-forecasting[huggingface]

This adds support for:

  • Uploading models to HuggingFace Hub
  • Downloading models from HuggingFace Hub
  • HuggingFace ecosystem integration

With Data Fetching Support

pip install temporal-forecasting[data]

This adds support for:

  • Fetching stock prices from Yahoo Finance
  • Fetching cryptocurrency data (Bitcoin, Ethereum, etc.)
  • Downloading datasets from Kaggle
  • Technical indicators (SMA, RSI, MACD, Bollinger Bands)
  • Data preprocessing utilities
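
For example, once the extra is installed, fetching two years of Bitcoin price history looks like this (using the fetch_crypto_data helper shown in the Reference Implementations section below):

from temporal.data_sources import fetch_crypto_data

# BTC-USD price history, returned as an array-like of
# shape (num_samples, num_features)
data = fetch_crypto_data('BTC-USD', period='2y')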

From Source

git clone https://github.com/OptimalMatch/temporal.git
cd temporal
pip install -r requirements.txt
pip install -e .

Requirements

  • Python >= 3.8
  • PyTorch >= 2.0.0
  • NumPy >= 1.20.0
  • tqdm >= 4.60.0
  • matplotlib >= 3.3.0

Optional Dependencies

  • HuggingFace: transformers>=4.30.0, huggingface-hub>=0.16.0
  • Data Fetching: yfinance>=0.2.0, pandas>=1.3.0, scikit-learn>=1.0.0, kagglehub>=0.2.0

Quick Start

Basic Usage

import torch
from temporal import Temporal

# Create model
model = Temporal(
    input_dim=1,           # Univariate time series
    d_model=256,           # Model dimension
    num_encoder_layers=4,  # Number of encoder layers
    num_decoder_layers=4,  # Number of decoder layers
    num_heads=8,           # Attention heads
    d_ff=1024,             # Feed-forward dimension
    forecast_horizon=24,   # Predict 24 steps ahead
    dropout=0.1
)

# Input: (batch_size, sequence_length, input_dim)
x = torch.randn(32, 96, 1)

# Generate forecast
forecast = model.forecast(x)  # (32, 24, 1)

Training Example

from temporal import Temporal
from temporal.trainer import TimeSeriesDataset, TemporalTrainer
from torch.utils.data import DataLoader
import torch

# Prepare your data
train_data = ...  # Shape: (num_samples, num_features)

# Create dataset
dataset = TimeSeriesDataset(
    train_data,
    lookback=96,
    forecast_horizon=24,
    stride=1
)

# Create data loader
train_loader = DataLoader(dataset, batch_size=32, shuffle=True)

# Create model
model = Temporal(
    input_dim=train_data.shape[1],
    d_model=256,
    num_encoder_layers=4,
    num_decoder_layers=4,
    num_heads=8,
    d_ff=1024,
    forecast_horizon=24
)

# Create optimizer
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

# Create trainer
trainer = TemporalTrainer(
    model=model,
    optimizer=optimizer,
    criterion=torch.nn.MSELoss()
)

# Train
history = trainer.fit(
    train_loader=train_loader,
    num_epochs=100,
    early_stopping_patience=10,
    save_path="best_model.pt"  # Automatically saves best model
)

Saving and Loading Models

# Save trained model
torch.save(model.state_dict(), 'temporal_model.pt')

# Load model for inference (recreate with the same hyperparameters used in training)
model = Temporal(input_dim=1, forecast_horizon=24)
model.load_state_dict(torch.load('temporal_model.pt'))
model.eval()

# Make predictions
forecast = model.forecast(x)
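
For resuming training, a common PyTorch pattern is to checkpoint the optimizer state and model hyperparameters alongside the weights. A sketch (the dictionary keys here are illustrative, not a fixed Temporal format):

# Save a complete training checkpoint (key names are illustrative)
torch.save({
    'model_state_dict': model.state_dict(),
    'optimizer_state_dict': optimizer.state_dict(),
    'config': {'input_dim': 1, 'forecast_horizon': 24},
}, 'temporal_checkpoint.pt')

# Restore everything later
checkpoint = torch.load('temporal_checkpoint.pt')
model = Temporal(**checkpoint['config'])
model.load_state_dict(checkpoint['model_state_dict'])
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
optimizer.load_state_dict(checkpoint['optimizer_state_dict'])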

For a complete guide to model persistence, see MODEL_PERSISTENCE.md.

Examples

Univariate Time Series

See examples/basic_usage.py for a complete example with synthetic data:

cd examples
python basic_usage.py

This will:

  • Generate synthetic time series data
  • Train a Temporal model
  • Generate forecasts
  • Visualize results

Multivariate Time Series

See examples/multivariate_example.py for forecasting multiple correlated features:

cd examples
python multivariate_example.py

Model Persistence

See examples/model_persistence_example.py for saving and loading trained models:

cd examples
python model_persistence_example.py

This demonstrates:

  • Training and saving a model with all components
  • Loading saved models for inference
  • Production-ready model deployment

HuggingFace Integration

See examples/huggingface_example.py for HuggingFace Hub integration:

cd examples
python huggingface_example.py

This demonstrates:

  • Creating HuggingFace-compatible models
  • Saving in HuggingFace format
  • Loading from HuggingFace Hub
  • Uploading models to share with the community

For a complete guide, see HUGGINGFACE_INTEGRATION.md.

Stock Price Forecasting

See examples/stock_forecasting.py for real stock data forecasting:

cd examples
python stock_forecasting.py

This demonstrates:

  • Fetching stock data from Yahoo Finance
  • Training on Apple (AAPL) stock prices
  • 5-day price forecasting
  • Model evaluation and visualization

Cryptocurrency Forecasting

See examples/crypto_forecasting.py for Bitcoin and crypto forecasting:

cd examples
python crypto_forecasting.py

This demonstrates:

  • Fetching Bitcoin data
  • Training on cryptocurrency prices
  • 7-day price forecasting
  • Multi-crypto comparison

For a complete guide to data fetching, see DATA_SOURCES.md.

Reference Implementations

The following projects demonstrate real-world applications built using the Temporal forecasting library. These implementations showcase how to integrate Temporal into production systems and can serve as templates for your own projects.

Temporal Trading Agents

Repository: github.com/OptimalMatch/temporal-trading-agents

A trading system that combines deep learning time series forecasting with ensemble methods and multi-strategy consensus voting to predict market movements and generate trading signals.

Features

  • Multi-Horizon Forecasting: Separate ensembles for 3-day, 7-day, 14-day, and 21-day predictions
  • Ensemble Learning: Combines 5-8 models per time horizon with confidence quantification
  • 8-Strategy Consensus System: Analyzes predictions using gradient analysis, confidence weighting, volatility sizing, momentum, swing trading, risk-adjusted metrics, mean reversion, and multi-timeframe alignment
  • Production-Ready Platform: React dashboard with FastAPI backend, MongoDB, and Docker deployment
  • Risk Management: Dynamic position sizing, VaR calculations, and Sortino ratio analysis

Using Temporal in Your Project

Add Temporal to your requirements.txt:

temporal-forecasting>=0.3.1

Example usage from the trading agents implementation:

from temporal import Temporal, TemporalTrainer, TimeSeriesDataset
from temporal.data_sources import fetch_crypto_data
from torch.utils.data import DataLoader
import torch

# Fetch cryptocurrency data
data = fetch_crypto_data('BTC-USD', period='2y')

# Create and train ensemble of models for different horizons
horizons = [3, 7, 14, 21]  # days
models = {}

for horizon in horizons:
    # Prepare dataset
    dataset = TimeSeriesDataset(
        data,
        lookback=96,
        forecast_horizon=horizon * 24,  # Convert days to hours
        stride=1
    )
    train_loader = DataLoader(dataset, batch_size=32, shuffle=True)

    # Create model
    model = Temporal(
        input_dim=data.shape[1],
        d_model=256,
        num_encoder_layers=4,
        num_decoder_layers=4,
        num_heads=8,
        forecast_horizon=horizon * 24
    )

    # Train model
    trainer = TemporalTrainer(model, optimizer=torch.optim.AdamW(model.parameters()))
    history = trainer.fit(train_loader, num_epochs=100)

    models[f'{horizon}d'] = model

# Generate multi-horizon forecasts
forecasts = {}
for horizon, model in models.items():
    # recent_data: latest lookback window, shape (1, lookback, features)
    forecast = model.forecast(recent_data)
    forecasts[horizon] = forecast

# Use forecasts for trading strategy consensus voting
# (See temporal-trading-agents for full strategy implementation)

Learn More

  • Documentation: See the temporal-trading-agents README
  • Live Demo: Follow the Docker setup instructions for a complete trading dashboard
  • Strategies: Review the 8-strategy consensus voting system for signal generation

Contributing Your Implementation

Have you built something with Temporal? We'd love to feature your project! Submit a pull request adding your implementation to this section, including:

  • Project description and repository link
  • Key features and use cases
  • Code example showing Temporal integration
  • Any unique approaches or optimizations

Model Configuration

Parameters

| Parameter | Description | Default |
|---|---|---|
| input_dim | Number of input features | 1 |
| d_model | Model dimension | 512 |
| num_encoder_layers | Number of encoder layers | 6 |
| num_decoder_layers | Number of decoder layers | 6 |
| num_heads | Number of attention heads | 8 |
| d_ff | Feed-forward dimension | 2048 |
| forecast_horizon | Number of steps to forecast | 24 |
| max_seq_len | Maximum sequence length | 5000 |
| dropout | Dropout probability | 0.1 |
| use_learnable_pe | Use learnable positional encoding | False |

Recommended Configurations

Small Model (Fast training, lower accuracy):

model = Temporal(
    d_model=128,
    num_encoder_layers=2,
    num_decoder_layers=2,
    num_heads=4,
    d_ff=512
)

Medium Model (Balanced):

model = Temporal(
    d_model=256,
    num_encoder_layers=4,
    num_decoder_layers=4,
    num_heads=8,
    d_ff=1024
)

Large Model (Best accuracy, slower training):

model = Temporal(
    d_model=512,
    num_encoder_layers=6,
    num_decoder_layers=6,
    num_heads=16,
    d_ff=2048
)

Training Tips

  1. Learning Rate: Start with 1e-4 and use a scheduler (e.g., ReduceLROnPlateau)
  2. Batch Size: Use the largest batch size that fits in memory (typically 32-128)
  3. Gradient Clipping: Clip gradient norms (0.5-1.0) to prevent exploding gradients (tips 1 and 3 are sketched in code after this list)
  4. Early Stopping: Monitor validation loss and stop when it plateaus
  5. Data Normalization: Normalize your data (e.g., with StandardScaler) before training
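
A sketch of tips 1 and 3 in a manual training loop. This is illustrative: TemporalTrainer handles much of this for you, the teacher-forced forward call assumes the signature listed under API Reference, and model, train_loader, val_loader, num_epochs, and evaluate() come from your own code.

import torch

criterion = torch.nn.MSELoss()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)   # Tip 1: start at 1e-4
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode='min', factor=0.5, patience=5)

for epoch in range(num_epochs):
    model.train()
    for x, y in train_loader:
        optimizer.zero_grad()
        pred = model(x, tgt=y)          # teacher-forced forward pass (assumed)
        loss = criterion(pred, y)
        loss.backward()
        # Tip 3: clip the gradient norm to stabilize training
        torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
        optimizer.step()
    val_loss = evaluate(model, val_loader)  # your own validation routine
    scheduler.step(val_loss)                # reduce LR when val loss plateaus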

Architecture Details

Multi-Head Attention

The model uses scaled dot-product attention:

Attention(Q, K, V) = softmax(QK^T / √d_k)V

Multiple attention heads allow the model to attend to different aspects of the time series simultaneously.
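
In code, the formula above is only a few lines. A self-contained sketch (illustrative; not the library's internal implementation):

import math
import torch

def scaled_dot_product_attention(q, k, v, mask=None):
    # q, k, v: (batch, heads, seq_len, d_k)
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float('-inf'))
    weights = torch.softmax(scores, dim=-1)  # one distribution over keys per query
    return weights @ v                       # weighted sum of values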

Positional Encoding

Two types of positional encoding are available (the sinusoidal variant is sketched below):

  1. Sinusoidal (default): Fixed sinusoidal functions
  2. Learnable: Learned embeddings for each position
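
The sinusoidal variant follows the standard formulation from the Transformer paper; a minimal sketch (assumes an even d_model):

import math
import torch

def sinusoidal_positional_encoding(max_len, d_model):
    position = torch.arange(max_len).unsqueeze(1)            # (max_len, 1)
    div_term = torch.exp(torch.arange(0, d_model, 2)
                         * (-math.log(10000.0) / d_model))   # one frequency per dim pair
    pe = torch.zeros(max_len, d_model)
    pe[:, 0::2] = torch.sin(position * div_term)             # even dimensions
    pe[:, 1::2] = torch.cos(position * div_term)             # odd dimensions
    return pe   # added to the embedded inputs before the encoder/decoder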

Autoregressive Generation

During inference, the model generates forecasts autoregressively (see the sketch after this list):

  • Start with the last observed value
  • Generate next step prediction
  • Use prediction as input for next step
  • Repeat for entire forecast horizon
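
In code, the loop looks roughly like this. This is a sketch: model.forecast() wraps this logic internally, and the exact decoder seeding may differ.

import torch

@torch.no_grad()
def autoregressive_forecast(model, x, horizon):
    # x: (batch, lookback, features) of observed history
    model.eval()
    decoder_input = x[:, -1:, :]           # seed with the last observed value
    steps = []
    for _ in range(horizon):
        out = model(x, tgt=decoder_input)  # forward pass (signature per API Reference)
        step = out[:, -1:, :]              # take the newest prediction
        steps.append(step)
        decoder_input = torch.cat([decoder_input, step], dim=1)  # feed it back in
    return torch.cat(steps, dim=1)         # (batch, horizon, features)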

Modern Time Series Transformers

Temporal implements a transformer architecture similar to modern approaches in time series forecasting:

| Feature | Modern Approaches | Temporal |
|---|---|---|
| Architecture | Transformer | Transformer |
| Attention | Multi-head | Multi-head |
| Layers | Encoder-Decoder | Encoder-Decoder |
| Training | Large-scale pre-training | User-provided data |
| Flexibility | Fixed models | Fully customizable |

Performance

Performance varies by dataset and configuration. Typical metrics on benchmark datasets:

  • MSE: 0.01-0.1 (normalized data)
  • MAE: 0.05-0.3 (normalized data)
  • Training Time: 1-10 minutes per epoch (depending on model size, dataset size, and hardware)

API Reference

Temporal

Main model class for time series forecasting; a usage sketch follows the method list.

Methods:

  • forward(src, tgt=None, src_mask=None, tgt_mask=None): Forward pass
  • forecast(x, horizon=None): Generate forecasts
  • generate_causal_mask(size): Create causal attention mask
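
A hedged usage sketch based only on the signatures above (mask semantics are assumed to follow standard transformer practice):

import torch
from temporal import Temporal

model = Temporal(input_dim=1, forecast_horizon=24)
src = torch.randn(8, 96, 1)   # encoder input: (batch, lookback, features)
tgt = torch.randn(8, 24, 1)   # decoder input, e.g. shifted targets during training

# Causal mask so each decoder position only attends to earlier positions
tgt_mask = model.generate_causal_mask(tgt.size(1))
out = model(src, tgt=tgt, tgt_mask=tgt_mask)

# For inference, forecast() handles autoregressive generation internally
with torch.no_grad():
    preds = model.forecast(src, horizon=24)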

TemporalTrainer

Training utilities for Temporal models.

Methods:

  • train_epoch(dataloader): Train for one epoch
  • validate(dataloader): Validate the model
  • fit(train_loader, val_loader, num_epochs, ...): Full training loop
  • predict(dataloader): Generate predictions

TimeSeriesDataset

Dataset class for time series data.

Parameters:

  • data: Time series data array
  • lookback: Number of historical steps
  • forecast_horizon: Number of future steps
  • stride: Stride for sliding window
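
For intuition, a minimal sliding-window dataset equivalent in spirit to TimeSeriesDataset (the shipped class may differ in details):

import torch
from torch.utils.data import Dataset

class SlidingWindowSketch(Dataset):
    """Cuts (lookback, horizon) window pairs out of a
    (num_samples, num_features) array with a given stride."""

    def __init__(self, data, lookback, forecast_horizon, stride=1):
        self.data = torch.as_tensor(data, dtype=torch.float32)
        self.lookback = lookback
        self.horizon = forecast_horizon
        self.stride = stride

    def __len__(self):
        usable = len(self.data) - self.lookback - self.horizon
        return max(0, usable // self.stride + 1)

    def __getitem__(self, idx):
        start = idx * self.stride
        x = self.data[start : start + self.lookback]   # history window
        y = self.data[start + self.lookback :
                      start + self.lookback + self.horizon]  # future window
        return x, y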

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Important: By contributing to this project, you agree to the terms of the Contributor Assignment Agreement (CAA), which assigns copyright of your contributions to Unidatum Integrated Products LLC. Please include the CAA statement in your pull request.

License

This project is licensed under the GNU General Public License v3.0 (GPLv3) - see the LICENSE file for details.

Copyright (C) 2025 Unidatum Integrated Products LLC

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

Commercial Licensing

Companies that cannot release their source code under the GPLv3 may purchase a commercial license from Unidatum Integrated Products LLC. A commercial license grants the right to use this software in closed-source, proprietary projects without the requirement to disclose source code.

For commercial licensing inquiries, please contact: licensing@unidatum.com

Patents

This software is subject to a pending patent application:

"Transformer-based Time Series Forecasting System and Method," U.S. Patent Application No. 63/910,189 (filed November 3, 2025).

The patent covers specific methods and systems related to transformer-based time series forecasting. Use of this software under the GPLv3 license includes the patent license provisions specified in Section 11 of the GPLv3. For more information, see the PATENTS file.

Citation

If you use this code in your research, please cite:

@software{temporal2024,
  title = {Temporal: Transformer-Based Time Series Forecasting},
  year = {2024},
  note = {A PyTorch implementation of transformer architecture for time series},
  url = {https://github.com/OptimalMatch/temporal}
}

References

  • Vaswani, A., et al. "Attention Is All You Need." Advances in Neural Information Processing Systems, 2017.
  • Modern transformer-based time series forecasting approaches

Acknowledgments

This implementation is inspired by modern transformer architectures for time series forecasting and the original Transformer paper.
