
Complete time series forecasting solution: Standard, Intermittent & New Product forecasting with evaluation framework


Forecaster-AI 🚀

Enterprise-grade time series forecasting with advanced ML models, automated feature engineering, and production MLOps



📦 Installation

pip install forecaster-ai

Requirements: Python 3.9+


✨ What's New in v0.5.3

🎉 Complete Advanced Forecasting Suite + Production MLOps!

Phase 1: Core Improvements ✅

  • Visualization Fixes - Robust plotting with error handling
  • Enhanced Validation - Comprehensive data quality checks
  • Better Error Messages - Clear, actionable feedback

Phase 2: Automated Feature Engineering ✅

  • TimeSeriesFeatureEngineer - Automated lag, rolling, date, Fourier features
  • Auto-Detection - Intelligent feature selection based on data patterns
  • Built-in Sample Data - 4 datasets with exogenous variables (no external files!)

Phase 3: Advanced Forecasting Models ✅

  • Probabilistic Forecasting - Quantile predictions, prediction intervals
  • Hierarchical Forecasting - Bottom-up, top-down, middle-out reconciliation
  • Multi-Step Strategies - Direct, recursive, DirRec, MIMO approaches
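The strategies above differ in how a one-step model is extended to a multi-step horizon: recursive feeds a single model its own predictions, while direct fits a separate model per step. The contrast can be sketched with a toy AR(1) in plain Python (illustrative only, not the library's implementation):

```python
# Recursive vs. direct multi-step forecasting, sketched with a
# no-intercept AR(1) (plain Python, independent of forecaster-ai).

def fit_ar1(series):
    """Least-squares slope for y[t] = phi * y[t-1]."""
    num = sum(series[t] * series[t - 1] for t in range(1, len(series)))
    den = sum(series[t - 1] ** 2 for t in range(1, len(series)))
    return num / den

def recursive_forecast(series, horizon):
    """One model, fed its own predictions step by step."""
    phi = fit_ar1(series)
    preds, last = [], series[-1]
    for _ in range(horizon):
        last = phi * last
        preds.append(last)
    return preds

def direct_forecast(series, horizon):
    """One model per step: regress y[t+h] directly on y[t]."""
    preds = []
    for h in range(1, horizon + 1):
        num = sum(series[t + h] * series[t] for t in range(len(series) - h))
        den = sum(series[t] ** 2 for t in range(len(series) - h))
        preds.append((num / den) * series[-1])
    return preds

series = [1.0, 0.9, 0.82, 0.74, 0.66, 0.60, 0.54, 0.48]
print(recursive_forecast(series, 3))
print(direct_forecast(series, 3))
```

Recursive compounds one-step errors over the horizon; direct avoids that at the cost of fitting h models. DirRec and MIMO are hybrids of these two ideas.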

Phase 4: Model Explainability ✅

  • SHAP Integration - Feature importance and model interpretation
  • What-If Analysis - Scenario testing and sensitivity analysis
  • Visual Explanations - Interactive plots and dashboards

Phase 5: Production MLOps ✅

  • A/B Testing Framework - Statistical model comparison
  • Auto-Retraining - Scheduled and drift-triggered retraining
  • Model Registry - Versioning, metadata, and lifecycle management
  • Industry Alignment - Designed around best practices from major MLOps frameworks
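Drift-triggered retraining boils down to watching a recent error metric and retraining when it crosses a threshold. A minimal sketch of that check (function names are illustrative, not the forecaster-ai API):

```python
# Hypothetical drift check behind auto-retraining: retrain when the
# MAPE over a recent window exceeds a threshold.

def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(
        abs(a - p) / abs(a) for a, p in zip(actual, predicted)
    ) / len(actual)

def needs_retraining(actual, predicted, window=30, threshold=15.0):
    """True if MAPE over the most recent `window` points exceeds threshold."""
    return mape(actual[-window:], predicted[-window:]) > threshold

actual = [100, 102, 98, 105, 110]
predicted = [99, 101, 99, 104, 130]   # last prediction drifted badly
print(needs_retraining(actual, predicted, window=5, threshold=4.0))  # → True
```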

🎯 Quick Start

Option 1: Use Built-in Sample Data (Recommended!)

from forecasting.data import load_retail_sales
from forecasting.models import ARIMAForecaster
from forecasting.core.config import ForecastConfig, PreprocessingConfig

# 1. Load built-in data (no external files needed!)
sales, exog = load_retail_sales(n_periods=365, include_exog=True)

# 2. Configure with correct parameters
config = ForecastConfig(
    horizon=30,
    confidence_level=0.95,
    frequency='D',
    preprocessing=PreprocessingConfig(
        handle_missing='interpolate',
        enable_decomposition=True,  # ✅ CORRECT parameter name
        decomposition_method='stl'
    )
)

# 3. Train model (no trend parameter when d>0)
model = ARIMAForecaster(config, order=(1, 1, 1))
model.fit(sales[:300], X=exog[:300])

# 4. Predict
forecast, conf_int = model.predict(horizon=30, X=exog[300:330])

print("✓ Forecast complete!")
print(f"Forecast shape: {forecast.shape}")

Option 2: Use Your Own Data

import pandas as pd
from forecasting.models import ARIMAForecaster
from forecasting.core.config import ForecastConfig

# Your time series data
data = pd.Series([100, 105, 110, 108, 115, 120, 118, 125, 130, 128])

# Configure forecast
config = ForecastConfig(
    horizon=5,              # Forecast 5 periods ahead
    confidence_level=0.95   # 95% confidence intervals
)

# Create and fit model
forecaster = ARIMAForecaster(config, order=(2, 1, 2))
forecaster.fit(data)

# Generate forecast
forecast, conf_int = forecaster.predict()

print("Forecast:", forecast)
print("Confidence Intervals:", conf_int)

📚 Complete Model Guide

1. ARIMA Forecaster

Best for: Stationary time series, short-term forecasts

from forecasting.models import ARIMAForecaster
from forecasting.core.config import ForecastConfig

config = ForecastConfig(horizon=30)

# Manual parameter specification
forecaster = ARIMAForecaster(
    config=config,
    order=(2, 1, 2),              # (p, d, q)
    seasonal_order=(1, 1, 1, 7)   # (P, D, Q, s) - weekly seasonality
    # ⚠️ Don't use 'trend' parameter when d > 0
)

forecaster.fit(data)
forecast, conf_int = forecaster.predict(horizon=30)

# Get model metrics
metrics = forecaster.get_validation_metrics()
print(f"AIC: {metrics['aic']}, BIC: {metrics['bic']}")

2. Auto-ARIMA Forecaster

Best for: Automatic parameter selection, exploratory analysis

from forecasting.models import AutoARIMAForecaster

config = ForecastConfig(horizon=30)

# Automatic parameter selection
forecaster = AutoARIMAForecaster(
    config=config,
    seasonal=True,
    m=7,                    # Seasonal period (7 for weekly)
    max_p=5,                # Max AR order
    max_q=5,                # Max MA order
    max_d=2,                # Max differencing
    information_criterion='aic',
    stepwise=True           # Faster search
)

forecaster.fit(data)
forecast, conf_int = forecaster.predict()

# See selected parameters
metrics = forecaster.get_validation_metrics()
print(f"Selected order: {metrics['order']}")
print(f"Seasonal order: {metrics['seasonal_order']}")

3. Prophet Forecaster

Best for: Daily/weekly data with strong seasonality, holidays

from forecasting.models import ProphetForecaster

config = ForecastConfig(horizon=90)

forecaster = ProphetForecaster(
    config=config,
    growth='linear',                    # or 'logistic'
    changepoint_prior_scale=0.05,       # Trend flexibility
    seasonality_prior_scale=10.0,       # Seasonality strength
    seasonality_mode='additive',        # or 'multiplicative'
    yearly_seasonality='auto',
    weekly_seasonality='auto',
    daily_seasonality=False
)

# ⚠️ IMPORTANT: Add custom seasonality and holidays BEFORE calling fit()
forecaster.add_seasonality(
    name='monthly',
    period=30.5,
    fourier_order=5
)
forecaster.add_country_holidays('US')

# Now fit the model and forecast
forecaster.fit(data)
forecast, conf_int = forecaster.predict(horizon=90)

4. LSTM Forecaster

Best for: Complex patterns, long sequences, multivariate data

from forecasting.models import LSTMForecaster

config = ForecastConfig(horizon=30, random_seed=42)

forecaster = LSTMForecaster(
    config=config,
    lookback=30,                # Use last 30 points
    hidden_size=64,             # LSTM hidden units
    num_layers=2,               # Number of LSTM layers
    dropout=0.2,                # Dropout rate
    use_attention=True,         # Attention mechanism
    learning_rate=0.001,
    batch_size=32,
    epochs=100,
    early_stopping_patience=10
)

# Fit with validation split
forecaster.fit(data, validation_split=0.2)

# Generate forecast
forecast, conf_int = forecaster.predict(horizon=30)

# Check training metrics
metrics = forecaster.get_validation_metrics()
print(f"Final validation loss: {metrics['final_val_loss']}")
print(f"Epochs trained: {metrics['epochs_trained']}")

# Plot training history
fig = forecaster.plot_training_history()

5. Ensemble Forecaster

Best for: Combining multiple models for better accuracy

from forecasting.models import (
    EnsembleForecaster,
    ARIMAForecaster,
    ProphetForecaster
)

config = ForecastConfig(horizon=30)

# Create individual models
arima = ARIMAForecaster(config, order=(2, 1, 2))
prophet = ProphetForecaster(config)

# Create ensemble
ensemble = EnsembleForecaster(
    config=config,
    models=[arima, prophet],
    weights=[0.6, 0.4],      # 60% ARIMA, 40% Prophet
    aggregation='weighted'    # 'mean', 'median', or 'weighted'
)

# Fit all models
ensemble.fit(data)

# Generate ensemble forecast
forecast, conf_int = ensemble.predict()

# Check model weights
weights = ensemble.get_model_weights()
print(weights)

Automated Feature Engineering (NEW!)

from forecasting.data import TimeSeriesFeatureEngineer

# Create feature engineer
engineer = TimeSeriesFeatureEngineer(
    lag_features=[1, 7, 14, 30],           # Lag periods
    rolling_features=['mean', 'std', 'min', 'max'],
    rolling_windows=[7, 14, 30],           # Rolling windows
    date_features=True,                     # Day, month, quarter, etc.
    fourier_features=True,                  # Seasonality features
    fourier_order=5
)

# Generate features
features_df = engineer.fit_transform(data)

# Use with any model
from forecasting.models import LSTMForecaster
model = LSTMForecaster(config)
model.fit(features_df['target'], X=features_df.drop('target', axis=1))

Built-in Sample Datasets (NEW!)

from forecasting.data import (
    load_retail_sales,           # Retail sales with 9 exog variables
    load_intermittent_demand,    # Sparse demand patterns
    load_hierarchical_data,      # Multi-level hierarchy
    load_multivariate_series     # Multiple related series
)

# No external files needed!
sales, exog = load_retail_sales(n_periods=365, include_exog=True)
print(f"Sales shape: {sales.shape}")
print(f"Exogenous variables: {exog.columns.tolist()}")

🔧 Advanced Features

Time Series Decomposition

from forecasting.data.preprocessors import TimeSeriesDecomposer

# STL Decomposition
decomposer = TimeSeriesDecomposer(method='stl', period=7)
trend, seasonal, residual = decomposer.fit_transform(data)

# Classical Decomposition
decomposer = TimeSeriesDecomposer(method='classical', period=12)
components = decomposer.fit_transform(data)

# Reconstruct original series
reconstructed = decomposer.inverse_transform(trend, seasonal, residual)

Data Preprocessing

from forecasting.data.preprocessors import TimeSeriesPreprocessor

preprocessor = TimeSeriesPreprocessor(
    handle_missing='interpolate',
    handle_outliers=True,
    outlier_method='iqr',
    normalize=True,
    enable_decomposition=True,      # ✅ CORRECT: enable_decomposition
    decomposition_method='stl'
)

# Preprocess data
processed_data = preprocessor.fit_transform(data)

# Inverse transform predictions
original_scale = preprocessor.inverse_transform(predictions)

Intermittent Demand Forecasting

from forecasting.data.special_cases import IntermittentDemandHandler

# For sparse/intermittent data (many zeros)
handler = IntermittentDemandHandler(method='sba')  # or 'croston', 'tsb'
handler.fit(sparse_data)
forecast = handler.predict(horizon=12)
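Croston-family methods treat an intermittent series as two smoothing problems: non-zero demand sizes and the intervals between them, with the per-period forecast being their ratio (SBA and TSB add corrections to this). A sketch of the classic Croston update, which may differ from IntermittentDemandHandler's internals:

```python
def croston(demand, alpha=0.1):
    """Classic Croston: exponentially smooth non-zero demand sizes and
    inter-demand intervals separately; forecast = size / interval."""
    size = interval = None
    periods_since = 1
    for d in demand:
        if d > 0:
            if size is None:            # first non-zero demand
                size, interval = d, periods_since
            else:
                size = alpha * d + (1 - alpha) * size
                interval = alpha * periods_since + (1 - alpha) * interval
            periods_since = 1
        else:
            periods_since += 1
    return size / interval

sparse = [0, 0, 3, 0, 0, 0, 2, 0, 4, 0]
print(round(croston(sparse, alpha=0.2), 3))   # → 1.027
```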

New Product Forecasting

from forecasting.data.special_cases import NewProductHandler

# For products with little/no history
handler = NewProductHandler(method='bootstrap')
handler.fit(similar_products_data)
forecast = handler.predict(horizon=12)

Data Validation

from forecasting.data.validators import TimeSeriesValidator

validator = TimeSeriesValidator()

# Validate data quality
is_valid, errors = validator.validate(data)
if not is_valid:
    print("Validation errors:", errors)

# Check stationarity
is_stationary, p_value = validator.check_stationarity(data)
print(f"Stationary: {is_stationary}, p-value: {p_value}")

# Detect outliers
outliers = validator.detect_outliers(data, method='iqr')
print(f"Found {len(outliers)} outliers")

# Check seasonality
has_seasonality, period = validator.detect_seasonality(data)
print(f"Seasonality: {has_seasonality}, period: {period}")
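For reference, the 'iqr' rule flags points outside [Q1 - 1.5·IQR, Q3 + 1.5·IQR]. A standalone version in plain Python (the validator's exact quantile handling may differ):

```python
def iqr_outliers(values, k=1.5):
    """Indices of points outside [Q1 - k*IQR, Q3 + k*IQR]."""
    s = sorted(values)
    n = len(s)

    def quantile(q):
        # Linear interpolation between closest ranks
        pos = q * (n - 1)
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        return s[lo] + (pos - lo) * (s[hi] - s[lo])

    q1, q3 = quantile(0.25), quantile(0.75)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [i for i, v in enumerate(values) if v < lower or v > upper]

data = [10, 11, 9, 10, 12, 11, 50, 10, 9, 11]
print(iqr_outliers(data))   # → [6], the spike at index 6
```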

📊 Model Comparison Example

import numpy as np
from forecasting.models import ARIMAForecaster, ProphetForecaster, LSTMForecaster
from forecasting.core.config import ForecastConfig

# Split data
train_data = data[:-30]
test_data = data[-30:]

config = ForecastConfig(horizon=30)

# Train multiple models
models = {
    'ARIMA': ARIMAForecaster(config, order=(2, 1, 2)),
    'Prophet': ProphetForecaster(config),
    'LSTM': LSTMForecaster(config, lookback=30, epochs=50)
}

results = {}
for name, model in models.items():
    model.fit(train_data)
    forecast, _ = model.predict()
    mae = np.mean(np.abs(forecast.values - test_data.values))
    results[name] = mae
    print(f"{name} MAE: {mae:.2f}")

# Find best model
best_model = min(results, key=results.get)
print(f"\nBest model: {best_model}")

💾 Model Persistence

# Save model
forecaster.save_model('my_model.pkl')

# Load model
from forecasting.models import ARIMAForecaster
loaded_forecaster = ARIMAForecaster.load_model('my_model.pkl')

# Use loaded model
forecast, conf_int = loaded_forecaster.predict(horizon=30)

🎨 Visualization

import matplotlib.pyplot as plt

# Plot forecast
plt.figure(figsize=(12, 6))
plt.plot(data.index, data.values, label='Historical', color='blue')
plt.plot(forecast.index, forecast.values, label='Forecast', color='red')
plt.fill_between(
    conf_int.index,
    conf_int['lower'],
    conf_int['upper'],
    alpha=0.3,
    color='red',
    label='95% CI'
)
plt.legend()
plt.title('Time Series Forecast')
plt.xlabel('Date')
plt.ylabel('Value')
plt.grid(True)
plt.show()

📖 Configuration Options

from forecasting.core.config import ForecastConfig, PreprocessingConfig

# ✅ CORRECT Configuration
config = ForecastConfig(
    horizon=30,                    # Forecast periods
    confidence_level=0.95,         # Confidence interval level
    frequency='D',                 # Data frequency
    random_seed=42,                # Reproducibility
    preprocessing=PreprocessingConfig(
        handle_missing='interpolate',
        handle_outliers=True,
        normalize=True,
        enable_decomposition=True,      # ✅ CORRECT: enable_decomposition
        decomposition_method='stl',
        seasonal_period=7
    )
)

⚠️ Common Configuration Errors

WRONG:

PreprocessingConfig(decompose=True)  # ❌ This parameter doesn't exist!

CORRECT:

PreprocessingConfig(enable_decomposition=True)  # ✅ Use this instead!

See PARAMETER_NAME_FIX.md for complete parameter reference.


🚀 Performance Tips

1. For Large Datasets

# Use Auto-ARIMA with stepwise search
forecaster = AutoARIMAForecaster(config, stepwise=True, max_p=3, max_q=3)

2. For Fast Training

# Reduce LSTM epochs and use early stopping
forecaster = LSTMForecaster(
    config,
    epochs=50,
    early_stopping_patience=5,
    batch_size=64
)

3. For Better Accuracy

# Use ensemble with multiple models
ensemble = EnsembleForecaster(
    config,
    models=[arima, prophet, lstm],
    aggregation='weighted'
)

🔍 Troubleshooting

Issue: "Model not converging"

# Solution: Adjust parameters or preprocess data
from forecasting.data.preprocessors import TimeSeriesPreprocessor

preprocessor = TimeSeriesPreprocessor(
    normalize=True,
    handle_outliers=True
)
clean_data = preprocessor.fit_transform(data)

Issue: "Poor forecast accuracy"

# Solution: Try ensemble or different model
ensemble = EnsembleForecaster(
    config,
    models=[model1, model2, model3],
    aggregation='mean'
)

Issue: "LSTM training too slow"

# Solution: Reduce complexity or use GPU
forecaster = LSTMForecaster(
    config,
    hidden_size=32,      # Reduce from 64
    num_layers=1,        # Reduce from 2
    epochs=30,           # Reduce from 100
    device='cuda'        # Use GPU if available
)

📦 Dependencies

Core dependencies (automatically installed):

  • numpy>=1.24.0
  • pandas>=2.0.0
  • scikit-learn>=1.3.0
  • statsmodels>=0.14.0
  • torch>=2.0.0
  • prophet>=1.1.0
  • pmdarima>=2.0.0

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.


📄 License

This project is licensed under the MIT License.


📧 Contact

Author: Surya Tripathi
Email: suryaec1099@gmail.com




📊 Evaluation Framework (New in v0.4.0)

Calculate All Metrics

from forecasting.evaluation import ForecastMetrics

# Calculate comprehensive metrics
calculator = ForecastMetrics(
    actual=test_data.values,
    predicted=forecast.values,
    train_data=train_data.values
)

# Get all metrics at once
metrics = calculator.calculate_all(seasonal_period=7)
print(calculator.summary())

# Output:
# MAE: 5.23
# RMSE: 7.45
# MAPE: 8.12%
# SMAPE: 7.89%
# MASE: 0.85 (< 1 means better than naive forecast!)
# R²: 0.92
# Directional Accuracy: 85.5%
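MASE divides the forecast MAE by the in-sample MAE of a one-step naive (previous-value) forecast, which is why values below 1 mean the model beats the naive baseline. The arithmetic in plain Python (not the library's implementation):

```python
def mase(actual, predicted, train):
    """MAE of the forecast divided by the in-sample MAE of the naive
    (previous-value) forecast on the training data."""
    mae = sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)
    naive_mae = sum(
        abs(train[t] - train[t - 1]) for t in range(1, len(train))
    ) / (len(train) - 1)
    return mae / naive_mae

train = [100, 102, 101, 104, 103, 106]
actual = [107, 108]
predicted = [106, 109]
print(mase(actual, predicted, train))   # → 0.5: half the naive error
```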

Backtest Your Model

from forecasting.evaluation import walk_forward_validation

# Walk-forward validation (expanding window)
results = walk_forward_validation(
    model_factory=lambda: ARIMAForecaster(config, order=(2,1,2)),
    data=data,
    initial_train_size=100,  # Start with 100 samples
    test_size=10,            # Test on 10 samples each fold
    step_size=5,             # Move forward 5 samples
    verbose=True
)

# View results
print(results.summary())
df = results.to_dataframe()  # Convert to DataFrame for analysis

Cross-Validate

from forecasting.evaluation import cross_val_score, TimeSeriesSplit

# Time series cross-validation (no data leakage!)
cv = TimeSeriesSplit(n_splits=5)
scores = cross_val_score(
    model_factory=lambda: ARIMAForecaster(config, order=(2,1,2)),
    data=data,
    cv=cv,
    scoring='mae',
    verbose=True
)

print(f"CV MAE: {scores.mean():.4f} (+/- {scores.std():.4f})")
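The no-leakage guarantee comes from always training on a prefix of the series and testing on the block immediately after it. An expanding-window splitter can be sketched as follows (illustrative; TimeSeriesSplit's exact fold sizes may differ):

```python
def expanding_window_splits(n_samples, n_splits):
    """Yield (train_indices, test_indices) pairs where each test block
    follows all of its training data, so no future values leak in."""
    test_size = n_samples // (n_splits + 1)
    for i in range(1, n_splits + 1):
        train_end = test_size * i
        yield (list(range(train_end)),
               list(range(train_end, train_end + test_size)))

for train_idx, test_idx in expanding_window_splits(12, 3):
    print(len(train_idx), test_idx)
# Each fold trains on a longer prefix and tests on the next block.
```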

Compare Multiple Models

from forecasting.evaluation import compare_models

# Train multiple models
arima_forecast, _ = arima_model.predict(horizon=30)
prophet_forecast, _ = prophet_model.predict(horizon=30)
lstm_forecast, _ = lstm_model.predict(horizon=30)

# Compare them
comparison = compare_models(
    actual=test_data,
    predictions={
        'ARIMA': arima_forecast,
        'Prophet': prophet_forecast,
        'LSTM': lstm_forecast
    },
    train_data=train_data,
    seasonal_period=7
)

print(comparison)
# Shows MAE, RMSE, MAPE, MASE for each model, sorted by MAE

🚀 Production Deployment

Complete Workflow

# 1. Train
from forecasting.models import ARIMAForecaster
from forecasting.core.config import ForecastConfig

config = ForecastConfig(horizon=30, confidence_level=0.95)
model = ARIMAForecaster(config, order=(2,1,2))
model.fit(train_data)

# 2. Evaluate
from forecasting.evaluation import walk_forward_validation

results = walk_forward_validation(
    model_factory=lambda: ARIMAForecaster(config, order=(2,1,2)),
    data=data,
    initial_train_size=100,
    test_size=10
)

# Check if acceptable
agg_metrics = results.aggregate_metrics()
if agg_metrics['mean_mape'] < 10:  # 10% threshold
    print("✓ Model ready for production")
else:
    print("✗ Model needs improvement")

# 3. Save
model.save_model('production_model.pkl')

# 4. Deploy (FastAPI example)
from fastapi import FastAPI
app = FastAPI()

@app.post("/forecast")
def forecast_endpoint(horizon: int = 30):
    model = ARIMAForecaster.load_model('production_model.pkl')
    forecast, conf_int = model.predict(horizon=horizon)
    return {
        "forecast": forecast.tolist(),
        "lower_bound": conf_int['lower'].tolist(),
        "upper_bound": conf_int['upper'].tolist()
    }

# 5. Monitor
from forecasting.evaluation import ForecastMetrics

calculator = ForecastMetrics(actual_data, forecast_data)
metrics = calculator.calculate_all()
if metrics['mape'] > 15:  # Performance degraded
    print("⚠️ Retrain recommended")

See DEPLOYMENT_GUIDE.md for complete deployment documentation.

