SVD-based time series imputation with uncertainty estimation

SVD Time Series Imputer

A Python package for time series imputation using Singular Value Decomposition (SVD), with automatic rank estimation, uncertainty quantification, and a scikit-learn compatible API.

📦 Now available on PyPI: pip install svd-imputer

Installation
Quick Start
Usage
Examples
API Reference
Requirements

Installation

PyPI (Recommended):

pip install svd-imputer

From Source (development version):

git clone https://github.com/rhugman/svd_imputer.git
cd svd_imputer
pip install -e .

With Development Dependencies:

pip install -e ".[dev]"

Quick Start

import pandas as pd
import numpy as np
from svd_imputer import Imputer

# Load your time series data (with datetime index)
df = pd.read_csv("your_data.csv", index_col=0, parse_dates=True)

# Simple imputation with automatic rank estimation
imputer = Imputer(data=df, variance_threshold=0.95)
df_imputed = imputer.fit_transform()

# With uncertainty estimation  
df_imputed, uncertainty = imputer.fit_transform(return_uncertainty=True)
print(f"RMSE: {uncertainty['rmse']:.3f} ± {uncertainty['rmse_std']:.3f}")

Note: The Imputer class uses a data-centric design where data is provided at initialization and preprocessed once. This ensures consistency across all analyses and eliminates redundant preprocessing operations.
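
To try the Quick Start without a CSV file, a small synthetic dataset can stand in for your_data.csv. The sketch below uses only numpy and pandas; the helper name make_synthetic_series and its parameters are illustrative and not part of svd_imputer.

```python
import numpy as np
import pandas as pd

def make_synthetic_series(n_rows=200, n_cols=5, missing_frac=0.1, seed=0):
    """Build a correlated multi-column daily time series with random gaps."""
    rng = np.random.default_rng(seed)
    index = pd.date_range("2024-01-01", periods=n_rows, freq="D")
    t = np.arange(n_rows)
    # Two shared signals make the columns correlated (approximately low rank).
    signals = np.column_stack([np.sin(2 * np.pi * t / 50), t / n_rows])
    weights = rng.normal(size=(2, n_cols))
    values = signals @ weights + 0.05 * rng.normal(size=(n_rows, n_cols))
    # Knock out a random fraction of entries to mimic missing observations.
    values[rng.random(values.shape) < missing_frac] = np.nan
    columns = [f"series_{i}" for i in range(n_cols)]
    return pd.DataFrame(values, index=index, columns=columns)

df = make_synthetic_series()
print(df.isna().mean().round(2))  # fraction of missing values per column
```

The resulting df has a datetime index and NaN gaps, so it can be passed as the data argument in the Quick Start in place of the CSV.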

Usage

from svd_imputer import Imputer

# Basic imputation (automatic rank estimation)
imputer = Imputer(data=df, variance_threshold=0.95)
df_imputed = imputer.fit_transform()

# Cross-validation optimization
imputer = Imputer(data=df, rank="auto")
imputer.fit()
print(f"Optimized rank: {imputer.rank_}")

# With uncertainty estimation
df_imputed, uncertainty = imputer.fit_transform(return_uncertainty=True)
print(f"RMSE: {uncertainty['rmse']:.3f} ± {uncertainty['rmse_std']:.3f}")

# Advanced: model diagnostics
residuals, stats = imputer.calculate_reconstruction_residuals(return_stats=True)
print(f"Reconstruction R²: {stats['r_squared']:.3f}")
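
The rmse and rmse_std values printed above are described under How It Works as coming from Monte Carlo validation with masking. The package's exact procedure is not reproduced here. As a rough illustration of the idea, the sketch below repeatedly hides observed entries at random, imputes them, and summarizes the error. The function names are hypothetical, a column-mean imputer stands in for the SVD imputer, and only random masking (not temporal masking) is shown.

```python
import numpy as np

def column_mean_impute(X):
    """Stand-in imputer: replace NaNs with each column's observed mean."""
    filled = X.copy()
    rows, cols = np.where(np.isnan(filled))
    filled[rows, cols] = np.nanmean(X, axis=0)[cols]
    return filled

def monte_carlo_rmse(X, impute, n_trials=20, mask_frac=0.1, seed=0):
    """Hide observed entries at random, impute them, and score the error."""
    rng = np.random.default_rng(seed)
    observed = np.argwhere(~np.isnan(X))
    n_hidden = max(1, int(mask_frac * len(observed)))
    scores = []
    for _ in range(n_trials):
        picked = observed[rng.choice(len(observed), size=n_hidden, replace=False)]
        rows, cols = picked[:, 0], picked[:, 1]
        trial = X.copy()
        trial[rows, cols] = np.nan          # pretend these values are missing
        filled = impute(trial)
        errors = filled[rows, cols] - X[rows, cols]
        scores.append(np.sqrt(np.mean(errors**2)))
    # Mean and spread of the per-trial RMSE values.
    return {"rmse": float(np.mean(scores)), "rmse_std": float(np.std(scores))}

rng = np.random.default_rng(1)
X = rng.normal(size=(60, 4))
X[rng.random(X.shape) < 0.05] = np.nan
result = monte_carlo_rmse(X, column_mean_impute)
print(f"RMSE: {result['rmse']:.3f} ± {result['rmse_std']:.3f}")
```

Because the hidden entries have known true values, the error is measured directly; the spread across trials indicates how sensitive the estimate is to which entries happen to be missing.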

Configuration

imputer = Imputer(
    data=df,                  # Input DataFrame (required)
    variance_threshold=0.95,  # Variance threshold for auto rank estimation
    rank=None,                # None (auto-estimate), int (fixed), or "auto" (optimize)
    max_iters=500,            # Maximum SVD iterations
    tol=1e-4,                 # Convergence tolerance
    verbose=True,             # Enable logging output
)

Examples

Complete examples are available in the examples/ directory:

basic_example.ipynb - Basic usage and quick start tutorial
augmented_example.ipynb - Extended examples with data augmentation features

How It Works

The algorithm performs iterative SVD imputation with automatic rank estimation:

Preprocessing: Data validation, standardization, and missing value handling
Rank Estimation: Variance threshold, cross-validation, or fixed rank
SVD Imputation: Iterative low-rank approximation until convergence
Uncertainty Estimation: Monte Carlo validation with temporal or random masking
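
The preprocessing, rank estimation, and imputation steps above can be sketched in plain numpy. This is a minimal illustration under stated assumptions, not the package's implementation: the function names are hypothetical, the rank is chosen once from a mean-filled copy of the data, and the parameter names variance_threshold, max_iters, and tol simply mirror the configuration options.

```python
import numpy as np

def rank_from_variance(singular_values, variance_threshold=0.95):
    """Smallest rank whose components explain the requested variance share."""
    explained = np.cumsum(singular_values**2) / np.sum(singular_values**2)
    rank = int(np.searchsorted(explained, variance_threshold) + 1)
    return min(rank, len(singular_values))

def iterative_svd_impute(X, variance_threshold=0.95, max_iters=500, tol=1e-4):
    """Fill NaNs in a 2-D array by repeated low-rank reconstruction."""
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    # Standardize each column using its observed values only.
    mean = np.nanmean(X, axis=0)
    std = np.nanstd(X, axis=0)
    std[std == 0] = 1.0
    Z = (X - mean) / std
    Z[missing] = 0.0  # initial guess: the column mean
    # Assumption of this sketch: estimate the rank once from the initial fill.
    rank = rank_from_variance(np.linalg.svd(Z, compute_uv=False), variance_threshold)
    for _ in range(max_iters):
        U, s, Vt = np.linalg.svd(Z, full_matrices=False)
        low_rank = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        change = np.linalg.norm(low_rank[missing] - Z[missing])
        scale = np.linalg.norm(Z[missing]) + 1e-12
        Z[missing] = low_rank[missing]  # observed entries are never overwritten
        if change / scale < tol:
            break
    return Z * std + mean, rank

rng = np.random.default_rng(0)
truth = rng.normal(size=(100, 2)) @ rng.normal(size=(2, 6))
X = truth.copy()
X[rng.random(X.shape) < 0.1] = np.nan
filled, rank = iterative_svd_impute(X)
print(rank, np.isnan(filled).any())
```

Only the missing entries are updated on each pass, so the observed data act as fixed constraints while the low-rank structure fills the gaps.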

API Reference

Main Class

Imputer(data, variance_threshold=0.95, rank=None, max_iters=500, tol=1e-4, verbose=True)

Key Methods

fit() / transform() / fit_transform(): Standard sklearn interface
estimate_uncertainty(): Monte Carlo validation
calculate_reconstruction_residuals(): Model diagnostics
project_data() / reconstruct_data(): SVD subspace operations
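
As a rough picture of what SVD subspace operations involve, the sketch below projects rows onto the leading right singular vectors and maps them back. The function names and signatures are hypothetical and do not reflect the actual project_data() / reconstruct_data() interface.

```python
import numpy as np

def fit_subspace(X, rank):
    """Return column means and the top-`rank` right singular vectors."""
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:rank]

def project(X, mean, components):
    """Coordinates of each row in the subspace (n_rows x rank)."""
    return (X - mean) @ components.T

def reconstruct(scores, mean, components):
    """Map subspace coordinates back to the original columns."""
    return scores @ components + mean

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2)) @ rng.normal(size=(2, 5))  # exactly rank 2
mean, components = fit_subspace(X, rank=2)
scores = project(X, mean, components)
X_back = reconstruct(scores, mean, components)
print(scores.shape, np.allclose(X_back, X))
```

When the data are exactly low rank, projecting and reconstructing with that rank is lossless; with noisy data the difference between X and X_back is the reconstruction residual.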

Requirements

Python >= 3.8
numpy >= 1.20.0
pandas >= 1.3.0
scikit-learn >= 1.0.0

Performance Notes

Memory: O(n × m) for an n × m dataset, plus O(min(n, m)²) for the SVD
Time Complexity: O(k × min(n,m)³) where k is the number of SVD iterations
Recommended Scale: Efficient for datasets up to roughly 10,000 × 100 in size
Optimization: SVD components are cached for efficient reuse across operations
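
To put the memory note in concrete terms, the arithmetic below evaluates the two terms at the recommended scale. It assumes dense float64 storage (8 bytes per value), which the notes above do not state, and the helper name is illustrative.

```python
def dense_memory_mb(n_rows, n_cols, bytes_per_value=8):
    """Return (data term, min(n, m)^2 term) in megabytes for dense storage."""
    data_bytes = n_rows * n_cols * bytes_per_value
    k = min(n_rows, n_cols)
    svd_bytes = k * k * bytes_per_value
    return data_bytes / 1e6, svd_bytes / 1e6

data_mb, svd_mb = dense_memory_mb(10_000, 100)
print(f"data: {data_mb:.2f} MB, min(n, m)^2 term: {svd_mb:.2f} MB")
# prints "data: 8.00 MB, min(n, m)^2 term: 0.08 MB"
```

These are asymptotic terms with constants of one, so actual usage will be some multiple of the figures printed.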

Package Status

Current Status: Published on PyPI 🎉

This package is currently in Beta - the core functionality is stable and tested (86 tests passing), but the API may evolve. Suitable for research and development use.

Disclaimer

IMPORTANT: This software is provided "as is" without warranty of any kind. The authors and contributors make no representations or warranties regarding the accuracy, completeness, or validity of the code or its results. Users are solely responsible for validating the appropriateness and correctness of this software for their specific use cases. The authors assume no responsibility or liability for any errors, omissions, or damages arising from the use of this software.

License

MIT License

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Citation

If you use this package in your research, please cite:

@software{svd_time_series_imputer,
  title={SVD Time Series Imputer: A Python Package for Missing Data Imputation},
  author={Rui Hugman},
  year={2025},
  url={https://github.com/rhugman/svd_imputer},
  note={Available on PyPI: https://pypi.org/project/svd-imputer/}
}

Release history

0.2.7 - Jan 5, 2026
0.1.2 - Nov 12, 2025
0.1.1 - Nov 12, 2025 (this version)
0.1.0 - Nov 12, 2025

Download files

Source Distribution: svd_imputer-0.1.1.tar.gz (60.4 kB), uploaded Nov 12, 2025
Built Distribution: svd_imputer-0.1.1-py3-none-any.whl (22.4 kB), uploaded Nov 12, 2025, Python 3

File details

Details for the file svd_imputer-0.1.1.tar.gz.

File metadata

Download URL: svd_imputer-0.1.1.tar.gz
Upload date: Nov 12, 2025
Size: 60.4 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for svd_imputer-0.1.1.tar.gz
Algorithm	Hash digest
SHA256	`da1c6a49378cc5e820b11a1a2b7ebfbc4f2fb3b4b5e8170d7ee8255de38bc601`
MD5	`7d9217364567f8248eed4958e2c8d4bf`
BLAKE2b-256	`0d903bef521cf695e2a452a035f1cb2a7edee8def038c99803b64cdeff966ff7`

File details

Details for the file svd_imputer-0.1.1-py3-none-any.whl.

File metadata

Download URL: svd_imputer-0.1.1-py3-none-any.whl
Upload date: Nov 12, 2025
Size: 22.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for svd_imputer-0.1.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`752f0c46f78b20d3f302e7ec1b415c21e4a76a2a62eae950829910ab52ad5dad`
MD5	`b20eeb3dbeb01481656d7cb3af4ad58d`
BLAKE2b-256	`08a243a1eeb53493128befd01ebf05b8434633d92b9be2a3936154946dab66a1`
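
A downloaded file can be checked against the SHA256 digests listed above using only the Python standard library. The helper names below are illustrative; the sketch reads the file in chunks so it also works for large archives.

```python
import hashlib

def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches_digest(path, expected_hex):
    """Compare a file's digest with a published value, ignoring case and whitespace."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

To use it, pass the path of the downloaded archive and the SHA256 value from the matching table; a False result means the file differs from the one that was published.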
