TSDSS 📊 🔮 📈

TSDSS is a comprehensive Python package for time series analysis and surrogate data generation. It provides tools for statistical analysis, preprocessing, feature extraction, filtering, and surrogate generation for both univariate and multivariate time series.

Features

Time Series Analysis

  • Basic statistics (mean, std, skewness, kurtosis)
  • Stationarity tests (ADF test, Ljung-Box test; see the sketch after this list)
  • Correlation analysis (Pearson, Spearman, Kendall)
  • Spectral analysis
  • Nonlinear analysis (Lyapunov exponent, phase space reconstruction)
  • Entropy measures
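
The stationarity tests listed above can also be run directly with statsmodels (a listed dependency). The sketch below uses statsmodels' own adfuller and acorr_ljungbox rather than any tsdss wrapper, since the wrapper names are not documented on this page:

import numpy as np
from statsmodels.tsa.stattools import adfuller
from statsmodels.stats.diagnostic import acorr_ljungbox

ts = np.random.normal(0, 1, 1000)

# ADF test: the null hypothesis is that the series has a unit root (non-stationary)
adf_stat, adf_pvalue, *_ = adfuller(ts)
print(f"ADF statistic: {adf_stat:.3f}, p-value: {adf_pvalue:.3f}")

# Ljung-Box test: the null hypothesis is that the series has no autocorrelation
print(acorr_ljungbox(ts, lags=[10]))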

Time Series Preprocessing

  • Missing value interpolation
  • Outlier detection
  • Normalization
  • Resampling
  • Feature extraction

Surrogate Data Generation

  • IAAFT (Iterative Amplitude Adjusted Fourier Transform)
  • IAAFT+ (Enhanced IAAFT)
  • IPFT (Iterative Phase-adjusted Fourier Transform)
  • AIAAFT (Adaptive IAAFT)
  • IAAWT (Iterative Amplitude Adjusted Wavelet Transform)
  • Multivariate surrogate methods
  • Bootstrap methods

Time Series Filtering

Each filter has its own characteristics and use cases:

  • Moving Average Filter: Simple and effective for reducing random noise
  • Exponential Filter: Gives more weight to recent data points
  • Savitzky-Golay Filter: Preserves higher moments of the data while smoothing
  • Kalman Filter: Optimal for tracking time-varying signals
  • Butterworth Filter: Frequency domain filtering with flat response
  • Median Filter: Excellent for removing impulse noise and outliers (see the short demo after this list)
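
As a quick illustration of the last point, the sketch below removes impulse noise with SciPy's medfilt (SciPy is a listed dependency). It demonstrates the idea only and does not use tsdss's own median_filter, whose exact signature is not shown on this page:

import numpy as np
from scipy.signal import medfilt

# Sine wave contaminated with a few large spikes (impulse noise)
t = np.linspace(0, 10, 500)
signal = np.sin(2 * np.pi * 0.5 * t)
spiky = signal.copy()
spiky[::50] += 5.0  # inject an impulse every 50 samples

# A median filter with a small odd window removes the isolated spikes
# while leaving the smooth sine wave almost untouched
cleaned = medfilt(spiky, kernel_size=5)
print("max error after filtering:", np.max(np.abs(cleaned - signal)))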

For multivariate time series, the multivariate_filter function provides a unified interface to apply any of these filters to each dimension of the data. Key features:

  • Supports all single-variable filtering methods
  • Maintains correlations between dimensions
  • Handles errors gracefully for each dimension
  • Preserves the original data structure

Installation

pip install tsdss

Input Data Format

TSDSS accepts the following input formats:

  • NumPy arrays (1D for univariate, 2D for multivariate)
  • Pandas Series (for univariate)
  • Pandas DataFrame (for multivariate)

Example shapes (see the construction example below):

  • Univariate: (n_samples,) or (n_samples, 1)
  • Multivariate: (n_samples, n_features)
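
For example, all of the following are valid inputs and can be constructed with NumPy and pandas alone:

import numpy as np
import pandas as pd

# Univariate: 1D NumPy array of shape (n_samples,)
ts_array = np.random.normal(0, 1, 1000)

# Univariate: pandas Series (optionally with a datetime index)
dates = pd.date_range('2023-01-01', periods=1000, freq='D')
ts_series = pd.Series(np.random.normal(0, 1, 1000), index=dates)

# Multivariate: 2D NumPy array or DataFrame of shape (n_samples, n_features)
mv_array = np.random.normal(0, 1, (1000, 3))
mv_frame = pd.DataFrame(mv_array, columns=['x', 'y', 'z'])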

Quick Start Examples

Basic Statistics and Analysis

import numpy as np
import pandas as pd
from tsdss import ts_statistics, plot_decomposition, calculate_entropy

# Basic time series statistics
ts = np.random.normal(0, 1, 1000)
stats = ts_statistics(ts)
print(stats)

# Plot time series decomposition
plot_decomposition(ts)

# Calculate entropy
entropy = calculate_entropy(ts)
print(f"Entropy: {entropy}")

Time Series Preprocessing

from tsdss import interpolate_missing, detect_outliers, normalize_ts, resample_ts

# Handle missing values
ts = pd.Series([1, np.nan, 3, np.nan, 5])
ts_clean = interpolate_missing(ts, method='linear')  # Options: linear, ffill, bfill, cubic, spline

# Detect outliers
ts = np.random.normal(0, 1, 1000)
outliers = detect_outliers(ts, method='zscore', threshold=3)  # Options: zscore, iqr, mad

# Normalize data
ts_norm = normalize_ts(ts, method='zscore')  # Options: zscore, minmax, robust

# Resample time series (requires datetime index)
dates = pd.date_range('2023-01-01', periods=100, freq='D')
ts = pd.Series(np.random.randn(100), index=dates)
ts_resampled = resample_ts(ts, freq='W', method='mean')

Feature Extraction

from tsdss import extract_time_features, extract_freq_features

# Extract time domain features
ts = np.random.normal(0, 1, 1000)
time_features = extract_time_features(ts)
print("Time domain features:", time_features)

# Extract frequency domain features
freq_features = extract_freq_features(ts)
print("Frequency domain features:", freq_features)

Correlation Analysis

from tsdss import mutual_information, kendall_correlation

# Calculate mutual information
x = np.random.normal(0, 1, 1000)
y = 0.5 * x + np.random.normal(0, 1, 1000)
mi = mutual_information(x, y)
print(f"Mutual Information: {mi}")

# Calculate Kendall correlation
kendall = kendall_correlation(x, y)
print(f"Kendall Correlation: {kendall}")

Surrogate Data Generation

from tsdss import (
    iaaft, iaaft_plus, ipft, aiaaft, 
    multivariate_iaaft, block_bootstrap, 
    stationary_bootstrap
)

# Generate univariate surrogate data
ts = np.random.normal(0, 1, 1000)

# IAAFT method
surrogate_iaaft = iaaft(ts, n_iterations=1000, num_surrogates=1)[0]

# IAAFT+ method
surrogate_iaaft_plus = iaaft_plus(ts, n_iterations=1000, num_surrogates=1)[0]

# IPFT method
surrogate_ipft = ipft(ts, n_iterations=1000, num_surrogates=1)[0]

# Generate multivariate surrogate data
data = np.random.normal(0, 1, (1000, 3))  # 3-dimensional time series
mv_surrogate = multivariate_iaaft(data, max_iter=100, num_surrogates=1)[0]

# Bootstrap methods
block_samples = block_bootstrap(ts, block_length=50, num_bootstrap=100)
stat_samples = stationary_bootstrap(ts, mean_block_length=50, num_bootstrap=100)

Wavelet Analysis

from tsdss import dwt, idwt, iaawt

# Perform discrete wavelet transform
ts = np.random.normal(0, 1, 1024)  # Length should be a power of 2
coeffs = dwt(ts, level=3)

# Perform inverse wavelet transform
reconstructed = idwt(coeffs)

# Generate wavelet-based surrogate
surrogate = iaawt(ts, n_iterations=1000, num_surrogates=1)[0]

Advanced Multivariate Analysis

from tsdss import (
    mvts_surrogate_s_transform, 
    mvts_surrogate_wavelet,
    mvts_surrogate_pca,
    copula_surrogate
)

# Generate multivariate data
data = np.random.normal(0, 1, (1000, 5))

# Different multivariate surrogate methods
surrogate_st = mvts_surrogate_s_transform(data, num_surrogates=1)[0]
surrogate_wavelet = mvts_surrogate_wavelet(data, num_surrogates=1)[0]
surrogate_pca = mvts_surrogate_pca(data, num_surrogates=1)[0]
surrogate_copula = copula_surrogate(data, num_surrogates=1)[0]

Bootstrap Methods

from tsdss import block_bootstrap, stationary_bootstrap

# 1. Block Bootstrap
# Fixed block length, suitable for data with strong local dependencies
ts = np.random.normal(0, 1, 1000)
block_samples = block_bootstrap(
    data=ts, 
    block_length=50,  # Fixed block length
    num_bootstrap=100
)

# 2. Stationary Bootstrap
# Random block length (geometric distribution), preserves stationarity
stat_samples = stationary_bootstrap(
    data=ts, 
    mean_block_length=50,  # Average block length
    num_bootstrap=100
)

# Compare the two methods
print("Block Bootstrap first sample:", block_samples[0][:10])
print("Stationary Bootstrap first sample:", stat_samples[0][:10])

# Using with pandas Series
ts_series = pd.Series(ts)
block_samples_pd = block_bootstrap(ts_series, block_length=50, num_bootstrap=100)
stat_samples_pd = stationary_bootstrap(ts_series, mean_block_length=50, num_bootstrap=100)

# Key differences:
# 1. Block Bootstrap: Uses fixed block length
# 2. Stationary Bootstrap: Uses random block length (geometric distribution)
#    - Better preserves stationarity
#    - More suitable for time series with varying dependence structures
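
To make the difference concrete, here is a minimal from-scratch sketch of the stationary bootstrap idea (block starts chosen uniformly, block lengths drawn from a geometric distribution). It illustrates the concept only and is not tsdss's implementation:

import numpy as np

def stationary_bootstrap_sketch(x, mean_block_length=50, rng=None):
    """Resample x by concatenating blocks with geometric lengths."""
    rng = np.random.default_rng() if rng is None else rng
    n = len(x)
    p = 1.0 / mean_block_length           # geometric distribution parameter
    sample = np.empty(n)
    i = 0
    while i < n:
        start = rng.integers(n)            # random block start
        length = rng.geometric(p)          # random block length (mean = mean_block_length)
        block = np.take(x, np.arange(start, start + length), mode='wrap')
        take = min(length, n - i)
        sample[i:i + take] = block[:take]
        i += take
    return sample

ts = np.random.normal(0, 1, 1000)
resampled = stationary_bootstrap_sketch(ts, mean_block_length=50)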

Time Series Filtering

from tsdss import (
    moving_average_filter,
    exponential_filter,
    savitzky_golay_filter,
    kalman_filter,
    butterworth_filter,
    median_filter,
    multivariate_filter
)
import numpy as np
import matplotlib.pyplot as plt

# 1. Univariate Filtering Example
t = np.linspace(0, 10, 1000)
noisy_signal = np.sin(2*np.pi*0.5*t) + 0.5*np.random.normal(0, 1, 1000)

# Apply different filters
ma_filtered = moving_average_filter(noisy_signal, window_size=5)
ema_filtered = exponential_filter(noisy_signal, alpha=0.3)
sg_filtered = savitzky_golay_filter(noisy_signal, window_size=15, poly_order=3)
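# Kalman parameters: Q and R are presumably the process and measurement
# noise covariances; a smaller Q relative to R gives a smoother estimate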
kalman_filtered = kalman_filter(noisy_signal, Q=1e-5, R=1e-2)

# 2. Multivariate Filtering Example
# Generate sample multivariate data
mv_data = np.column_stack([
    np.sin(2*np.pi*0.5*t) + 0.5*np.random.normal(0, 1, 1000),
    np.cos(2*np.pi*0.3*t) + 0.3*np.random.normal(0, 1, 1000),
    0.5*t + np.random.normal(0, 0.2, 1000)
])

# Apply multivariate filter
mv_filtered = multivariate_filter(
    mv_data,
    filter_type='kalman',
    Q=1e-5,
    R=1e-2
)

# You can also try different filter types
mv_ma = multivariate_filter(mv_data, filter_type='ma', window_size=5)
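# Butterworth low-pass: cutoff is the cutoff frequency and fs the sampling
# rate (presumably in Hz; fs=100 matches the ~100 Hz sampling of t above)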
mv_butter = multivariate_filter(
    mv_data, 
    filter_type='butter',
    cutoff=0.1,
    fs=100
)

Performance

The package uses optimized C++ implementations for core computations:

  • Trend decomposition
  • Skewness and kurtosis calculation
  • ACF computation
  • Ljung-Box test

Requirements

  • Python >= 3.7
  • NumPy >= 1.19.0
  • Pandas >= 1.0.0
  • SciPy >= 1.6.0
  • Statsmodels >= 0.13.0
  • Scikit-learn >= 0.24.0
  • Matplotlib >= 3.0.0

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Version History

0.2.0

  • Added comprehensive time series filtering functionality
  • Added multivariate filtering support
  • Improved documentation and examples
  • Bug fixes and performance improvements

0.1.0

  • Initial release
