ts-shape filters, transforms and engineers your timeseries dataframe

ts-shape | Timeseries Shaper

License: MIT | Python 3.10+

ts-shape is a composable, production-ready Python toolkit for loading, shaping, and analysing industrial timeseries data. Built for manufacturing and IoT, it follows a simple DataFrame-in, DataFrame-out philosophy across loaders, transforms, feature extractors, and event detectors.


Key Features

  • Unified DataFrame workflow -- Load timeseries + metadata, join on uuid, process.
  • Modular packs -- Quality, Production, Engineering, Maintenance, Supply Chain events.
  • Performance-aware -- Vectorised ops, chunked DB reads, concurrent I/O.
  • Zero ML dependencies -- Core uses only pandas, numpy, scipy.
  • Multi-source loaders -- Parquet, S3, Azure Blob, TimescaleDB, REST APIs.

Installation

pip install ts-shape

# Recommended: parquet engine
pip install pyarrow          # or: pip install fastparquet

Optional integrations:

| Integration | Install |
|---|---|
| Azure Blob Storage | pip install azure-storage-blob |
| Azure AAD + management | pip install azure-identity azure-mgmt-storage |
| S3 proxy access | Included via s3fs |
| TimescaleDB / PostgreSQL | pip install ts-shape[postgres] or any SQLAlchemy-compatible driver |

Quick Start

import pandas as pd
from ts_shape.events.quality.outlier_detection import OutlierDetectionEvents
from ts_shape.events.quality.statistical_process_control import StatisticalProcessControlRuleBased

# Load your timeseries data
df = pd.read_parquet("my_data.parquet")

# Detect outliers
detector = OutlierDetectionEvents(df, value_column="value_double")
outliers = detector.detect_outliers_zscore(threshold=3.0)

# Run SPC analysis
spc = StatisticalProcessControlRuleBased(
    df, actual_uuid="sensor:temp", tolerance_uuid="limit:temp"
)
violations = spc.process()
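
To make the z-score threshold above concrete, here is a minimal, dependency-free sketch of the idea. It is an illustration only, not ts-shape's implementation: the function name `zscore_outliers` and the use of the population standard deviation are assumptions made for this example.

```python
from statistics import mean, pstdev


def zscore_outliers(values, threshold=3.0):
    """Return (index, value) pairs whose absolute z-score exceeds threshold.

    Hypothetical helper for illustration; it does not mirror the library API.
    """
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (an assumption)
    if sigma == 0:
        return []  # constant series: nothing deviates from the mean
    return [(i, v) for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


readings = [20.0] * 19 + [80.0]
print(zscore_outliers(readings))  # [(19, 80.0)]
```

With very short series a single extreme value cannot reach a z-score of 3, so the threshold only becomes meaningful once enough samples are available.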

Data Model

ts-shape works with a standardised timeseries DataFrame schema:

| Column | Type | Description |
|---|---|---|
| systime | datetime64[ns] | Timestamp (sorted, tz-aware supported) |
| uuid | str | Signal identifier |
| value_double | float64 | Numeric values |
| value_integer | int64 | Integer values |
| value_bool | bool | Boolean values |
| value_string | str | String values |
| is_delta | bool | Change indicator |

All classes inherit from a common Base class that automatically detects time columns, converts to datetime, and sorts by timestamp.
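
The normalisation described above (convert the time column to datetime, then sort by it) can be sketched without pandas as follows. This is an assumption-laden illustration: `normalise` is a hypothetical helper, rows are plain dicts rather than a DataFrame, and the time column name is passed in explicitly instead of being auto-detected.

```python
from datetime import datetime


def normalise(records, time_key="systime"):
    """Parse ISO-8601 strings under time_key into datetime and sort by time.

    Hypothetical stand-in for the behaviour described for the Base class.
    """
    parsed = []
    for rec in records:
        row = dict(rec)  # copy so the caller's data is left untouched
        ts = row[time_key]
        if isinstance(ts, str):
            row[time_key] = datetime.fromisoformat(ts)
        parsed.append(row)
    return sorted(parsed, key=lambda r: r[time_key])


rows = normalise([
    {"systime": "2024-01-01T00:00:10", "uuid": "sensor:temp", "value_double": 21.5},
    {"systime": "2024-01-01T00:00:05", "uuid": "sensor:temp", "value_double": 21.4},
])
for row in rows:
    print(row["systime"], row["value_double"])
```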


Architecture

ts_shape/
├── loader/              # Data Loading & Integration
│   ├── timeseries/      # Parquet, S3, Azure Blob, TimescaleDB, Energy API
│   ├── metadata/        # JSON, REST API, Database metadata
│   └── combine/         # DataIntegratorHybrid (merge timeseries + metadata)
│
├── transform/           # Data Transformation
│   ├── filter/          # Numeric, String, Boolean, DateTime, Custom filters
│   ├── calculator/      # Arithmetic operations (scale, offset, power, etc.)
│   ├── functions/       # Lambda/callable application
│   └── time_functions/  # Timestamp conversion, timezone operations
│
├── features/            # Feature Extraction
│   ├── stats/           # Numeric, String, Boolean, Timestamp statistics
│   ├── time_stats/      # Time-windowed aggregations
│   └── cycles/          # Cycle detection & processing (6 methods)
│
├── events/              # Event Detection (Domain Packs)
│   ├── quality/         # Outlier detection, SPC (8 rules), tolerance deviation
│   ├── production/      # OEE, machine state, throughput, shift, downtime, alarms, batches
│   ├── engineering/     # Setpoint changes, startup detection, control quality
│   ├── maintenance/     # Degradation, failure prediction, vibration analysis
│   └── supplychain/     # Inventory monitoring, lead time, demand patterns
│
├── context/             # Value mapping (categorical codes → labels)
└── utils/               # Base class and shared utilities

Packs Overview

Quality Events

Detect anomalies and process deviations in sensor data.

from ts_shape.events.quality.outlier_detection import OutlierDetectionEvents
from ts_shape.events.quality.statistical_process_control import StatisticalProcessControlRuleBased
from ts_shape.events.quality.tolerance_deviation import ToleranceDeviationEvents

# Outlier detection (Z-score, IQR, MAD, Isolation Forest)
outliers = OutlierDetectionEvents(df, value_column="value_double")
result = outliers.detect_outliers_zscore(threshold=3.0)

# Statistical Process Control -- 8 Western Electric rules
spc = StatisticalProcessControlRuleBased(df, actual_uuid="sensor", tolerance_uuid="limit")
violations = spc.process()

# Tolerance deviation with severity classification
tol = ToleranceDeviationEvents(df, actual_uuid="sensor", tolerance_uuid="limit")
deviations = tol.process_and_group_data_with_events()
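
As a rough illustration of rule-based SPC, the sketch below implements two commonly cited control-chart rules: a single point beyond three sigma, and a run of consecutive points on one side of the centre line. Whether these correspond exactly to the rules ts-shape applies is not confirmed here; the function names, the run length of 8, and the fixed `center` and `sigma` inputs are all assumptions for the example.

```python
def beyond_three_sigma(values, center, sigma):
    """Indices of points more than 3 sigma away from the centre line."""
    return [i for i, v in enumerate(values) if abs(v - center) > 3 * sigma]


def same_side_runs(values, center, run_length=8):
    """Indices at which run_length consecutive points lie on one side of center."""
    hits, run, side = [], 0, 0
    for i, v in enumerate(values):
        cur = (v > center) - (v < center)  # +1 above, -1 below, 0 on the line
        if cur != 0 and cur == side:
            run += 1
        else:
            run = 1 if cur != 0 else 0  # a new run starts, or resets on the line
            side = cur
        if run >= run_length:
            hits.append(i)
    return hits


print(beyond_three_sigma([10.0, 10.2, 13.5], center=10.0, sigma=1.0))
print(same_side_runs([10.1] * 8, center=10.0))
```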

Production Events

Track production performance, equipment states, and operational metrics.

from ts_shape.events.production.machine_state import MachineStateEvents
from ts_shape.events.production.oee_calculator import OEECalculator
from ts_shape.events.production.shift_reporting import ShiftReporting
from ts_shape.events.production.alarm_management import AlarmManagementEvents
from ts_shape.events.production.batch_tracking import BatchTrackingEvents

# Machine state detection (run/idle intervals)
mse = MachineStateEvents(df, run_state_uuid="machine:running")
intervals = mse.detect_run_idle(min_duration="30s")

# OEE calculation (Availability x Performance x Quality)
oee = OEECalculator(df)
result = oee.calculate_oee(
    run_state_uuid="machine:state",
    counter_uuid="parts:count",
    ideal_cycle_time=10.0,
)

# Alarm analysis (ISA-18.2 style)
alarms = AlarmManagementEvents(df, alarm_uuid="alarm:overtemp")
chattering = alarms.chattering_detection(min_transitions=5, window="10m")

# Batch tracking
batches = BatchTrackingEvents(df, batch_uuid="batch:id")
batch_list = batches.detect_batches()
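
The OEE product mentioned above (Availability x Performance x Quality) can be worked through with plain arithmetic. The sketch below uses the textbook definitions of the three factors; the function name `oee` and its parameters are hypothetical and are not taken from `OEECalculator`.

```python
def oee(planned_s, run_s, total_parts, good_parts, ideal_cycle_time_s):
    """Compute OEE factors from aggregate shift figures (all times in seconds).

    Hypothetical helper; inputs are assumed to be pre-aggregated.
    """
    if planned_s <= 0 or run_s <= 0 or total_parts <= 0:
        raise ValueError("planned_s, run_s and total_parts must be positive")
    availability = run_s / planned_s
    performance = (ideal_cycle_time_s * total_parts) / run_s
    quality = good_parts / total_parts
    return {
        "availability": availability,
        "performance": performance,
        "quality": quality,
        "oee": availability * performance * quality,
    }


result = oee(planned_s=10_000, run_s=8_000, total_parts=720,
             good_parts=684, ideal_cycle_time_s=10.0)
print({name: round(value, 3) for name, value in result.items()})
```

Here availability is 0.8, performance 0.9 and quality 0.95, so the product comes to about 0.684.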

Engineering Events

Analyse control system behaviour and setpoint responses.

from ts_shape.events.engineering.setpoint_events import SetpointChangeEvents
from ts_shape.events.engineering.startup_events import StartupDetectionEvents

# Setpoint change detection + settling time + overshoot
sp = SetpointChangeEvents(df, setpoint_uuid="setpoint:temp")
steps = sp.detect_setpoint_steps(min_delta=2.0)
settle = sp.time_to_settle(actual_uuid="actual:temp", tol=0.5)
quality = sp.control_quality_metrics(actual_uuid="actual:temp")

# Startup detection
startup = StartupDetectionEvents(df, signal_uuid="motor:speed")
events = startup.detect_startup_by_threshold(threshold=100.0)
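
For readers unfamiliar with the terms, the following sketch shows one simple way to detect setpoint steps and measure settling time. It is not the `SetpointChangeEvents` implementation: the helper names, the tuple-based inputs, and the "stays within tolerance until the end of the data" definition of settling are assumptions for illustration.

```python
def detect_steps(setpoints, min_delta):
    """setpoints: list of (time, value). Return changes of at least min_delta."""
    steps = []
    for (_, v_prev), (t_cur, v_cur) in zip(setpoints, setpoints[1:]):
        delta = v_cur - v_prev
        if abs(delta) >= min_delta:
            steps.append({"time": t_cur, "from": v_prev, "to": v_cur, "delta": delta})
    return steps


def time_to_settle(actual, step_time, target, tol):
    """Elapsed time until actual stays within +/- tol of target, else None."""
    settled_at = None
    for t, v in actual:
        if t < step_time:
            continue
        if abs(v - target) <= tol:
            if settled_at is None:
                settled_at = t  # first sample of the current in-band stretch
        else:
            settled_at = None  # left the band again, so start over
    return None if settled_at is None else settled_at - step_time


setpoints = [(0, 50.0), (1, 50.0), (2, 55.0), (3, 55.5), (4, 52.0)]
actual = [(2, 50.0), (3, 53.0), (4, 55.8), (5, 55.3), (6, 54.9), (7, 55.1)]
print(detect_steps(setpoints, min_delta=2.0))
print(time_to_settle(actual, step_time=2, target=55.0, tol=0.5))
```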

Maintenance Events

Predictive maintenance through degradation detection and failure prediction.

from ts_shape.events.maintenance.degradation_detection import DegradationDetectionEvents
from ts_shape.events.maintenance.failure_prediction import FailurePredictionEvents
from ts_shape.events.maintenance.vibration_analysis import VibrationAnalysisEvents

# Degradation detection (trend, variance, level shift, health score)
deg = DegradationDetectionEvents(df, signal_uuid="sensor:bearing_temp")
trends = deg.detect_trend_degradation(window="1h", direction="increasing")
health = deg.health_score(window="1h", baseline_window="24h")

# Remaining Useful Life estimation
fp = FailurePredictionEvents(df, signal_uuid="sensor:bearing_temp")
rul = fp.remaining_useful_life(degradation_rate=0.01, failure_threshold=120.0)

# Vibration analysis (RMS, crest factor, kurtosis)
vib = VibrationAnalysisEvents(df, signal_uuid="sensor:vibration")
indicators = vib.bearing_health_indicators(window="5m")
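
The three vibration indicators named above have standard definitions, sketched here for a single window of samples. This is an illustration only: `vibration_indicators` is a hypothetical helper, kurtosis is computed as the non-excess (Pearson) form, and nothing here is claimed to match what `bearing_health_indicators` returns.

```python
from math import sqrt


def vibration_indicators(samples):
    """Return RMS, crest factor and kurtosis for one window of samples."""
    n = len(samples)
    mu = sum(samples) / n
    rms = sqrt(sum(x * x for x in samples) / n)
    peak = max(abs(x) for x in samples)
    m2 = sum((x - mu) ** 2 for x in samples) / n  # second central moment
    m4 = sum((x - mu) ** 4 for x in samples) / n  # fourth central moment
    return {
        "rms": rms,
        "crest_factor": peak / rms if rms else float("nan"),
        "kurtosis": m4 / m2 ** 2 if m2 else float("nan"),
    }


print(vibration_indicators([0.1, -0.2, 0.15, -0.1, 0.9, -0.12]))
```

A rising crest factor or kurtosis usually points to impulsive content in the signal, which is why these indicators are popular for bearing monitoring.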

Supply Chain Events

Monitor inventory, lead times, and demand patterns.

from ts_shape.events.supplychain.inventory_monitoring import InventoryMonitoringEvents
from ts_shape.events.supplychain.lead_time_analysis import LeadTimeAnalysisEvents
from ts_shape.events.supplychain.demand_pattern import DemandPatternEvents

# Inventory monitoring with stockout prediction
inv = InventoryMonitoringEvents(df, level_uuid="inventory:raw_material")
low_stock = inv.detect_low_stock(min_level=100, hold="30m")
prediction = inv.stockout_prediction(consumption_rate_window="4h")

# Lead time analysis
lt = LeadTimeAnalysisEvents(df)
lead_times = lt.calculate_lead_times(order_uuid="order:placed", delivery_uuid="order:delivered")
anomalies = lt.detect_lead_time_anomalies(order_uuid="order:placed", delivery_uuid="order:delivered")

# Demand patterns and seasonality
demand = DemandPatternEvents(df, demand_uuid="demand:daily")
spikes = demand.detect_demand_spikes(threshold_factor=2.0)
seasonal = demand.seasonality_summary(period="1D")
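
Stockout prediction from a consumption-rate window boils down to dividing the current level by the recent rate of decline. The sketch below shows that arithmetic under simplifying assumptions: a linear rate between the first and last sample in the window, time expressed in hours, and a hypothetical function name `hours_to_stockout`.

```python
def hours_to_stockout(levels, window):
    """levels: list of (hour, level), sorted by hour.

    Estimate the consumption rate over the last `window` hours and return the
    hours left until the level reaches zero, or None if stock is not falling.
    """
    t_end, level_end = levels[-1]
    recent = [(t, lv) for t, lv in levels if t >= t_end - window]
    t_start, level_start = recent[0]
    if t_end == t_start:
        return None  # a single sample gives no rate
    rate = (level_start - level_end) / (t_end - t_start)  # units per hour
    if rate <= 0:
        return None  # level is flat or rising
    return level_end / rate


levels = [(0, 500), (1, 480), (2, 460), (3, 440), (4, 420)]
print(hours_to_stockout(levels, window=4))  # 21.0
```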

Loaders

Load data from multiple sources into the standard schema.

from ts_shape.loader.timeseries.parquet_loader import ParquetLoader
from ts_shape.loader.timeseries.azure_blob_loader import AzureBlobParquetLoader
from ts_shape.loader.metadata.metadata_json_loader import MetadataJsonLoader
from ts_shape.loader.combine.integrator import DataIntegratorHybrid

# Load parquet files
df = ParquetLoader.load_all_files("/data/timeseries")
# start and end bound the time range and must be defined by the caller beforehand
df_range = ParquetLoader.load_by_time_range("/data/timeseries", start, end)

# Load metadata and combine
meta = MetadataJsonLoader.from_file("metadata.json")
combined = DataIntegratorHybrid.combine_data(timeseries_df=df, metadata_df=meta.to_df())
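
Conceptually, combining timeseries and metadata is a join on `uuid`. The dependency-free sketch below shows a left join over plain dicts; it is not how `DataIntegratorHybrid` is implemented, and the `combine` helper and the `unit` metadata field are hypothetical.

```python
def combine(timeseries_rows, metadata_rows, key="uuid"):
    """Left-join metadata onto timeseries rows by key."""
    meta_by_key = {m[key]: m for m in metadata_rows}
    combined = []
    for row in timeseries_rows:
        meta = meta_by_key.get(row[key], {})  # no metadata: keep the row as-is
        combined.append({**meta, **row})  # timeseries values win on name clashes
    return combined


timeseries = [
    {"uuid": "sensor:temp", "value_double": 21.5},
    {"uuid": "sensor:unknown", "value_double": 3.0},
]
metadata = [{"uuid": "sensor:temp", "unit": "degC"}]
for row in combine(timeseries, metadata):
    print(row)
```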

Features & Statistics

Extract statistical features and detect cycles.

from ts_shape.features.stats.numeric_stats import NumericStatistics
from ts_shape.features.time_stats.time_stats_numeric import TimeGroupedStatistics
from ts_shape.features.cycles.cycles_extractor import CycleExtractor

# Descriptive statistics
stats = NumericStatistics.summary_as_dict(df, "value_double")

# Time-windowed aggregations
tgs = TimeGroupedStatistics(df, value_column="value_double")
hourly = tgs.calculate_statistic(freq="1h", stat="mean")

# Cycle extraction (6 detection methods)
extractor = CycleExtractor(df, start_uuid="cycle:trigger")
cycles = extractor.process_persistent_cycle()
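
A time-windowed aggregation such as an hourly mean groups samples by the start of their window and then reduces each group. The sketch below shows the idea with the standard library only; `hourly_mean` is a hypothetical helper and is not equivalent to `TimeGroupedStatistics`.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def hourly_mean(rows):
    """rows: iterable of (datetime, value). Return {hour_start: mean value}."""
    buckets = defaultdict(list)
    for ts, value in rows:
        hour_start = ts.replace(minute=0, second=0, microsecond=0)
        buckets[hour_start].append(value)
    return {hour: mean(vals) for hour, vals in sorted(buckets.items())}


rows = [
    (datetime(2024, 1, 1, 8, 5), 10.0),
    (datetime(2024, 1, 1, 8, 45), 20.0),
    (datetime(2024, 1, 1, 9, 10), 30.0),
]
for hour, avg in hourly_mean(rows).items():
    print(hour.isoformat(), avg)
```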

Development

# Clone and install in development mode
git clone https://github.com/jakobgabriel/ts-shape.git
cd ts-shape
pip install -e ".[dev]"

# Run tests
pytest tests/ -v

# Run tests with coverage
pytest tests/ --cov=ts_shape --cov-report=term-missing

# Build documentation
pip install -r requirements-docs.txt
mkdocs serve

CI/CD

The project uses GitHub Actions for continuous integration and deployment:

| Workflow | Trigger | Description |
|---|---|---|
| CI | Push / PR | Runs tests on Python 3.10, 3.11, 3.12 |
| Release | Push to main / Tag v* | Build docs, deploy to GitHub Pages, publish to PyPI |

Versioning is managed with setuptools-scm -- version numbers are derived automatically from git tags. To release:

git tag v0.2.0
git push origin v0.2.0

Project Structure

ts-shape/
├── src/ts_shape/           # Library source code
├── tests/                  # pytest test suite (100+ tests)
├── examples/               # Runnable demo scripts
├── docs/                   # MkDocs documentation
├── .github/workflows/      # CI/CD pipelines
├── pyproject.toml          # Package configuration + auto-versioning
├── setup.py                # Legacy setup (delegates to pyproject.toml)
├── requirements.txt        # Runtime dependencies
└── requirements-docs.txt   # Documentation dependencies

Contributing

Contributions are welcome! Please see docs/contributing.md for guidelines.

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/my-feature)
  3. Write tests for your changes
  4. Ensure all tests pass (pytest tests/ -v)
  5. Submit a pull request

License

MIT -- see LICENSE.txt.

