ts-shape filters, transforms and engineers your timeseries dataframe

ts-shape | Timeseries Shaper

License: MIT. Requires Python 3.10+.

ts-shape is a composable, production-ready Python toolkit for loading, shaping, and analysing industrial timeseries data. Built for manufacturing and IoT, it follows a simple DataFrame-in, DataFrame-out philosophy across loaders, transforms, feature extractors, and event detectors.


Key Features

  • Unified DataFrame workflow -- Load timeseries + metadata, join on uuid, process.
  • Modular packs -- Quality, Production, Engineering, Maintenance, Supply Chain events.
  • Performance-aware -- Vectorised ops, chunked DB reads, concurrent I/O.
  • Zero ML dependencies -- Core uses only pandas, numpy, scipy.
  • Multi-source loaders -- Parquet, S3, Azure Blob, TimescaleDB, REST APIs.

Installation

pip install ts-shape

# Recommended: parquet engine
pip install pyarrow          # or: pip install fastparquet

Optional integrations:

Integration                 Install
Azure Blob Storage          pip install azure-storage-blob
Azure AAD + management      pip install azure-identity azure-mgmt-storage
S3 proxy access             Included via s3fs
TimescaleDB / PostgreSQL    pip install ts-shape[postgres] or any SQLAlchemy-compatible driver

Quick Start

import pandas as pd
from ts_shape.events.quality.outlier_detection import OutlierDetectionEvents
from ts_shape.events.quality.statistical_process_control import StatisticalProcessControlRuleBased

# Load your timeseries data
df = pd.read_parquet("my_data.parquet")

# Detect outliers
detector = OutlierDetectionEvents(df, value_column="value_double")
outliers = detector.detect_outliers_zscore(threshold=3.0)

# Run SPC analysis
spc = StatisticalProcessControlRuleBased(
    df, actual_uuid="sensor:temp", tolerance_uuid="limit:temp"
)
violations = spc.process()
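
For intuition about the threshold=3.0 argument above: z-score detection flags values that lie more than three standard deviations from the mean. The following is a minimal pandas sketch of that idea under stated assumptions, not the library's implementation. The helper name zscore_outliers and the sample data are hypothetical.

```python
import pandas as pd

def zscore_outliers(df: pd.DataFrame, value_column: str = "value_double",
                    threshold: float = 3.0) -> pd.DataFrame:
    """Return rows whose absolute z-score exceeds `threshold` (hypothetical helper)."""
    values = df[value_column]
    std = values.std(ddof=0)  # population standard deviation
    if pd.isna(std) or std == 0:
        return df.iloc[0:0]  # empty or constant column: nothing to flag
    z = (values - values.mean()) / std
    return df[z.abs() > threshold]

# Synthetic data: twenty readings at 10.0 followed by one spike
sample = pd.DataFrame({
    "systime": pd.date_range("2024-01-01", periods=21, freq="1min"),
    "uuid": "sensor:temp",
    "value_double": [10.0] * 20 + [50.0],
})
flagged = zscore_outliers(sample)  # contains only the spike row
```

Because the mean and standard deviation are themselves pulled towards extreme values, very short series can hide an outlier; this is one reason robust alternatives such as IQR or MAD exist.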

Data Model

ts-shape works with a standardised timeseries DataFrame schema:

Column           Type              Description
systime          datetime64[ns]    Timestamp (sorted, tz-aware supported)
uuid             str               Signal identifier
value_double     float64           Numeric values
value_integer    int64             Integer values
value_bool       bool              Boolean values
value_string     str               String values
is_delta         bool              Change indicator

All classes inherit from a common Base class that automatically detects time columns, converts to datetime, and sorts by timestamp.
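
To make the schema concrete, here is a small sketch of a frame in this long format, followed by the kind of normalisation described above (convert the time column to datetime, then sort). It uses plain pandas rather than the Base class, whose exact behaviour is not shown here; all values are made up.

```python
import pandas as pd

# Two signals in long format: one row per (timestamp, signal) observation
raw = pd.DataFrame({
    "systime": ["2024-01-01 00:02:00", "2024-01-01 00:00:00", "2024-01-01 00:01:00"],
    "uuid": ["sensor:temp", "sensor:temp", "machine:running"],
    "value_double": [21.7, 21.5, None],
    "value_bool": [None, None, True],
    "is_delta": [True, True, True],
})

# Convert the time column to datetime and sort by it
df = raw.assign(systime=pd.to_datetime(raw["systime"]))
df = df.sort_values("systime").reset_index(drop=True)
```

Each signal fills only the value column that matches its type, so the other value columns stay empty for that row.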


Architecture

ts_shape/
├── loader/              # Data Loading & Integration
│   ├── timeseries/      # Parquet, S3, Azure Blob, TimescaleDB, Energy API
│   ├── metadata/        # JSON, REST API, Database metadata
│   └── combine/         # DataIntegratorHybrid (merge timeseries + metadata)
│
├── transform/           # Data Transformation
│   ├── filter/          # Numeric, String, Boolean, DateTime, Custom filters
│   ├── calculator/      # Arithmetic operations (scale, offset, power, etc.)
│   ├── functions/       # Lambda/callable application
│   └── time_functions/  # Timestamp conversion, timezone operations
│
├── features/            # Feature Extraction
│   ├── stats/           # Numeric, String, Boolean, Timestamp statistics
│   ├── time_stats/      # Time-windowed aggregations
│   └── cycles/          # Cycle detection & processing (6 methods)
│
├── events/              # Event Detection (Domain Packs)
│   ├── quality/         # Outlier detection, SPC (8 rules), tolerance deviation
│   ├── production/      # OEE, machine state, throughput, shift, downtime, alarms, batches
│   ├── engineering/     # Setpoint changes, startup detection, control quality
│   ├── maintenance/     # Degradation, failure prediction, vibration analysis
│   └── supplychain/     # Inventory monitoring, lead time, demand patterns
│
├── context/             # Value mapping (categorical codes → labels)
└── utils/               # Base class and shared utilities
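
As an illustration of what "categorical codes → labels" means for the context layer, the sketch below maps integer state codes to readable labels with plain pandas. The code table, the state_label column, and the machine:state signal are hypothetical, and this does not use the ts_shape.context API.

```python
import pandas as pd

# Hypothetical code table; in practice this would come from metadata
STATE_LABELS = {0: "idle", 1: "running", 2: "fault"}

df = pd.DataFrame({
    "systime": pd.date_range("2024-01-01", periods=4, freq="1min"),
    "uuid": "machine:state",
    "value_integer": [0, 1, 1, 2],
})

# Codes missing from the table become NaN rather than raising
df["state_label"] = df["value_integer"].map(STATE_LABELS)
```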

Packs Overview

Quality Events

Detect anomalies and process deviations in sensor data.

from ts_shape.events.quality.outlier_detection import OutlierDetectionEvents
from ts_shape.events.quality.statistical_process_control import StatisticalProcessControlRuleBased
from ts_shape.events.quality.tolerance_deviation import ToleranceDeviationEvents

# Outlier detection (Z-score, IQR, MAD, Isolation Forest)
outliers = OutlierDetectionEvents(df, value_column="value_double")
result = outliers.detect_outliers_zscore(threshold=3.0)

# Statistical Process Control -- 8 Western Electric rules
spc = StatisticalProcessControlRuleBased(df, actual_uuid="sensor", tolerance_uuid="limit")
violations = spc.process()

# Tolerance deviation with severity classification
tol = ToleranceDeviationEvents(df, actual_uuid="sensor", tolerance_uuid="limit")
deviations = tol.process_and_group_data_with_events()
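
To show what a rule-based SPC check looks like, the sketch below implements one rule commonly listed among the Western Electric rules: eight consecutive points on the same side of the centre line. It is a standalone pandas illustration with a hypothetical helper name, not the logic inside StatisticalProcessControlRuleBased.

```python
import pandas as pd

def run_rule_violations(values: pd.Series, center: float,
                        run_length: int = 8) -> pd.Series:
    """Flag points that extend a run of at least `run_length` values on one side of `center`."""
    # +1 above the centre line, -1 below, 0 exactly on it
    side = (values > center).astype(int) - (values < center).astype(int)
    # A new run starts whenever the side changes
    run_id = (side != side.shift()).cumsum()
    # Position of each point within its run, starting at 1
    run_pos = side.groupby(run_id).cumcount() + 1
    return (side != 0) & (run_pos >= run_length)

values = pd.Series([10.2, 9.8] + [10.1] * 8 + [9.9])
flags = run_rule_violations(values, center=10.0)
# Only the eighth consecutive point above 10.0 (index 9) is flagged
```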

Production Events

Track production performance, equipment states, and operational metrics.

from ts_shape.events.production.machine_state import MachineStateEvents
from ts_shape.events.production.oee_calculator import OEECalculator
from ts_shape.events.production.shift_reporting import ShiftReporting
from ts_shape.events.production.alarm_management import AlarmManagementEvents
from ts_shape.events.production.batch_tracking import BatchTrackingEvents

# Machine state detection (run/idle intervals)
mse = MachineStateEvents(df, run_state_uuid="machine:running")
intervals = mse.detect_run_idle(min_duration="30s")

# OEE calculation (Availability x Performance x Quality)
oee = OEECalculator(df)
result = oee.calculate_oee(
    run_state_uuid="machine:state",
    counter_uuid="parts:count",
    ideal_cycle_time=10.0,
)

# Alarm analysis (ISA-18.2 style)
alarms = AlarmManagementEvents(df, alarm_uuid="alarm:overtemp")
chattering = alarms.chattering_detection(min_transitions=5, window="10m")

# Batch tracking
batches = BatchTrackingEvents(df, batch_uuid="batch:id")
batch_list = batches.detect_batches()
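
The OEE product mentioned above (Availability x Performance x Quality) can be worked through by hand. The sketch below uses the standard textbook definitions of the three factors; the function name, argument names, and shift figures are illustrative assumptions, and OEECalculator may derive its inputs differently.

```python
def oee(planned_time_s: float, run_time_s: float, total_count: int,
        good_count: int, ideal_cycle_time_s: float) -> dict:
    """Compute OEE and its three factors from shift totals (hypothetical helper)."""
    availability = run_time_s / planned_time_s
    performance = (ideal_cycle_time_s * total_count) / run_time_s
    quality = good_count / total_count
    return {
        "availability": availability,
        "performance": performance,
        "quality": quality,
        "oee": availability * performance * quality,
    }

# 8 h planned, 7 h running, 2268 parts at a 10 s ideal cycle, 2200 of them good
result = oee(planned_time_s=8 * 3600, run_time_s=7 * 3600,
             total_count=2268, good_count=2200, ideal_cycle_time_s=10.0)
# availability = 0.875, performance = 0.9, quality ≈ 0.970, oee ≈ 0.764
```

The intermediate terms cancel, so OEE reduces to ideal cycle time times good count divided by planned time; the three factors are kept separate because each one points at a different kind of loss.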

Engineering Events

Analyse control system behaviour and setpoint responses.

from ts_shape.events.engineering.setpoint_events import SetpointChangeEvents
from ts_shape.events.engineering.startup_events import StartupDetectionEvents

# Setpoint change detection + settling time + overshoot
sp = SetpointChangeEvents(df, setpoint_uuid="setpoint:temp")
steps = sp.detect_setpoint_steps(min_delta=2.0)
settle = sp.time_to_settle(actual_uuid="actual:temp", tol=0.5)
quality = sp.control_quality_metrics(actual_uuid="actual:temp")

# Startup detection
startup = StartupDetectionEvents(df, signal_uuid="motor:speed")
events = startup.detect_startup_by_threshold(threshold=100.0)
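
As a rough picture of what min_delta means in setpoint step detection, the sketch below flags samples where the setpoint changes by at least min_delta relative to the previous sample. It is a plain pandas approximation with a hypothetical helper name; the library's own method may also apply filtering or hold times that are not shown here.

```python
import pandas as pd

def detect_steps(setpoint: pd.Series, min_delta: float) -> pd.DataFrame:
    """Return timestamps where the setpoint jumps by at least `min_delta` (hypothetical helper)."""
    delta = setpoint.diff()  # first sample has no predecessor, so its delta is NaN
    steps = delta[delta.abs() >= min_delta]
    return pd.DataFrame({"systime": steps.index, "delta": steps.to_numpy()})

setpoint = pd.Series(
    [50.0, 50.0, 55.0, 55.0, 54.5, 60.0],
    index=pd.date_range("2024-01-01", periods=6, freq="1min"),
)
steps = detect_steps(setpoint, min_delta=2.0)
# Two steps: +5.0 at 00:02 and +5.5 at 00:05; the -0.5 adjustment is ignored
```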

Maintenance Events

Predictive maintenance through degradation detection and failure prediction.

from ts_shape.events.maintenance.degradation_detection import DegradationDetectionEvents
from ts_shape.events.maintenance.failure_prediction import FailurePredictionEvents
from ts_shape.events.maintenance.vibration_analysis import VibrationAnalysisEvents

# Degradation detection (trend, variance, level shift, health score)
deg = DegradationDetectionEvents(df, signal_uuid="sensor:bearing_temp")
trends = deg.detect_trend_degradation(window="1h", direction="increasing")
health = deg.health_score(window="1h", baseline_window="24h")

# Remaining Useful Life estimation
fp = FailurePredictionEvents(df, signal_uuid="sensor:bearing_temp")
rul = fp.remaining_useful_life(degradation_rate=0.01, failure_threshold=120.0)

# Vibration analysis (RMS, crest factor, kurtosis)
vib = VibrationAnalysisEvents(df, signal_uuid="sensor:vibration")
indicators = vib.bearing_health_indicators(window="5m")
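
The three vibration indicators named above have simple definitions. The sketch below computes them for a single window with NumPy; it illustrates the formulas only, uses a hypothetical helper name, and reports non-excess kurtosis (a normal distribution gives 3), which may differ from the convention used by bearing_health_indicators.

```python
import numpy as np

def vibration_indicators(signal) -> dict:
    """RMS, crest factor, and non-excess kurtosis for one window of samples."""
    x = np.asarray(signal, dtype=float)
    rms = np.sqrt(np.mean(x ** 2))
    crest_factor = np.max(np.abs(x)) / rms  # peak relative to RMS; undefined for an all-zero window
    centered = x - x.mean()
    kurtosis = np.mean(centered ** 4) / np.mean(centered ** 2) ** 2
    return {"rms": rms, "crest_factor": crest_factor, "kurtosis": kurtosis}

# Ten full periods of a unit sine wave, evenly sampled
t = np.arange(1000) / 1000.0
window = np.sin(2 * np.pi * 10 * t)
ind = vibration_indicators(window)
# For a pure sine: rms ≈ 0.707, crest factor ≈ 1.414, kurtosis ≈ 1.5
```

Impulsive faults tend to raise crest factor and kurtosis before they raise RMS, which is why the three are usually reported together.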

Supply Chain Events

Monitor inventory, lead times, and demand patterns.

from ts_shape.events.supplychain.inventory_monitoring import InventoryMonitoringEvents
from ts_shape.events.supplychain.lead_time_analysis import LeadTimeAnalysisEvents
from ts_shape.events.supplychain.demand_pattern import DemandPatternEvents

# Inventory monitoring with stockout prediction
inv = InventoryMonitoringEvents(df, level_uuid="inventory:raw_material")
low_stock = inv.detect_low_stock(min_level=100, hold="30m")
prediction = inv.stockout_prediction(consumption_rate_window="4h")

# Lead time analysis
lt = LeadTimeAnalysisEvents(df)
lead_times = lt.calculate_lead_times(order_uuid="order:placed", delivery_uuid="order:delivered")
anomalies = lt.detect_lead_time_anomalies(order_uuid="order:placed", delivery_uuid="order:delivered")

# Demand patterns and seasonality
demand = DemandPatternEvents(df, demand_uuid="demand:daily")
spikes = demand.detect_demand_spikes(threshold_factor=2.0)
seasonal = demand.seasonality_summary(period="1D")
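
One way to read threshold_factor=2.0 is "flag demand that exceeds twice a typical level". The sketch below uses the series median as that typical level, which is an assumption made for illustration; the baseline that detect_demand_spikes actually uses is not specified here. The helper name and data are hypothetical.

```python
import pandas as pd

def demand_spikes(demand: pd.Series, threshold_factor: float = 2.0) -> pd.Series:
    """Return values above `threshold_factor` times the series median (assumed baseline)."""
    baseline = demand.median()
    return demand[demand > threshold_factor * baseline]

demand = pd.Series(
    [100.0, 110.0, 95.0, 260.0, 105.0, 98.0, 102.0],
    index=pd.date_range("2024-01-01", periods=7, freq="1D"),
)
spikes = demand_spikes(demand)
# Median is 102.0, so the cut-off is 204.0 and only the 260.0 reading is returned
```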

Loaders

Load data from multiple sources into the standard schema.

from ts_shape.loader.timeseries.parquet_loader import ParquetLoader
from ts_shape.loader.timeseries.azure_blob_loader import AzureBlobParquetLoader
from ts_shape.loader.metadata.metadata_json_loader import MetadataJsonLoader
from ts_shape.loader.combine.integrator import DataIntegratorHybrid

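# Load parquet files
df = ParquetLoader.load_all_files("/data/timeseries")

# Example time range (placeholder dates); assumes pandas is imported as pd
start, end = pd.Timestamp("2024-01-01"), pd.Timestamp("2024-01-02")
df_range = ParquetLoader.load_by_time_range("/data/timeseries", start, end)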

# Load metadata and combine
meta = MetadataJsonLoader.from_file("metadata.json")
combined = DataIntegratorHybrid.combine_data(timeseries_df=df, metadata_df=meta.to_df())
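
Conceptually, combining timeseries with metadata is a join on uuid. The sketch below shows that join with plain pandas so the resulting shape is visible; it is not the implementation of DataIntegratorHybrid, and the metadata columns label and unit are hypothetical.

```python
import pandas as pd

timeseries_df = pd.DataFrame({
    "systime": pd.to_datetime(
        ["2024-01-01 00:00", "2024-01-01 00:01", "2024-01-01 00:00"]
    ),
    "uuid": ["sensor:temp", "sensor:temp", "sensor:pressure"],
    "value_double": [21.5, 21.7, 1.02],
})

# Hypothetical metadata: one row per signal
metadata_df = pd.DataFrame({
    "uuid": ["sensor:temp", "sensor:pressure"],
    "label": ["Oven temperature", "Line pressure"],
    "unit": ["degC", "bar"],
})

# A left join keeps every timeseries row, even if a signal has no metadata;
# validate= raises if the metadata contains duplicate uuids
combined = timeseries_df.merge(
    metadata_df, on="uuid", how="left", validate="many_to_one"
)
```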

Features & Statistics

Extract statistical features and detect cycles.

from ts_shape.features.stats.numeric_stats import NumericStatistics
from ts_shape.features.time_stats.time_stats_numeric import TimeGroupedStatistics
from ts_shape.features.cycles.cycles_extractor import CycleExtractor

# Descriptive statistics
stats = NumericStatistics.summary_as_dict(df, "value_double")

# Time-windowed aggregations
tgs = TimeGroupedStatistics(df, value_column="value_double")
hourly = tgs.calculate_statistic(freq="1h", stat="mean")

# Cycle extraction (6 detection methods)
extractor = CycleExtractor(df, start_uuid="cycle:trigger")
cycles = extractor.process_persistent_cycle()
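
A time-windowed aggregation such as freq="1h", stat="mean" corresponds to resampling on the timestamp column. The sketch below shows the equivalent operation in plain pandas on made-up data; it is meant to clarify the result shape, not to mirror TimeGroupedStatistics internals.

```python
import pandas as pd

df = pd.DataFrame({
    "systime": pd.date_range("2024-01-01 00:00", periods=6, freq="30min"),
    "uuid": "sensor:temp",
    "value_double": [20.0, 22.0, 21.0, 23.0, 24.0, 26.0],
})

# One row per hour, each holding the mean of the readings in that hour
hourly_mean = df.set_index("systime")["value_double"].resample("1h").mean()
# 00:00 -> 21.0, 01:00 -> 22.0, 02:00 -> 25.0
```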

Development

# Clone and install in development mode
git clone https://github.com/jakobgabriel/ts-shape.git
cd ts-shape
pip install -e ".[dev]"

# Run tests
pytest tests/ -v

# Run tests with coverage
pytest tests/ --cov=ts_shape --cov-report=term-missing

# Build documentation
pip install -r requirements-docs.txt
mkdocs serve

CI/CD

The project uses GitHub Actions for continuous integration and deployment:

Workflow    Trigger                  Description
CI          Push / PR                Runs tests on Python 3.10, 3.11, 3.12
Release     Push to main / Tag v*    Build docs, deploy to GitHub Pages, publish to PyPI

Versioning is managed with setuptools-scm -- version numbers are derived automatically from git tags. To release:

git tag v0.2.0
git push origin v0.2.0
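
Since the version is derived from git tags rather than hard-coded, a convenient way to check which version is installed is to read the distribution metadata. The sketch below uses only the standard library; the helper name installed_version is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version

def installed_version(dist_name: str = "ts-shape"):
    """Return the installed version string of a distribution, or None if it is not installed."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None

print(installed_version())
```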

Project Structure

ts-shape/
├── src/ts_shape/           # Library source code
├── tests/                  # pytest test suite (100+ tests)
├── examples/               # Runnable demo scripts
├── docs/                   # MkDocs documentation
├── .github/workflows/      # CI/CD pipelines
├── pyproject.toml          # Package configuration + auto-versioning
├── setup.py                # Legacy setup (delegates to pyproject.toml)
├── requirements.txt        # Runtime dependencies
└── requirements-docs.txt   # Documentation dependencies

Contributing

Contributions are welcome! Please see docs/contributing.md for guidelines.

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/my-feature)
  3. Write tests for your changes
  4. Ensure all tests pass (pytest tests/ -v)
  5. Submit a pull request

License

MIT -- see LICENSE.txt.

