Incremental machine learning in Python — learn one observation at a time

incre-ml

Incremental machine learning in Python. Every algorithm processes one observation at a time — no batches, no retraining, no accumulated history.

from incre_ml.forecasting import HoltWinters

model = HoltWinters(season_length=24)

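# `stream` stands for any iterable of (x, y) pairs; x is a feature dict.
for x, y in stream:
    prediction = model.predict_one(x)  # predict before the target is seen
    model.learn_one(x, y)              # then update on the observed target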

Why incre-ml?

Most traditional ML libraries are built around batch training. When data arrives continuously (sensor telemetry, financial ticks, patient vitals, API logs), you need models that update on each new observation and can predict at any point in the stream. incre-ml covers this workflow in one package: forecasting, anomaly detection, classification, clustering, drift detection, uncertainty quantification, federated learning, and physics-informed constraints.

Core API contract — every model implements:

Method                   Purpose
model.learn_one(x, y)    Update on one observation
model.predict_one(x)     Predict from one observation
model.explain_one(x)     Feature contributions for one observation
model.clear()            Reset internal state

All models use x: dict[str, Any] for features — sparse, heterogeneous, schema-agnostic by design.
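
To make the contract concrete, here is a minimal sketch of a model that satisfies it. TinySGDRegressor is a hypothetical class written for this illustration and is not part of incre-ml; it is a linear model trained by one gradient step per observation, and it assumes that non-numeric feature values can simply be skipped.

```python
from __future__ import annotations

from typing import Any


class TinySGDRegressor:
    """Hypothetical example of the four-method contract; not part of incre-ml."""

    def __init__(self, lr: float = 0.1) -> None:
        self.lr = lr
        self.weights: dict[str, float] = {}
        self.bias = 0.0

    @staticmethod
    def _numeric(x: dict[str, Any]) -> dict[str, float]:
        # Assumption of this sketch: non-numeric values are skipped.
        return {
            k: float(v)
            for k, v in x.items()
            if isinstance(v, (int, float)) and not isinstance(v, bool)
        }

    def predict_one(self, x: dict[str, Any]) -> float:
        feats = self._numeric(x)
        return self.bias + sum(self.weights.get(k, 0.0) * v for k, v in feats.items())

    def learn_one(self, x: dict[str, Any], y: float) -> TinySGDRegressor:
        # One stochastic-gradient step on the squared error of this observation.
        error = self.predict_one(x) - y
        for k, v in self._numeric(x).items():
            self.weights[k] = self.weights.get(k, 0.0) - self.lr * error * v
        self.bias -= self.lr * error
        return self

    def explain_one(self, x: dict[str, Any]) -> dict[str, float]:
        # For a linear model, a feature's contribution is weight * value.
        return {k: self.weights.get(k, 0.0) * v for k, v in self._numeric(x).items()}

    def clear(self) -> None:
        self.weights.clear()
        self.bias = 0.0


model = TinySGDRegressor(lr=0.1)
x = {"temperature": 1.0, "sensor_id": "A7"}  # mixed value types in one dict
model.learn_one(x, 3.0).learn_one(x, 3.0)
print(model.predict_one(x), model.explain_one(x))
```

Because features are looked up by name, an observation may omit keys or add new ones without any schema change.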

Installation

pip install incre-ml

With optional connectors:

pip install "incre-ml[kafka]"       # Confluent Kafka
pip install "incre-ml[mqtt]"        # MQTT / IoT
pip install "incre-ml[connectors]"  # All connectors
pip install "incre-ml[dashboard]"   # Streamlit demo app

Capabilities

Forecasting

8 forecasters: Naive, Holt-Winters, AR/SNARIMAX, Kalman Filter, RLS, Croston/TSB, Bootstrapped ensembles, and model selection.

from incre_ml.forecasting import BootstrappedRegressor, HoltWinters

model = BootstrappedRegressor(HoltWinters(season_length=24), n_models=5)

pred, uncertainty = model.predict_with_uncertainty(x)
model.learn_one(x, y)
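
One common way to build a bootstrapped ensemble on a stream is online bagging: each ensemble member sees every observation k times, with k drawn from a Poisson(1) distribution, and the spread of the members' predictions serves as an uncertainty estimate. Whether BootstrappedRegressor works exactly this way is an assumption. The sketch below uses hypothetical classes (RunningMean, OnlineBagging) and a simplified predict_with_uncertainty() that takes no features.

```python
import math
import random
import statistics


class RunningMean:
    """Hypothetical base model: predicts the mean of the targets it has seen."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0

    def learn_one(self, y, weight=1):
        # Seeing an observation `weight` times emulates bootstrap resampling.
        for _ in range(weight):
            self.n += 1
            self.mean += (y - self.mean) / self.n
        return self

    def predict_one(self):
        return self.mean


def poisson1(rng):
    """Draw from Poisson(1) using Knuth's multiplication method."""
    limit = math.exp(-1.0)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


class OnlineBagging:
    """Hypothetical ensemble; not the incre-ml implementation."""

    def __init__(self, n_models=5, seed=0):
        self.models = [RunningMean() for _ in range(n_models)]
        self.rng = random.Random(seed)

    def learn_one(self, y):
        for member in self.models:
            member.learn_one(y, weight=poisson1(self.rng))
        return self

    def predict_with_uncertainty(self):
        preds = [member.predict_one() for member in self.models]
        return statistics.fmean(preds), statistics.pstdev(preds)


ensemble = OnlineBagging(n_models=5, seed=42)
noise = random.Random(7)
for _ in range(200):
    ensemble.learn_one(10.0 + noise.gauss(0.0, 2.0))
pred, uncertainty = ensemble.predict_with_uncertainty()
print(pred, uncertainty)
```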

Anomaly Detection

Statistical (Z-score), geometric (Half-Space Trees), and predictive detectors — composable via weighted ensemble voting. Includes CUSUM and EWMA industrial detectors.

from incre_ml.anomaly import AnomalyEnsemble, ZScoreDetector, PredictiveAnomalyDetector
from incre_ml.forecasting import HoltWinters

ensemble = AnomalyEnsemble({
    "stat": ZScoreDetector(feature_name="temperature"),
    "pred": PredictiveAnomalyDetector(
        model=HoltWinters(season_length=96),
        feature_name="temperature",
    ),
})

score = ensemble.score_one({"temperature": 95.2})  # 0.0 (normal) to 1.0 (anomalous)
ensemble.learn_one({"temperature": 95.2})
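
As a rough illustration of the CUSUM detector mentioned above, the following sketch implements the textbook two-sided CUSUM: it accumulates deviations from a target value beyond a slack margin and raises an alarm once either sum exceeds a threshold. The class name Cusum and its parameters are hypothetical and do not reflect the incre-ml API.

```python
class Cusum:
    """Two-sided tabular CUSUM around a fixed target (illustrative only)."""

    def __init__(self, target, slack, threshold):
        self.target = target        # in-control process mean
        self.slack = slack          # deviation tolerated without accumulating
        self.threshold = threshold  # alarm level for either cumulative sum
        self.pos = 0.0              # evidence for an upward shift
        self.neg = 0.0              # evidence for a downward shift

    def update(self, value):
        self.pos = max(0.0, self.pos + value - self.target - self.slack)
        self.neg = max(0.0, self.neg + self.target - value - self.slack)
        return self.pos > self.threshold or self.neg > self.threshold


detector = Cusum(target=20.0, slack=0.5, threshold=4.0)
readings = [20.0, 20.2, 19.9, 22.0, 22.0, 22.0]
alarms = [detector.update(r) for r in readings]
print(alarms.index(True))  # 5: the third shifted reading lifts the sum to 4.5
```

Small fluctuations never accumulate because of the slack term, while a sustained shift of 2 degrees triggers after three readings.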

Streaming Classification

Hoeffding Tree, Logistic Regression, Naive Bayes, SGD, Adaptive Random Forest, and Windowed KNN — all incremental.

from incre_ml.classification import HoeffdingTreeClassifier

clf = HoeffdingTreeClassifier(grace_period=10)

proba = clf.predict_proba_one(x)    # class probabilities
explanation = clf.explain_one(x)     # feature contributions
clf.learn_one(x, y)
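
Hoeffding trees take their name from the Hoeffding bound, which tells a leaf when it has seen enough observations to commit to a split: with probability 1 - delta, the true mean of a variable with range R lies within epsilon = sqrt(R^2 ln(1/delta) / (2n)) of the mean observed over n samples. The sketch below shows that split test in isolation. The function names are hypothetical, and the reading of grace_period as "observations a leaf collects between split checks" follows common Hoeffding tree implementations rather than anything confirmed for incre-ml.

```python
import math


def hoeffding_bound(value_range, delta, n):
    """Epsilon such that the observed mean is within epsilon of the true mean."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain, second_gain, value_range, delta, n):
    """Split once the best attribute beats the runner-up by more than epsilon."""
    return (best_gain - second_gain) > hoeffding_bound(value_range, delta, n)


# With few observations the bound is wide, so the leaf waits.
print(should_split(0.30, 0.05, value_range=1.0, delta=1e-7, n=50))
# With more observations the same gain gap becomes decisive.
print(should_split(0.30, 0.05, value_range=1.0, delta=1e-7, n=200))
```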

Pipelines

Chain transformers and predictors into unified streaming workflows.

from incre_ml.compose import Pipeline
from incre_ml.preprocessing import StandardScaler, SelectKBest

# `model` stands for any predictor created earlier
pipe = Pipeline([StandardScaler(), SelectKBest(k=5), model])
pipe.learn_one(x, y)
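
The idea behind a streaming pipeline can be sketched in a few lines: on learn_one, each transformer first updates its own statistics and then passes the transformed features on; on predict_one, features are only transformed. All classes below (MeanCenterer, TargetMean, MiniPipeline) are hypothetical stand-ins, and the update-then-transform order is an assumption, not a statement about how incre-ml's Pipeline is implemented.

```python
class MeanCenterer:
    """Hypothetical transformer: subtracts each feature's running mean."""

    def __init__(self):
        self.count = {}
        self.mean = {}

    def learn_one(self, x):
        for key, value in x.items():
            n = self.count.get(key, 0) + 1
            m = self.mean.get(key, 0.0)
            self.count[key] = n
            self.mean[key] = m + (value - m) / n
        return self

    def transform_one(self, x):
        return {key: value - self.mean.get(key, 0.0) for key, value in x.items()}


class TargetMean:
    """Hypothetical final step: predicts the running mean of the targets."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0

    def learn_one(self, x, y):
        self.n += 1
        self.mean += (y - self.mean) / self.n
        return self

    def predict_one(self, x):
        return self.mean


class MiniPipeline:
    def __init__(self, steps):
        *self.transformers, self.final = steps

    def _transform(self, x, learn):
        for step in self.transformers:
            if learn:
                step.learn_one(x)  # update statistics on the incoming features
            x = step.transform_one(x)
        return x

    def learn_one(self, x, y):
        self.final.learn_one(self._transform(x, learn=True), y)
        return self

    def predict_one(self, x):
        return self.final.predict_one(self._transform(x, learn=False))


pipe = MiniPipeline([MeanCenterer(), TargetMean()])
pipe.learn_one({"t": 10.0}, 1.0).learn_one({"t": 20.0}, 3.0)
print(pipe.predict_one({"t": 15.0}))
```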

Drift Detection

ADWIN (exponential histogram, O(log n) memory), statistical detectors, and DriftAdaptiveWrapper for automatic model adaptation via reset, decay, or replacement strategies.
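
The reset strategy can be illustrated with a simpler detector than ADWIN. The sketch below uses the Page-Hinkley test, a classic statistical change detector, to monitor a model's absolute error and clears the model when the error level shifts upward. PageHinkley, TargetMean, and ResetOnDrift are hypothetical names for this example; the actual DriftAdaptiveWrapper interface is not shown in this README and may differ.

```python
class PageHinkley:
    """Page-Hinkley test for an upward shift in the mean of a stream."""

    def __init__(self, delta=0.1, threshold=3.0):
        self.delta = delta          # magnitude of change that is tolerated
        self.threshold = threshold  # alarm level
        self.clear()

    def clear(self):
        self.n = 0
        self.mean = 0.0
        self.cum = 0.0      # cumulative deviation from the running mean
        self.cum_min = 0.0  # smallest cumulative deviation seen so far

    def update(self, value):
        self.n += 1
        self.mean += (value - self.mean) / self.n
        self.cum += value - self.mean - self.delta
        self.cum_min = min(self.cum_min, self.cum)
        return self.cum - self.cum_min > self.threshold


class TargetMean:
    """Hypothetical model: predicts the running mean of the targets."""

    def __init__(self):
        self.clear()

    def clear(self):
        self.n = 0
        self.mean = 0.0

    def learn_one(self, y):
        self.n += 1
        self.mean += (y - self.mean) / self.n
        return self

    def predict_one(self):
        return self.mean


class ResetOnDrift:
    """Clears the wrapped model whenever its error stream drifts upward."""

    def __init__(self, model, detector):
        self.model = model
        self.detector = detector
        self.resets = 0

    def learn_one(self, y):
        error = abs(y - self.model.predict_one())
        if self.detector.update(error):
            self.model.clear()
            self.detector.clear()
            self.resets += 1
        self.model.learn_one(y)
        return self

    def predict_one(self):
        return self.model.predict_one()


wrapped = ResetOnDrift(TargetMean(), PageHinkley())
for y in [0.0] * 50 + [10.0] * 50:
    wrapped.learn_one(y)
print(wrapped.resets, wrapped.predict_one())
```

Without the reset, the running mean would settle at 5.0, halfway between the two regimes; with it, the model forgets the old regime as soon as the error jumps.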

Federated Learning

FederatedEnsemble trains local models per site/region and aggregates via averaging or median — without centralizing raw data.

from incre_ml.federated import FederatedEnsemble
from incre_ml.linear import LinearRegression

fed = FederatedEnsemble(LinearRegression(), ["site_a", "site_b", "site_c"])

fed.learn_one("site_a", x, y)
global_pred = fed.predict_global(x)
fed.sync()  # aggregate local models
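
The difference between the two aggregation rules is easiest to see on model parameters directly. The sketch below combines per-site linear weights by mean or median; only the weights leave each site, never the raw observations. The aggregate function and the weight dictionaries are hypothetical, and how FederatedEnsemble represents local models internally is not specified here.

```python
from statistics import fmean, median


def aggregate(local_weights, how="mean"):
    """Combine per-site weight dicts feature by feature.

    Assumption of this sketch: a feature missing at a site counts as 0.0.
    """
    combine = {"mean": fmean, "median": median}[how]
    sites = list(local_weights.values())
    features = sorted({name for weights in sites for name in weights})
    return {f: combine([w.get(f, 0.0) for w in sites]) for f in features}


local = {
    "site_a": {"temp": 1.0, "load": 0.5},
    "site_b": {"temp": 1.2, "load": 0.7},
    "site_c": {"temp": 9.8, "load": 0.6},  # one site with an outlying weight
}

print(aggregate(local, how="mean"))
print(aggregate(local, how="median"))
```

The mean is pulled toward the outlying site, while the median ignores it, which is the usual argument for median aggregation when some sites are noisy or faulty.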

Physics-Informed Constraints

Wrap any regressor with domain constraints to prevent physically implausible predictions.

from incre_ml.physics.thermal import NewtonCoolingConstraint
from incre_ml.base.physics import PhysicsInformedWrapper

guard = NewtonCoolingConstraint(k=0.05, ambient_temp=15.0, max_deviation=3.0)
# `model` stands for any regressor created earlier
safe_model = PhysicsInformedWrapper(model, guard)
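
One plausible reading of such a constraint, offered here as an assumption rather than a description of NewtonCoolingConstraint, is to compute the temperature that Newton's law of cooling predicts for the next step and clamp the model's output to within max_deviation of it. The function names below are hypothetical; the parameter names k, ambient_temp, and max_deviation mirror the constructor arguments above.

```python
import math


def newton_cooling_step(temp, ambient_temp, k, dt=1.0):
    """Exact solution of dT/dt = -k (T - T_ambient) over one step of length dt."""
    return ambient_temp + (temp - ambient_temp) * math.exp(-k * dt)


def constrain(prediction, last_temp, ambient_temp, k, max_deviation, dt=1.0):
    """Clamp a prediction to the physically expected value +/- max_deviation."""
    expected = newton_cooling_step(last_temp, ambient_temp, k, dt)
    low, high = expected - max_deviation, expected + max_deviation
    return min(max(prediction, low), high)


# An object at 35 degrees in a 15 degree room cannot plausibly jump to 50.
print(constrain(50.0, last_temp=35.0, ambient_temp=15.0, k=0.05, max_deviation=3.0))
# A prediction close to the physical expectation passes through unchanged.
print(constrain(34.5, last_temp=35.0, ambient_temp=15.0, k=0.05, max_deviation=3.0))
```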

Also Included

  • Clustering — OnlineKMeans, DBSTREAM (density-based stream clustering)
  • Uncertainty — Conformal prediction, adaptive conformal intervals (ACI), bootstrapped wrappers
  • Preprocessing — Welford's scalers, online feature selection, temporal features, encoders
  • Evaluation — Prequential (test-then-train) scoring protocol
  • Active Learning — Uncertainty sampling
  • Explainability — Per-prediction feature contributions
  • Metrics — Online regression and classification metrics
  • Simulation — Synthetic data generators for manufacturing, clinical, demand, finance, traffic, and building scenarios
  • Serving — Production serving utilities
  • I/O — CSV, Kafka, and MQTT connectors
  • Model Selection — Bandit-based AutoML for streaming
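
The prequential (test-then-train) protocol listed above is simple enough to sketch directly: every observation is first used to score the model and only afterwards to update it, so each prediction is made on data the model has not yet seen. The helper prequential_mae and the LastValue forecaster are hypothetical illustrations, not incre-ml functions.

```python
class LastValue:
    """Naive forecaster: predicts the most recent target (0.0 before any data)."""

    def __init__(self):
        self.last = 0.0

    def predict_one(self, x):
        return self.last

    def learn_one(self, x, y):
        self.last = y
        return self


def prequential_mae(model, stream):
    """Mean absolute error under the test-then-train protocol."""
    total, n = 0.0, 0
    for x, y in stream:
        y_pred = model.predict_one(x)  # 1. test on the unseen observation
        total += abs(y - y_pred)
        n += 1
        model.learn_one(x, y)          # 2. then train on it
    return total / n if n else 0.0


stream = [({"t": i}, y) for i, y in enumerate([1.0, 2.0, 4.0, 7.0])]
print(prequential_mae(LastValue(), stream))  # 1.75
```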

Interactive Dashboard

Explore all capabilities through 7 real-world scenarios with live streaming data:

pip install "incre-ml[dashboard]"
streamlit run app.py

Scenarios: Manufacturing quality, supply chain demand, sales anomaly monitoring, clinical triage, connected vehicle safety, API security monitoring, smart building energy.

Design Principles

  • Welford's algorithm everywhere — all statistics use O(1) memory incremental computation
  • Lazy state initialization — internal state created on first learn_one(), not __init__
  • Composition over inheritance — shallow hierarchies (1-2 levels), compose via Pipeline and ensembles
  • Strict typing: mypy --strict on all library code
  • Return self from learn_one() — enables method chaining
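
Two of these principles fit in one short sketch: Welford's algorithm keeps a running mean and variance in constant memory, and returning self from learn_one() allows calls to be chained. RunningStats is a hypothetical class for illustration, not an incre-ml export.

```python
class RunningStats:
    """Welford's online algorithm for mean and variance in O(1) memory."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # sum of squared deviations from the current mean

    def learn_one(self, value):
        self.n += 1
        delta = value - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (value - self.mean)  # uses the updated mean
        return self  # enables method chaining

    @property
    def variance(self):
        """Sample variance (0.0 until at least two observations are seen)."""
        return self.m2 / (self.n - 1) if self.n > 1 else 0.0


stats = RunningStats()
for value in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    stats.learn_one(value)
print(stats.mean, stats.variance)

# Chaining works because learn_one() returns the object itself.
print(RunningStats().learn_one(1.0).learn_one(3.0).mean)  # 2.0
```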

Development

git clone https://github.com/pespila/incre-ml.git
cd incre-ml
python -m venv venv && source venv/bin/activate
pip install -e ".[dev]"
pre-commit install
ruff check . && ruff format .   # lint + format
mypy src                        # strict type checking
pytest                          # tests with coverage

License

MIT — see LICENSE for details.
