Incremental machine learning in Python — learn one observation at a time
incre-ml
Incremental machine learning in Python. Every algorithm processes one observation at a time — no batches, no retraining, no accumulated history.
from incre_ml.forecasting import HoltWinters
model = HoltWinters(season_length=24)
for x, y in stream:
    prediction = model.predict_one(x)
    model.learn_one(x, y)
Why incre-ml?
Traditional ML libraries require batches. When data arrives continuously — sensor telemetry, financial ticks, patient vitals, API logs — you need models that update incrementally and predict instantly. incre-ml provides a complete ecosystem for this: forecasting, anomaly detection, classification, clustering, drift detection, uncertainty quantification, federated learning, and physics-informed constraints.
Core API contract — every model implements:
| Method | Purpose |
|---|---|
| model.learn_one(x, y) | Update on one observation |
| model.predict_one(x) | Predict from one observation |
| model.explain_one(x) | Feature contributions for one observation |
| model.clear() | Reset internal state |
All models use x: dict[str, Any] for features — sparse, heterogeneous, schema-agnostic by design.
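As an illustration of this contract, here is a minimal sketch of a model that exposes the four methods over dict features. `TinySGDRegressor` is a hypothetical class written for this example, not part of incre-ml, and the use of plain stochastic gradient descent on numeric feature values is an assumption.

```python
from typing import Any


class TinySGDRegressor:
    """Hypothetical online linear regressor (illustration only, not incre-ml code)."""

    def __init__(self, lr: float = 0.1) -> None:
        self.lr = lr
        self.weights: dict[str, float] = {}
        self.bias = 0.0

    def predict_one(self, x: dict[str, Any]) -> float:
        # Unseen features contribute nothing, so sparse inputs are fine.
        return self.bias + sum(self.weights.get(k, 0.0) * float(v) for k, v in x.items())

    def learn_one(self, x: dict[str, Any], y: float) -> "TinySGDRegressor":
        # One gradient step on the squared error of this single observation.
        error = self.predict_one(x) - y
        for k, v in x.items():
            self.weights[k] = self.weights.get(k, 0.0) - self.lr * error * float(v)
        self.bias -= self.lr * error
        return self

    def explain_one(self, x: dict[str, Any]) -> dict[str, float]:
        # For a linear model, each feature's contribution is weight * value.
        return {k: self.weights.get(k, 0.0) * float(v) for k, v in x.items()}

    def clear(self) -> None:
        self.weights.clear()
        self.bias = 0.0


model = TinySGDRegressor()
for i in range(3000):
    a = (i % 10) / 10
    model.learn_one({"a": a}, 2 * a + 1)  # noiseless target y = 2a + 1
```

After the loop, `model.predict_one({"a": 0.5})` should be close to 2.0, and `model.explain_one({"a": 0.5})` plus the bias adds up to that prediction.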
Installation
pip install incre-ml
With optional connectors:
pip install "incre-ml[kafka]" # Confluent Kafka
pip install "incre-ml[mqtt]" # MQTT / IoT
pip install "incre-ml[connectors]" # All connectors
pip install "incre-ml[dashboard]" # Streamlit demo app
Capabilities
Forecasting
8 forecasters: Naive, Holt-Winters, AR/SNARIMAX, Kalman Filter, RLS, Croston/TSB, Bootstrapped ensembles, and model selection.
from incre_ml.forecasting import BootstrappedRegressor, HoltWinters
model = BootstrappedRegressor(HoltWinters(season_length=24), n_models=5)
pred, uncertainty = model.predict_with_uncertainty(x)
model.learn_one(x, y)
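One common way to obtain this kind of uncertainty estimate is online bagging: each ensemble member sees every observation a Poisson(1)-distributed number of times, and the spread of the member predictions serves as the uncertainty. The sketch below illustrates the idea with hypothetical classes (`RunningMean`, `OnlineBagging`). Whether `BootstrappedRegressor` works this way internally is an assumption.

```python
import math
import random
import statistics


class RunningMean:
    """Hypothetical base model: predicts the mean of the targets seen so far."""

    def __init__(self) -> None:
        self.n = 0
        self.mean = 0.0

    def learn_one(self, y: float) -> "RunningMean":
        self.n += 1
        self.mean += (y - self.mean) / self.n
        return self

    def predict_one(self) -> float:
        return self.mean


def poisson1(rng: random.Random) -> int:
    """Draw from Poisson(1) with Knuth's multiplication method."""
    limit, k, p = math.exp(-1.0), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


class OnlineBagging:
    """Hypothetical bootstrapped ensemble (illustration only)."""

    def __init__(self, n_models: int = 5, seed: int = 0) -> None:
        self.members = [RunningMean() for _ in range(n_models)]
        self.rng = random.Random(seed)

    def learn_one(self, y: float) -> "OnlineBagging":
        for member in self.members:
            # Each member sees the observation k ~ Poisson(1) times.
            for _ in range(poisson1(self.rng)):
                member.learn_one(y)
        return self

    def predict_with_uncertainty(self) -> tuple[float, float]:
        preds = [m.predict_one() for m in self.members]
        return statistics.fmean(preds), statistics.pstdev(preds)


ensemble = OnlineBagging(n_models=5, seed=0)
for t in range(200):
    ensemble.learn_one(15.0 + 5.0 * math.sin(t / 10))
pred, uncertainty = ensemble.predict_with_uncertainty()
```

Because every member trains on a different resampling of the same stream, their disagreement shrinks as more data arrives.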
Anomaly Detection
Statistical (Z-score), geometric (Half-Space Trees), and predictive detectors — composable via weighted ensemble voting. Includes CUSUM and EWMA industrial detectors.
from incre_ml.anomaly import AnomalyEnsemble, ZScoreDetector, PredictiveAnomalyDetector
from incre_ml.forecasting import HoltWinters
ensemble = AnomalyEnsemble({
    "stat": ZScoreDetector(feature_name="temperature"),
    "pred": PredictiveAnomalyDetector(
        model=HoltWinters(season_length=96),
        feature_name="temperature",
    ),
})
score = ensemble.score_one({"temperature": 95.2}) # 0.0 (normal) to 1.0 (anomalous)
ensemble.learn_one({"temperature": 95.2})
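To make the CUSUM detector mentioned above concrete, here is a minimal sketch of a two-sided CUSUM, assuming a known in-control mean. The class name `Cusum` and its parameters (`target`, `slack`, `threshold`) are hypothetical and do not describe incre-ml's own detector.

```python
class Cusum:
    """Hypothetical two-sided CUSUM detector (illustration only)."""

    def __init__(self, target: float, slack: float = 0.5, threshold: float = 5.0) -> None:
        self.target = target        # expected in-control mean (assumed known)
        self.slack = slack          # deviations smaller than this are ignored
        self.threshold = threshold  # alarm level for the cumulative sums
        self.pos = 0.0
        self.neg = 0.0

    def update(self, value: float) -> bool:
        # Accumulate evidence of an upward and a downward shift separately.
        self.pos = max(0.0, self.pos + (value - self.target - self.slack))
        self.neg = max(0.0, self.neg + (self.target - value - self.slack))
        return self.pos > self.threshold or self.neg > self.threshold


detector = Cusum(target=0.0)
readings = [0.0] * 50 + [2.0] * 10  # mean shifts from 0 to 2 at index 50
alarms = [i for i, r in enumerate(readings) if detector.update(r)]
```

With these settings each shifted reading adds 1.5 to the upper sum, so the first alarm fires on the fourth shifted reading (index 53).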
Streaming Classification
Hoeffding Tree, Logistic Regression, Naive Bayes, SGD, Adaptive Random Forest, and Windowed KNN — all incremental.
from incre_ml.classification import HoeffdingTreeClassifier
clf = HoeffdingTreeClassifier(grace_period=10)
proba = clf.predict_proba_one(x) # class probabilities
explanation = clf.explain_one(x) # feature contributions
clf.learn_one(x, y)
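A Hoeffding tree decides when a leaf has seen enough data to split by comparing the gap between its two best split candidates against the Hoeffding bound. The helper below is a sketch of that bound only. The function name and parameters are hypothetical, and it says nothing about how `HoeffdingTreeClassifier` is implemented.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Return epsilon such that, with probability 1 - delta, the true mean of a
    variable with the given range lies within epsilon of its mean over n samples."""
    return math.sqrt(value_range**2 * math.log(1.0 / delta) / (2.0 * n))


# The bound tightens as a leaf accumulates observations.
eps_few = hoeffding_bound(value_range=1.0, delta=1e-7, n=10)
eps_many = hoeffding_bound(value_range=1.0, delta=1e-7, n=1000)
```

A split is typically accepted once the observed gain difference exceeds this epsilon, which is why more observations allow finer distinctions.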
Pipelines
Chain transformers and predictors into unified streaming workflows.
from incre_ml.compose import Pipeline
from incre_ml.preprocessing import StandardScaler, SelectKBest
pipe = Pipeline([StandardScaler(), SelectKBest(k=5), model])
pipe.learn_one(x, y)
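The chaining idea can be sketched without the library: each transformer first learns from the observation, then transforms it, and the final step receives the transformed features. All three classes below are hypothetical stand-ins, and the learn-then-transform order is an assumption rather than a statement about `incre_ml.compose.Pipeline`.

```python
class MinMaxScaler:
    """Hypothetical transformer: rescales each feature to [0, 1] over the range seen so far."""

    def __init__(self) -> None:
        self.lo: dict[str, float] = {}
        self.hi: dict[str, float] = {}

    def learn_one(self, x: dict[str, float]) -> "MinMaxScaler":
        for k, v in x.items():
            self.lo[k] = min(self.lo.get(k, v), v)
            self.hi[k] = max(self.hi.get(k, v), v)
        return self

    def transform_one(self, x: dict[str, float]) -> dict[str, float]:
        out = {}
        for k, v in x.items():
            lo, hi = self.lo.get(k, v), self.hi.get(k, v)
            out[k] = (v - lo) / (hi - lo) if hi > lo else 0.0
        return out


class MeanRegressor:
    """Hypothetical final step: predicts the running mean of y and records its last input."""

    def __init__(self) -> None:
        self.n, self.mean, self.last_x = 0, 0.0, {}

    def learn_one(self, x: dict[str, float], y: float) -> "MeanRegressor":
        self.last_x = x
        self.n += 1
        self.mean += (y - self.mean) / self.n
        return self

    def predict_one(self, x: dict[str, float]) -> float:
        return self.mean


class SketchPipeline:
    """Hypothetical pipeline: transformers first, predictor last."""

    def __init__(self, steps: list) -> None:
        self.transformers = list(steps[:-1])
        self.model = steps[-1]

    def _transform(self, x: dict[str, float], learn: bool) -> dict[str, float]:
        for t in self.transformers:
            if learn:
                t.learn_one(x)  # update the transformer before applying it
            x = t.transform_one(x)
        return x

    def learn_one(self, x: dict[str, float], y: float) -> "SketchPipeline":
        self.model.learn_one(self._transform(x, learn=True), y)
        return self

    def predict_one(self, x: dict[str, float]) -> float:
        return self.model.predict_one(self._transform(x, learn=False))


pipe = SketchPipeline([MinMaxScaler(), MeanRegressor()])
pipe.learn_one({"t": 10.0}, 1.0).learn_one({"t": 20.0}, 3.0)
```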
Drift Detection
ADWIN (exponential histogram, O(log n) memory), statistical detectors, and DriftAdaptiveWrapper for automatic model adaptation via reset, decay, or replacement strategies.
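ADWIN itself is too long to reproduce here, so the sketch below uses a simpler statistical detector, the Page-Hinkley test, to show what an incremental drift detector looks like. It is a different technique from ADWIN, the class is hypothetical, and it only watches for upward shifts in the mean.

```python
class PageHinkley:
    """Page-Hinkley test for an upward shift in the mean (illustration only)."""

    def __init__(self, delta: float = 0.005, threshold: float = 5.0) -> None:
        self.delta = delta          # tolerated drift magnitude
        self.threshold = threshold  # alarm level (often called lambda)
        self.n = 0
        self.mean = 0.0
        self.cumulative = 0.0
        self.minimum = 0.0

    def update(self, value: float) -> bool:
        self.n += 1
        self.mean += (value - self.mean) / self.n
        # Cumulative deviation from the running mean, minus the tolerance.
        self.cumulative += value - self.mean - self.delta
        self.minimum = min(self.minimum, self.cumulative)
        return self.cumulative - self.minimum > self.threshold


detector = PageHinkley()
stream = [0.0] * 100 + [1.0] * 50  # the mean jumps at index 100
first_alarm = next(i for i, v in enumerate(stream) if detector.update(v))
```

A wrapper that reacts to drift would call `update` on each prediction error and reset, decay, or replace the wrapped model when it returns `True`.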
Federated Learning
FederatedEnsemble trains local models per site/region and aggregates via averaging or median — without centralizing raw data.
from incre_ml.federated import FederatedEnsemble
from incre_ml.linear import LinearRegression
fed = FederatedEnsemble(LinearRegression(), ["site_a", "site_b", "site_c"])
fed.learn_one("site_a", x, y)
global_pred = fed.predict_global(x)
fed.sync() # aggregate local models
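The aggregation step can be illustrated with plain parameter dictionaries. The function below is a sketch under stated assumptions: `aggregate` is a hypothetical helper, each site exposes its model as a flat dict of named parameters, and a parameter missing at a site counts as 0.0. It does not describe what `fed.sync()` does internally.

```python
import statistics


def aggregate(site_params: dict[str, dict[str, float]], how: str = "mean") -> dict[str, float]:
    """Combine per-site parameter dicts into one global parameter dict."""
    if how not in ("mean", "median"):
        raise ValueError(f"unknown aggregation: {how!r}")
    combine = statistics.fmean if how == "mean" else statistics.median
    keys = sorted({k for params in site_params.values() for k in params})
    # A parameter missing at a site is treated as 0.0 (an assumption of this sketch).
    return {k: combine([p.get(k, 0.0) for p in site_params.values()]) for k in keys}


local = {
    "site_a": {"w": 1.0, "b": 0.0},
    "site_b": {"w": 2.0, "b": 0.5},
    "site_c": {"w": 9.0, "b": 1.0},
}
global_mean = aggregate(local, how="mean")
global_median = aggregate(local, how="median")
```

Only the parameters cross site boundaries, never the raw observations. The median is less sensitive to one outlying site, such as `site_c` here.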
Physics-Informed Constraints
Wrap any regressor with domain constraints to prevent physically implausible predictions.
from incre_ml.physics.thermal import NewtonCoolingConstraint
from incre_ml.base.physics import PhysicsInformedWrapper
guard = NewtonCoolingConstraint(k=0.05, ambient_temp=15.0, max_deviation=3.0)
safe_model = PhysicsInformedWrapper(model, guard)
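As a rough picture of what such a guard might do, the sketch below clamps a raw prediction to a band around the temperature expected from Newton's law of cooling. The parameter names `k`, `ambient_temp`, and `max_deviation` come from the snippet above. The function itself, the `last_temp` and `dt` arguments, and the clamping rule are assumptions, not the library's implementation.

```python
import math


def newton_cooling_guard(
    prediction: float,
    last_temp: float,
    k: float = 0.05,
    ambient_temp: float = 15.0,
    max_deviation: float = 3.0,
    dt: float = 1.0,
) -> float:
    """Clamp a predicted temperature to within max_deviation of the physical expectation."""
    # Newton's law of cooling: T(t + dt) = T_amb + (T(t) - T_amb) * exp(-k * dt)
    expected = ambient_temp + (last_temp - ambient_temp) * math.exp(-k * dt)
    return min(max(prediction, expected - max_deviation), expected + max_deviation)


plausible = newton_cooling_guard(16.0, last_temp=15.0)  # within the band, unchanged
clamped = newton_cooling_guard(30.0, last_temp=15.0)    # pulled down to the upper limit
```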
Also Included
- Clustering — OnlineKMeans, DBSTREAM (density-based stream clustering)
- Uncertainty — Conformal prediction, adaptive conformal intervals (ACI), bootstrapped wrappers
- Preprocessing — Welford's scalers, online feature selection, temporal features, encoders
- Evaluation — Prequential (test-then-train) scoring protocol
- Active Learning — Uncertainty sampling
- Explainability — Per-prediction feature contributions
- Metrics — Online regression and classification metrics
- Simulation — Synthetic data generators for manufacturing, clinical, demand, finance, traffic, building, and plant workforce scenarios
- Serving — Production serving utilities
- I/O — CSV, Kafka, and MQTT connectors
- Model Selection — Bandit-based AutoML for streaming
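The prequential (test-then-train) protocol listed above fits in a few lines: every observation is first used to score the model and only then to update it, so each prediction is made on unseen data. `MeanRegressor` and `prequential_mae` are hypothetical names for this sketch, not incre-ml functions.

```python
class MeanRegressor:
    """Hypothetical model: predicts the running mean of y (0.0 before any data)."""

    def __init__(self) -> None:
        self.n = 0
        self.mean = 0.0

    def predict_one(self, x: dict) -> float:
        return self.mean

    def learn_one(self, x: dict, y: float) -> "MeanRegressor":
        self.n += 1
        self.mean += (y - self.mean) / self.n
        return self


def prequential_mae(model, stream) -> float:
    """Mean absolute error under test-then-train evaluation."""
    total_error, n = 0.0, 0
    for x, y in stream:
        prediction = model.predict_one(x)  # test first
        total_error += abs(y - prediction)
        n += 1
        model.learn_one(x, y)              # then train
    return total_error / n if n else 0.0


mae = prequential_mae(MeanRegressor(), [({}, 1.0), ({}, 3.0), ({}, 5.0)])
```

Here the predictions are 0.0, 1.0, and 2.0 against targets 1.0, 3.0, and 5.0, giving a mean absolute error of 2.0.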
Interactive Dashboard
Explore all capabilities through 8 real-world scenarios with live streaming data:
pip install "incre-ml[dashboard]"
streamlit run app.py
Scenarios: Manufacturing quality, supply chain demand, sales anomaly monitoring, clinical triage, connected vehicle safety, API security monitoring, smart building energy, and plant workforce orchestration.
Plant Workforce Orchestration
The workforce scenario demonstrates closed-loop control for a multi-line manufacturing plant. When a production line degrades or goes down, the system detects the failure via online anomaly detection (CUSUM + Z-Score), selects a labor reallocation strategy using an epsilon-greedy bandit that learns from throughput-vs-cost rewards, and redistributes workers across remaining lines — all incrementally, one 15-minute tick at a time. Rush orders, quality crises, and sudden equipment failures trigger real-time priority shifts with cost impact tracking.
from incre_ml.simulation.generators import PlantWorkforceGenerator
gen = PlantWorkforceGenerator(total_steps=1200, total_workers=48)
for obs in gen:
    # obs contains per-line health, throughput, quality, crew, status
    # plus plant-level shift, rush orders, and worker pool
    print(obs["A_health"], obs["B_status"], obs["rush_order"])
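The strategy-selection step described above can be sketched as a standard epsilon-greedy bandit with incrementally updated mean rewards. The class, the strategy names, and the reward values below are made up for illustration and are not taken from the scenario's code.

```python
import random


class EpsilonGreedy:
    """Hypothetical epsilon-greedy bandit over a fixed set of strategies."""

    def __init__(self, arms: list[str], epsilon: float = 0.1, seed: int = 0) -> None:
        self.arms = arms
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = {arm: 0 for arm in arms}
        self.values = {arm: 0.0 for arm in arms}  # running mean reward per arm

    def select(self) -> str:
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.arms)                # explore
        return max(self.arms, key=lambda a: self.values[a])  # exploit

    def update(self, arm: str, reward: float) -> None:
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


# Made-up rewards standing in for "throughput minus cost" of each strategy.
true_reward = {"hold": 0.1, "rebalance": 0.9}
bandit = EpsilonGreedy(list(true_reward), epsilon=0.1, seed=0)
for _ in range(1000):
    arm = bandit.select()
    bandit.update(arm, true_reward[arm])
```

Once exploration has tried the better strategy, the greedy branch selects it on most ticks while the epsilon branch keeps sampling the alternative.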
Design Principles
- Welford's algorithm everywhere — all statistics use O(1) memory incremental computation
- Lazy state initialization — internal state created on first learn_one(), not __init__
- Composition over inheritance — shallow hierarchies (1-2 levels), compose via Pipeline and ensembles
- Strict typing — mypy --strict on all library code
- Return self from learn_one() — enables method chaining
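Three of these principles (Welford's algorithm, lazy state initialization, and returning self) can be seen together in one small sketch. `RunningStats` is a hypothetical class written for this illustration and is not taken from the library.

```python
class RunningStats:
    """Hypothetical per-feature mean/variance tracker using Welford's algorithm."""

    def __init__(self) -> None:
        # Lazy initialization: no per-feature state exists until learn_one() sees data.
        self._state: dict[str, tuple[int, float, float]] = {}

    def learn_one(self, x: dict[str, float]) -> "RunningStats":
        for name, value in x.items():
            n, mean, m2 = self._state.get(name, (0, 0.0, 0.0))
            n += 1
            delta = value - mean
            mean += delta / n
            m2 += delta * (value - mean)  # second factor uses the updated mean
            self._state[name] = (n, mean, m2)
        return self  # returning self enables method chaining

    def mean(self, name: str) -> float:
        return self._state[name][1]

    def variance(self, name: str) -> float:
        # Sample variance; defined as 0.0 until two observations have arrived.
        n, _, m2 = self._state[name]
        return m2 / (n - 1) if n > 1 else 0.0


stats = RunningStats()
for v in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    stats.learn_one({"temperature": v})

chained_mean = RunningStats().learn_one({"a": 1.0}).learn_one({"a": 3.0}).mean("a")
```

Only three numbers per feature are stored, regardless of how many observations have been processed.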
Development
git clone https://github.com/pespila/incre-ml.git
cd incre-ml
python -m venv venv && source venv/bin/activate
pip install -e ".[dev]"
pre-commit install
ruff check . && ruff format . # lint + format
mypy src # strict type checking
pytest # tests with coverage
License
MIT — see LICENSE for details.
File details
Details for the file incre_ml-0.2.0.tar.gz.
File metadata
- Download URL: incre_ml-0.2.0.tar.gz
- Upload date:
- Size: 107.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 3238d692ff8961bc649571f94a21bbf0b342fbc1f5cad01d5715b8d15bea48a3 |
| MD5 | d2644a61858989261e18df93202f7c2a |
| BLAKE2b-256 | 401d1792629872051f6598ad95a17a056b8e4e7ae98ccbea451dd7976b504807 |
File details
Details for the file incre_ml-0.2.0-py3-none-any.whl.
File metadata
- Download URL: incre_ml-0.2.0-py3-none-any.whl
- Upload date:
- Size: 111.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 038d4abb1900990e5e400d04142a7e38af233211d117fb4e7ee0d00cc7ff651e |
| MD5 | 3c0ef4217592a9e8837af73bfa4c7e86 |
| BLAKE2b-256 | 979295b03c849b08033ffa7b70aba4b8844d0a69d4a136fc0cd64174633ca713 |