OpenMOA: Efficient Machine Learning for Data Streams in Python (Fork of CapyMOA).

OpenMOA

A unified Python library for Utilitarian Online Learning in dynamic feature spaces.

Documentation · Tutorials · Discord · Report an Issue


What is OpenMOA?

Real-world data streams rarely stay the same. Sensors fail, features appear and disappear, and the world keeps changing — yet most online learning libraries assume a fixed feature space. OpenMOA is built for the messy reality.

OpenMOA is a Python library for Utilitarian Online Learning (UOL): online learning under dynamic, evolving feature spaces. It provides:

  • 10 state-of-the-art UOL algorithms — from sparse linear models to deep graph neural networks
  • 5 stream wrappers — simulate every major type of feature-space evolution
  • Full integration with CapyMOA — access 30+ classic stream learners, 12 drift detectors, and 40+ datasets out of the box
  • Clean, consistent API — train, predict, and evaluate with the same interface across all algorithms

⚠️ Early Development: OpenMOA is actively developed and the API may change before v1.0.0. If you run into issues, please open a GitHub Issue or reach out on Discord.


Installation

OpenMOA requires Java (for the MOA backend) and PyTorch (for the deep learning algorithms).

# 1. Check Java is installed
java -version

# 2. Install PyTorch (CPU version)
pip install torch torchvision --index-url https://download.pytorch.org/whl/cpu

# 3. Install OpenMOA
pip install openmoa

# 4. Verify
python -c "import openmoa; print(openmoa.__version__)"

Having trouble? See the full installation guide for GPU support and platform-specific instructions.


Quick Start

from openmoa.datasets import Electricity
from openmoa.classifier import AdaptiveRandomForestClassifier
from openmoa.evaluation import prequential_evaluation

stream = Electricity()
learner = AdaptiveRandomForestClassifier(schema=stream.get_schema())

results = prequential_evaluation(stream, learner, window_size=1000)

print(f"Accuracy:  {results.cumulative.accuracy():.2f}%")
print(f"Kappa:     {results.cumulative.kappa():.4f}")
print(f"Wall time: {results.wallclock():.2f}s")

Running a UOL algorithm on an evolving feature stream is just as straightforward:

from openmoa.stream import OpenFeatureStream
from openmoa.datasets import Electricity
from openmoa.classifier import OASFClassifier
from openmoa.evaluation import prequential_evaluation

# Simulate a pyramid feature-space evolution
stream = OpenFeatureStream(
    base_stream=Electricity(),
    evolution_pattern="pyramid",
    d_min=2,
    d_max=8,
    total_instances=45312,
)

learner = OASFClassifier(schema=stream.get_schema())
results = prequential_evaluation(stream, learner)
print(f"Accuracy: {results.cumulative.accuracy():.2f}%")
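
Both snippets rely on prequential (test-then-train) evaluation: each instance is first used to test the model and only then to train it. The following is a minimal, library-independent sketch of that loop. `MajorityClassLearner` and `prequential_accuracy` are hypothetical names used for illustration and are not part of OpenMOA's API.

```python
from collections import Counter


class MajorityClassLearner:
    """Toy learner (hypothetical): predicts the most frequent label seen so far."""

    def __init__(self):
        self.counts = Counter()

    def predict(self, x):
        # No prediction is possible before the first training example.
        if not self.counts:
            return None
        return self.counts.most_common(1)[0][0]

    def train(self, x, y):
        self.counts[y] += 1


def prequential_accuracy(stream, learner):
    """Test-then-train: score each instance before the learner sees its label."""
    correct = 0
    total = 0
    for x, y in stream:
        if learner.predict(x) == y:   # 1. test on the unseen instance
            correct += 1
        learner.train(x, y)           # 2. then train on it
        total += 1
    return 100.0 * correct / total if total else 0.0


toy_stream = [([0.1], "up"), ([0.2], "up"), ([0.3], "down"), ([0.4], "up")]
accuracy = prequential_accuracy(toy_stream, MajorityClassLearner())
print(f"Accuracy: {accuracy:.2f}%")
```

Because every prediction is made before the label is revealed, the cumulative accuracy reflects performance on unseen data without a separate hold-out set.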

UOL Algorithms

OpenMOA introduces 10 original algorithms designed specifically for dynamic feature spaces. Each handles the challenge of features appearing, disappearing, or changing over time.

| Algorithm | Type | Task | Key Idea |
|---|---|---|---|
| FESL | Linear ensemble | Classification / Regression | Detects feature-space shifts via Jaccard distance; learns a mapping matrix between old and new spaces |
| OASF | Sparse linear | Classification / Regression | Passive-Aggressive updates with L₁,₂-norm group sparsity on a ring-buffer weight matrix |
| RSOL | Sparse linear | Classification | Robust variant of OASF with stronger sparsity and larger sliding window |
| FOBOS | Proximal SGD | Classification | Forward-Backward Splitting with L1/L2/elastic-net regularization and adaptive step decay |
| FTRL | Adaptive linear | Classification | Follow-the-Regularized-Leader with per-coordinate learning rates and L1 sparsification |
| OVFM | Copula + SGD | Classification | Gaussian Copula EM for mixed continuous/ordinal data; dual observed+latent classifier ensemble |
| OSLMF | Semi-supervised | Classification | Extends OVFM with Density-Peak Clustering for label propagation on unlabeled instances |
| ORF3V | Stump forests | Classification | Per-feature decision stump forests with Hoeffding-bound pruning; naturally handles feature appearance/disappearance |
| OLD3S | VAE + HBP | Classification | Lifelong learning via VAE feature extraction and Hedge Backpropagation MLP; knowledge distillation during transitions |
| OWSS | Graph neural net | Classification | Bipartite instance-feature GNN with learnable feature embeddings and reconstruction alignment loss |

from openmoa.classifier import (
    FESLClassifier, OASFClassifier, RSOLClassifier,
    FOBOSClassifier, FTRLClassifier, OVFMClassifier,
    OSLMFClassifier, ORF3VClassifier, OLD3SClassifier, OWSSClassifier,
)
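
The FESL entry mentions detecting feature-space shifts via Jaccard distance. As a rough illustration of that idea only, the sketch below compares the sets of active feature indices at two points in a stream. The function name `jaccard_distance` and the threshold value are assumptions made for this example and do not reflect FESL's actual implementation.

```python
def jaccard_distance(old_features, new_features):
    """Return 1 - |A ∩ B| / |A ∪ B| for two collections of feature indices."""
    a, b = set(old_features), set(new_features)
    union = a | b
    if not union:
        return 0.0  # two empty feature sets are treated as identical
    return 1.0 - len(a & b) / len(union)


previous = [0, 1, 2, 3]      # features observed before the change
current = [2, 3, 4, 5]       # features observed after the change
SHIFT_THRESHOLD = 0.5        # hypothetical cut-off, chosen for illustration only

distance = jaccard_distance(previous, current)
print(f"Jaccard distance: {distance:.3f}")
print("Feature-space shift detected:", distance > SHIFT_THRESHOLD)
```

A distance of 0 means the two feature sets are identical and a distance of 1 means they share no features, so a threshold between the two gives a simple trigger for "the feature space has changed".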

Stream Wrappers

Not sure which feature-evolution pattern fits your application? OpenMOA's stream wrappers let you simulate five distinct paradigms on any existing dataset:

| Wrapper | Pattern | Description |
|---|---|---|
| OpenFeatureStream | Pyramid / Incremental / Decremental / TDS / CDS / EDS | General-purpose wrapper; shrinks the active feature vector and attaches feature_indices to each instance |
| TrapezoidalStream | Trapezoidal | Fixed-size vectors with NaN masking for inactive features |
| CapriciousStream | Random missingness | Each feature independently absent with probability missing_ratio per step |
| EvolvableStream | Sequential partitions | Features rotate across n_segments groups with configurable overlap windows |
| ShuffledStream | Shuffled order | Buffers the entire stream and serves instances in a randomized sequence |

from openmoa.stream import (
    OpenFeatureStream, TrapezoidalStream,
    CapriciousStream, EvolvableStream, ShuffledStream,
)
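
Two of the wrapper descriptions are easy to picture in plain Python: NaN masking of inactive features in a fixed-size vector (TrapezoidalStream) and independent per-feature missingness with probability missing_ratio (CapriciousStream). The helpers below, `mask_inactive` and `drop_at_random`, are hypothetical stand-ins that sketch these two ideas under those assumptions. They are not the wrappers' actual implementation.

```python
import math
import random


def mask_inactive(x, active_indices):
    """Keep the vector length fixed; replace inactive features with NaN."""
    active = set(active_indices)
    return [v if i in active else math.nan for i, v in enumerate(x)]


def drop_at_random(x, missing_ratio, rng):
    """Drop each feature independently with probability `missing_ratio`."""
    return [math.nan if rng.random() < missing_ratio else v for v in x]


x = [0.5, 1.2, -0.3, 4.0]

# Only the first two features are active at this point in the stream.
print(mask_inactive(x, active_indices=[0, 1]))

# Seeded generator so the run is repeatable.
rng = random.Random(42)
print(drop_at_random(x, missing_ratio=0.5, rng=rng))
```

In both cases the learner still receives a vector of the original length, so it has to treat NaN as "feature not observed" rather than as a numeric value.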

Benchmark

Our benchmark evaluates ten representative UOL and OL algorithms across 12 binary and 9 multi-class datasets under three dynamic feature-space paradigms. All experiments use OpenMOA's unified API with standardized prequential evaluation, ensuring fair and reproducible comparison.

The following performance results are from the latest code review and optimization pass (Windows 11, Intel CPU, Python 3.13, NumPy 2.x):

| Optimization | Scenario | Before | After | Speedup |
|---|---|---|---|---|
| DensityPeaks vectorization | n = 50 instances | 0.151 ms | 0.047 ms | 3.2× |
| DensityPeaks vectorization | n = 200 instances (default) | 1.492 ms | 1.109 ms | 1.3× |
| ECDF → np.searchsorted | 10 observations | 0.0146 ms | 0.0042 ms | 3.5× |
| ECDF → np.searchsorted | 100 observations (batch) | 0.0154 ms | 0.0038 ms | 4.0× |
| np.array → np.asarray | float64, no conversion needed | 0.30 µs | 0.07 µs | 4.2× |
| HBP weight update (OLD3S) | 3-layer MLP | 0.0269 ms | 0.0187 ms | 1.4× |
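
The ECDF rows refer to evaluating an empirical CDF through np.searchsorted, which performs a binary search over sorted observations. The sketch below shows the same idea using the standard library's `bisect` module as a stand-in, which corresponds to np.searchsorted with side="right". The function name `ecdf` is hypothetical, and this is not the code that was benchmarked.

```python
from bisect import bisect_right


def ecdf(sorted_obs, value):
    """Fraction of observations <= value, found by binary search in O(log n)."""
    if not sorted_obs:
        raise ValueError("ECDF is undefined for an empty sample")
    return bisect_right(sorted_obs, value) / len(sorted_obs)


observations = sorted([3.1, 0.4, 2.2, 5.0, 1.7])

print(ecdf(observations, 2.2))   # three of five observations are <= 2.2
print(ecdf(observations, 0.0))   # no observation is <= 0.0
```

Sorting once and then answering each query with a binary search avoids rescanning every observation per lookup.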

End-to-end throughput (400 training steps, d = 8 features):

| Classifier | Total time | Per instance |
|---|---|---|
| OSLMF (batch = 50) | 51.0 ms | 0.128 ms |
| OVFM | 31.8 ms | 0.080 ms |

Benchmark code: demo/demo_fesl_benchmark_binary.py


What Else is Inside?

OpenMOA is built on top of CapyMOA and inherits its full ecosystem:

  • 30+ classic stream classifiers — Hoeffding Tree, ARF, EFDT, Naive Bayes, kNN, and more
  • 12 concept drift detectors — ADWIN, DDM, HDDM, Page-Hinkley, ABCD (multivariate), and more
  • Regression support — FIMTDD, ARFFIMTDD, SOKNL, ORTO, FESLRegressor, OASFRegressor
  • Semi-supervised evaluation — prequential_ssl_evaluation with configurable label probability
  • Online Continual Learning — ocl_train_eval_loop with forward/backward transfer metrics
  • 40+ benchmark datasets — Electricity, Covtype, RCV1, Fried, Bike, and more

Cite Us

If you use OpenMOA in your research, please cite:

@misc{ZhiliWang2025OpenMOAAPythonLibraryforUtilitarianOnlineLearning,
    title={{OpenMOA}: A Python Library for Utilitarian Online Learning},
    author={Zhili Wang and Heitor M. Gomes and Yi He},
    year={2025},
    eprint={},
    archivePrefix={arXiv},
    primaryClass={cs.LG},
    url={https://arxiv.org/abs/},
}
