DEEM - Deep Ensemble Energy Models

DEEM is a Python library for training Restricted Boltzmann Machines (RBMs) on ensemble predictions from multiple classifiers. It provides a scikit-learn compatible API for unsupervised ensemble aggregation, crowd learning, and model combination.

Features

  • 🚀 Simple 3-line API - Fit and predict in just a few lines of code
  • 🔬 Unsupervised Learning - No labels required for training (though they can be used for evaluation)
  • 🧮 Energy-Based Models - Uses RBMs to learn the joint distribution of classifier predictions
  • 🎯 Transparent Hungarian Alignment - Automatic label permutation handling via align_to parameter
  • ⚡ GPU Acceleration - Full PyTorch backend with CUDA support
  • 🔧 Scikit-learn Compatible - Standard .fit(), .predict(), .score() interface
  • 📊 Automatic Hyperparameters - Optional meta-learning for hyperparameter selection
  • 🎚️ Weighted Initialization - Better classifiers get more influence (default ON)
  • 📈 Soft Label Support - Works with probability distributions (3D tensors)

Installation

pip install deem

From Source

git clone https://github.com/Rem4rkable/deem.git
cd deem
pip install -e .

Quick Start

import numpy as np
from deem import DEEM

# Ensemble predictions from 15 classifiers on 100 samples with 3 classes
predictions = np.random.randint(0, 3, (100, 15))

# Train and predict in 3 lines!
model = DEEM()
model.fit(predictions)
consensus = model.predict(predictions)

With Evaluation (Transparent Label Alignment)

# If you have true labels, evaluate with automatic label alignment
model = DEEM(n_classes=3, epochs=50)
model.fit(train_predictions)

# Predict with transparent Hungarian alignment via align_to parameter
# Alignment is against MAJORITY VOTE, not true labels (unsupervised!)
consensus = model.predict(test_predictions, align_to=train_predictions)

# Or use score() which handles alignment automatically
accuracy = model.score(test_predictions, test_labels)
print(f"Consensus accuracy: {accuracy:.2%}")

With Soft Labels (Probability Distributions)

# Soft predictions: (n_samples, n_classes, n_classifiers)
# Each classifier outputs a probability distribution over classes
soft_predictions = np.random.rand(100, 3, 15)
soft_predictions = soft_predictions / soft_predictions.sum(axis=1, keepdims=True)

model = DEEM(n_classes=3)
model.fit(soft_predictions)  # Automatically detects 3D and enables oh_mode
consensus = model.predict(soft_predictions)

Custom Configuration

model = DEEM(
    n_classes=5,
    hidden_dim=1,           # Number of hidden units (keep at 1 for best results)
    learning_rate=0.01,
    epochs=100,
    batch_size=64,
    cd_k=10,                # Contrastive divergence steps
    deterministic=True,     # Use probabilities (more stable)
    use_weighted=True,      # Scale weights by classifier accuracy (default)
    device='cuda'           # Use GPU
)
model.fit(predictions)

Use Cases

1. Crowd Learning / Multi-Annotator Aggregation

Aggregate noisy labels from multiple human annotators:

# annotator_labels: (n_samples, n_annotators) with values 0 to k-1
model = DEEM(n_classes=k)
model.fit(annotator_labels)
consensus_labels = model.predict(annotator_labels)

2. Ensemble Model Aggregation

Combine predictions from multiple trained classifiers:

# Get predictions from multiple models
predictions = np.column_stack([
    model1.predict(X),
    model2.predict(X),
    model3.predict(X),
    # ... more models
])

# Learn optimal aggregation
ensemble = DEEM()
ensemble.fit(predictions)
final_predictions = ensemble.predict(predictions)
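
For a fuller picture, here is a minimal end-to-end sketch using scikit-learn classifiers (the classifiers and dataset are illustrative assumptions; only the DEEM calls come from the API documented here):

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier
from deem import DEEM

# Toy data and three heterogeneous classifiers (illustrative)
X, y = make_classification(n_samples=500, n_classes=3, n_informative=6, random_state=0)
classifiers = [LogisticRegression(max_iter=1000), GaussianNB(), DecisionTreeClassifier(max_depth=3)]
for clf in classifiers:
    clf.fit(X, y)  # in practice each model would be trained elsewhere

# Stack hard predictions into the (n_samples, n_classifiers) shape DEEM expects
predictions = np.column_stack([clf.predict(X) for clf in classifiers])

ensemble = DEEM(n_classes=3)
ensemble.fit(predictions)  # unsupervised: y is never passed to DEEM
consensus = ensemble.predict(predictions)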

3. Missing Predictions

DEEM automatically handles cases where some classifiers don't provide predictions (use -1 for missing):

predictions = np.array([
    [0, 1, -1, 2, 1],  # Classifier 3 missing
    [1, 1, 1, -1, 1],  # Classifier 4 missing
    # ...
])
model = DEEM(n_classes=3)
model.fit(predictions)  # Missing values handled automatically
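
The exact handling is internal to DEEM; conceptually it amounts to masking the -1 entries so they never contribute votes. A minimal sketch of that idea (not DEEM's internals):

import numpy as np

def masked_majority_vote(predictions, n_classes, missing=-1):
    """Per-sample majority vote that ignores missing (-1) entries."""
    votes = np.zeros((predictions.shape[0], n_classes), dtype=int)
    for c in range(n_classes):
        votes[:, c] = (predictions == c).sum(axis=1)  # -1 never matches a class
    return votes.argmax(axis=1)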

4. Weighted Initialization (Default ON)

By default, DEEM scales RBM weights by each classifier's agreement with majority vote. This gives better classifiers more influence:

# Weighted initialization is ON by default (recommended)
model = DEEM(n_classes=3)  # use_weighted=True by default
model.fit(predictions)

# Disable for ablation studies only
model = DEEM(n_classes=3, use_weighted=False)
model.fit(predictions)  # All classifiers treated equally
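
The weighting itself is computed without labels. A rough sketch of the idea, using agreement with the per-sample majority vote as the quality proxy (this mirrors the description above, not DEEM's exact internals):

import numpy as np

def classifier_weights(predictions, n_classes):
    """Weight each classifier by its agreement with the majority vote."""
    counts = np.stack([(predictions == c).sum(axis=1) for c in range(n_classes)], axis=1)
    majority = counts.argmax(axis=1)                              # (n_samples,)
    agreement = (predictions == majority[:, None]).mean(axis=0)   # one weight per classifier
    return agreement  # higher agreement -> more initial influence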

How It Works

DEEM uses Restricted Boltzmann Machines (RBMs) - energy-based models that learn the joint probability distribution over classifier predictions and hidden representations. The key insight is that multiple weak classifiers contain complementary information that can be combined through unsupervised learning.

Key Components

  1. Energy Function: Models compatibility between visible (predictions) and hidden (consensus) states (see the formula after this list)
  2. Contrastive Divergence: Trains the RBM using GWG (Gibbs with Gradients) sampling
  3. Hungarian Algorithm: Solves the label permutation problem during evaluation
  4. Weighted Initialization: Scales weights by classifier quality (agreement with majority vote)
  5. Buffer Initialization: Automatic sampler buffer initialization for better MCMC mixing
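
For reference, the standard RBM energy over a visible vector v and hidden vector h has the form below (this is the generic RBM energy; DEEM's exact parameterization over categorical predictions may differ):

E(\mathbf{v}, \mathbf{h}) = -\mathbf{b}^\top \mathbf{v} - \mathbf{c}^\top \mathbf{h} - \mathbf{v}^\top W \mathbf{h},
\qquad
P(\mathbf{v}, \mathbf{h}) = \frac{\exp(-E(\mathbf{v}, \mathbf{h}))}{Z}

Training lowers the energy of configurations that match observed prediction patterns, so low-energy hidden states come to encode the consensus.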

Architecture

Classifier Predictions → [Preprocessing] → RBM → Hidden Representation → Consensus Label
     (visible layer)       (optional)           (hidden layer)

What Happens Under the Hood

When you call model.fit(predictions):

  1. Data Preparation: Filters samples with all missing values, infers n_classes
  2. Buffer Initialization: First batch initializes MCMC sampler buffer (transparent)
  3. Weighted Initialization: RBM weights scaled by classifier accuracy vs majority vote
  4. Training: Contrastive divergence learns the energy function
  5. Result: Model ready to predict consensus labels

When you call model.predict(data, align_to=reference):

  1. Forward Pass: Compute hidden layer probabilities
  2. Argmax: Get predicted class for each sample
  3. Hungarian Alignment: Match predicted labels to the majority vote of the reference (sketched after this list)
  4. Return: Aligned consensus predictions
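
A minimal sketch of step 3, assuming alignment against the majority vote of the reference data (SciPy's linear_sum_assignment, already a DEEM dependency, solves the Hungarian problem; this mirrors the described behavior, not DEEM's exact code):

import numpy as np
from scipy.optimize import linear_sum_assignment

def align_labels(predicted, reference_majority, n_classes):
    """Relabel `predicted` so its classes best match the reference majority vote."""
    # Negative co-occurrence counts: minimizing cost maximizes agreement
    cost = np.zeros((n_classes, n_classes))
    for i in range(n_classes):
        for j in range(n_classes):
            cost[i, j] = -np.sum((predicted == i) & (reference_majority == j))
    rows, cols = linear_sum_assignment(cost)
    mapping = dict(zip(rows, cols))
    return np.array([mapping[p] for p in predicted])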

API Reference

DEEM

Main class for ensemble aggregation.

Core Parameters:

  • n_classes (int, optional): Number of classes. Auto-detected if not specified.
  • hidden_dim (int, default=1): Number of hidden units. Keep at 1 for best results.
  • cd_k (int, default=10): Contrastive divergence steps.
  • deterministic (bool, default=True): Use probabilities instead of sampling.
  • learning_rate (float, default=0.001): Learning rate for SGD.
  • momentum (float, default=0.0): SGD momentum.
  • epochs (int, default=50): Training epochs.
  • batch_size (int, default=128): Batch size.
  • device (str, default='auto'): Device ('cpu', 'cuda', or 'auto').
  • random_state (int, optional): Random seed.

Phase 3 Parameters (New):

  • use_weighted (bool, default=True): Scale weights by classifier accuracy vs majority vote.
    • Recommended: Keep True for production use.
    • Set to False only for ablation studies.
  • auto_hyperparameters (bool, default=False): Auto-select hyperparameters based on data.
  • model_dir (str, optional): Path to hyperparameter predictor models.

Preprocessing Parameters (Advanced):

  • use_preprocessing (bool, default=False): Add Multinomial preprocessing layers.
  • preprocessing_layers (int, default=1): Number of preprocessing layers.
  • preprocessing_activation (str, default='sparsemax'): Activation function.
  • preprocessing_init (str, default='identity'): Weight initialization method.

Sampler Parameters (Advanced):

  • sampler_steps (int, default=10): MCMC sampling steps.
  • sampler_oh_mode (bool, default=False): One-hot mode (auto-enabled for soft labels).

Methods:

  • fit(predictions, labels=None, **kwargs): Train the model (unsupervised)
  • predict(predictions, return_probs=False, align_to=None): Get consensus predictions
    • align_to: Reference data for Hungarian alignment (uses majority vote)
  • score(predictions, true_labels): Compute accuracy with automatic alignment
  • get_class_mapping(): Get the cached Hungarian class mapping
  • reset_class_mapping(): Clear the cached mapping
  • save(path): Save model to disk
  • load(path): Load model from disk
  • get_params(): Get parameters (sklearn compatibility)
  • set_params(**params): Set parameters (sklearn compatibility)

Attributes (after fit):

  • model_: The trained RBM model
  • class_map_: Hungarian mapping (after alignment)
  • n_classes_: Number of classes
  • n_classifiers_: Number of classifiers
  • history_: Training history
  • is_fitted_: Whether model is trained

Advanced Features

Transparent Hungarian Alignment

The align_to parameter provides transparent control over label alignment:

# Align predictions to majority vote of training data
train_consensus = model.predict(train_preds, align_to=train_preds)

# Same alignment is cached and reused
test_consensus = model.predict(test_preds)  # Uses cached class_map_

# Force recomputation
model.reset_class_mapping()
new_consensus = model.predict(test_preds, align_to=test_preds)

# Get the mapping for inspection
mapping = model.get_class_mapping()
print(f"Predicted class 0 → Aligned class {mapping[0]}")

Important: Alignment is always against MAJORITY VOTE, not true labels. This preserves the unsupervised nature of the model.

Automatic Hyperparameter Selection

NEW: Hyperparameter prediction now works out-of-the-box! No need to specify model_dir.

# Simple - uses bundled trained models automatically
model = DEEM(
    n_classes=10,
    auto_hyperparameters=True  # That's it! 
)
model.fit(predictions, verbose=True)  # Shows auto-selected hyperparameters

# Still optionally override specific hyperparameters
model = DEEM(
    auto_hyperparameters=True,
    epochs=100,  # Override auto-selected epochs
    batch_size=512  # Override auto-selected batch size
)

# Use custom trained models (for experiments)
model = DEEM(
    auto_hyperparameters=True,
    model_dir='path/to/custom/models'  # Optional
)

What gets predicted automatically:

  • batch_size: Training batch size
  • epochs: Number of training epochs
  • learning_rate: Optimizer learning rate
  • init_method: Weight initialization (mv_rand, mv_lo, rand)
  • num_layers: Number of preprocessing layers
  • activation_func: Preprocessing activation function
  • momentum: Optimizer momentum
  • scheduler: Learning rate scheduler

The predictor uses dataset meta-features (n_samples, n_classifiers, token_density, etc.) to select optimal hyperparameters.
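
The full feature set is internal to DEEM, but a sketch of the kind of statistics involved looks like this (the dictionary keys, and the reading of token_density as the fraction of non-missing entries, are assumptions for illustration):

import numpy as np

def dataset_meta_features(predictions):
    """Illustrative meta-features for hard predictions with -1 as missing."""
    n_samples, n_classifiers = predictions.shape
    observed = predictions >= 0
    return {
        "n_samples": n_samples,
        "n_classifiers": n_classifiers,
        "n_classes": int(predictions[observed].max()) + 1,
        "token_density": observed.mean(),  # fraction of non-missing entries
    }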

Soft Labels (Probability Distributions)

DEEM automatically detects and handles soft labels (3D tensors):

# soft_predictions: (n_samples, n_classes, n_classifiers)
# Each entry is a probability distribution over classes
soft_predictions = classifier_probabilities  # Shape: (100, 3, 15)

model = DEEM(n_classes=3)
model.fit(soft_predictions)  # Auto-detects 3D, enables oh_mode
consensus = model.predict(soft_predictions)

What happens automatically (a detection sketch follows the list):

  • Detects 3D tensor → enables one-hot sampler mode
  • Infers n_classes from tensor shape (dim 1)
  • Normalizes probabilities if needed
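
The detection boils down to a dimensionality and dtype check. A sketch of the equivalent check on your side (illustrative, not DEEM's code):

import numpy as np

def looks_like_soft_labels(predictions):
    """3D float tensors are treated as (n_samples, n_classes, n_classifiers)."""
    return predictions.ndim == 3 and np.issubdtype(predictions.dtype, np.floating)

preds = np.random.rand(100, 3, 15)
preds /= preds.sum(axis=1, keepdims=True)  # normalize over the class axis
assert looks_like_soft_labels(preds)       # n_classes would be inferred as preds.shape[1]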

Save and Load Models

# Save trained model
model.save('my_ensemble.pt')

# Load later
model = DEEM()
model.load('my_ensemble.pt')
predictions = model.predict(new_data)

Preprocessing Layers (Advanced)

For complex datasets, add learnable preprocessing layers:

model = DEEM(
    n_classes=3,
    use_preprocessing=True,
    preprocessing_layers=1,
    preprocessing_activation='entmax',
    preprocessing_init='identity'
)
model.fit(predictions)

Disabling Weighted Initialization (Ablation Studies)

# Default behavior: classifiers weighted by accuracy vs majority vote
model = DEEM(n_classes=3, use_weighted=True)  # Default

# For ablation studies: disable weighting
model = DEEM(n_classes=3, use_weighted=False)
model.fit(predictions)  # All classifiers treated equally

Hyperparameters Reference

Core Training Parameters

  • n_classes (int, optional, default None): Number of classes. Auto-inferred from data if not provided.
  • hidden_dim (int, default 1): Number of hidden units. Keep at 1 for best results.
  • learning_rate (float, default 0.001): Learning rate for the optimizer.
  • epochs (int, default 50): Number of training epochs.
  • batch_size (int, default 128): Training batch size. Larger = faster; smaller = better mixing.
  • momentum (float, default 0.0): SGD momentum. Range: [0, 1].
  • device (str, default 'auto'): Device: 'auto', 'cpu', or 'cuda'. Auto selects GPU if available.

RBM Model Parameters

  • cd_k (int, default 10): Contrastive divergence steps. Higher = better but slower.
  • deterministic (bool, default True): Use probabilities (True) vs sampling (False). Keep True for stability.
  • init_method (str, default 'mv_rand'): Weight init: 'mv_rand' (majority vote + random), 'mv_lo', or 'rand'.
  • use_weighted (bool, default True): Scale weights by classifier accuracy vs majority vote. Recommended: True.

Sampler Parameters

  • sampler_steps (int, default 10): MCMC sampler steps. More steps = better samples but slower.
  • sampler_oh_mode (bool, default False): One-hot mode for the sampler. Auto-enabled for soft labels (3D tensors).

Preprocessing Parameters

  • use_preprocessing (bool, default False): Enable learnable preprocessing layers before the RBM.
  • preprocessing_layers (int, default 1): Number of Multinomial layers. Ignored if preprocessing_layer_widths is specified.
  • preprocessing_layer_widths (list[int], optional, default None): Custom layer widths, e.g., [20, 15, 10]. Overrides preprocessing_layers.
  • preprocessing_activation (str, default 'sparsemax'): Activation: 'sparsemax', 'entmax', 'softmax', 'relu', 'gelu', etc.
  • preprocessing_init (str, default 'identity'): Preprocessing init: 'identity', 'rand', or 'mv'.
  • preprocessing_one_hot (bool, default False): Use one-hot encoding in preprocessing.
  • preprocessing_use_softmax (bool, default False): Apply softmax in the preprocessing layers.
  • preprocessing_jitter (float, default 0.0): Jitter coefficient for preprocessing.

AutoML Parameters

  • auto_hyperparameters (bool, default False): Enable automatic hyperparameter selection. Works out of the box.
  • model_dir (str or Path, optional, default None): Custom model directory. None = use the bundled models.

Other Parameters

  • random_state (int, optional, default None): Random seed for reproducibility.
  • **kwargs (dict, default {}): Additional keyword arguments passed to the RBM model (e.g., a custom init_method).

Soft Labels Guide

DEEM supports both hard labels (integers) and soft labels (probability distributions).

Hard Labels (Standard)

# Shape: (n_samples, n_classifiers)
# Each entry is an integer class label: 0, 1, 2, ..., k-1
hard_predictions = np.array([
    [0, 1, 0, 2, 0],  # Sample 1: classifiers predict 0, 1, 0, 2, 0
    [1, 1, 2, 1, 1],  # Sample 2: classifiers predict 1, 1, 2, 1, 1
    [2, 2, 2, 1, 2],  # Sample 3: classifiers predict 2, 2, 2, 1, 2
])

model = DEEM(n_classes=3)
model.fit(hard_predictions)

Soft Labels (Probability Distributions)

# Shape: (n_samples, n_classes, n_classifiers)
# Each entry is a probability distribution over classes
soft_predictions = np.array([
    # Sample 1 - 5 classifiers, each gives 3-class distribution
    [[0.7, 0.2, 0.1],  # Classifier 1: 70% class 0, 20% class 1, 10% class 2
     [0.1, 0.8, 0.1],  # Classifier 2: 10% class 0, 80% class 1, 10% class 2
     [0.6, 0.3, 0.1],  # Classifier 3: ...
     [0.2, 0.3, 0.5],  # Classifier 4: ...
     [0.8, 0.1, 0.1]], # Classifier 5: ...
    
    # Sample 2
    [[0.1, 0.7, 0.2],
     [0.2, 0.6, 0.2],
     [0.3, 0.4, 0.3],
     [0.1, 0.8, 0.1],
     [0.2, 0.7, 0.1]],
])
# Shape: (2, 3, 5) = (n_samples=2, n_classes=3, n_classifiers=5)

# DEEM automatically detects 3D tensors as soft labels!
model = DEEM()  # n_classes inferred from shape: 3
model.fit(soft_predictions)  # Auto-enables oh_mode=True
consensus = model.predict(soft_predictions)  # Returns hard labels: [0, 1, ...]

What happens automatically with soft labels:

  1. 3D tensor detection: Checks if predictions.ndim == 3
  2. Auto-infer n_classes: Extracts from predictions.shape[1]
  3. Enable oh_mode: Sets sampler_oh_mode=True automatically
  4. Proper preprocessing: Converts distributions to one-hot during sampling

Converting between formats:

# Soft → Hard (argmax)
hard_preds = soft_preds.argmax(axis=1)  # (N, K, D) → (N, D)

# Hard → Soft (one-hot)
import torch
from torch.nn.functional import one_hot
soft_preds = one_hot(torch.tensor(hard_preds), num_classes=3).numpy()
soft_preds = soft_preds.transpose(0, 2, 1)  # (N, D, K) → (N, K, D)

Preprocessing Guide

Preprocessing layers learn to transform classifier outputs before the RBM, useful for:

  • Handling heterogeneous classifiers (different architectures)
  • Learning calibration/alignment of classifier outputs
  • Dimensionality transformation

Basic Preprocessing (Fixed Width)

# Add 2 preprocessing layers, each keeping the same width (n_classifiers)
model = DEEM(
    n_classes=3,
    use_preprocessing=True,
    preprocessing_layers=2,  # Number of layers
    preprocessing_activation='entmax',  # Activation function
    preprocessing_init='identity',  # Initialize to identity transform
)
# Architecture: input(15) → Multinomial(15) → Multinomial(15) → RBM(15)

Custom Width Preprocessing

# Transform classifier dimension: 15 → 20 → 15 → 10
model = DEEM(
    n_classes=3,
    use_preprocessing=True,
    preprocessing_layer_widths=[20, 15, 10],  # Explicit widths
    preprocessing_activation='sparsemax',
)
# Architecture: input(15) → Multinomial(20) → Multinomial(15) → Multinomial(10) → RBM(10)

Activation Functions

  • sparsemax: Sparse softmax (default). General purpose; encourages sparsity.
  • entmax: Entmax-1.5. Adaptive sparsity; good for ensembles.
  • softmax: Standard softmax. Use for dense distributions.
  • relu: ReLU. Non-negative outputs.
  • gelu: GELU. Smooth non-linearity.
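
Since entmax is already a DEEM dependency, you can compare the activations directly (a small demo; the logit values are arbitrary):

import torch
from entmax import sparsemax, entmax15

logits = torch.tensor([[1.0, 0.8, 0.1]])
print(torch.softmax(logits, dim=-1))  # dense: every class gets some mass
print(entmax15(logits, dim=-1))       # adaptively sparse, between the other two
print(sparsemax(logits, dim=-1))      # sparsest: the low-scoring class gets exactly 0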

Initialization Methods

  • identity: Identity transform (default). Preserves the input initially.
  • rand: Random weights. Learn from scratch.
  • mv: Majority vote statistics. Start near the MV baseline.

Example: YAML Config Equivalent

# Research code YAML config:
# model:
#   multinomial_net:
#     num_layers: 1
#     activation_func_name: 'entmax'
#     init_method: 'identity'
#     one_hot: False

# Python API equivalent:
model = DEEM(
    n_classes=3,
    use_preprocessing=True,
    preprocessing_layers=1,
    preprocessing_activation='entmax',
    preprocessing_init='identity',
    preprocessing_one_hot=False,
)

Troubleshooting

Model starts with low accuracy (~40%)

Problem: Model initialized with random weights instead of majority vote.

Solution: Check init_method:

model = DEEM(init_method='mv_rand')  # Should start near MV accuracy

If using auto_hyperparameters=True, this is handled automatically.

Double printing in Jupyter notebooks

Problem: Buffer/weighted init messages print twice with identical timestamps.

Solution: This is a Jupyter display quirk with %autoreload 2, not actual duplicate calls; the functionality works correctly. To avoid it, disable autoreload or simply ignore the duplicate output.

AutoML models not found

Problem: model_dir not found warning when using auto_hyperparameters=True.

Solution (v0.2.0+): Models are now bundled! Just upgrade:

pip install --upgrade deem

For older versions, download models separately or specify model_dir.

Soft labels not detected

Problem: 3D tensor not auto-detected as soft labels.

Check:

  1. Shape must be (n_samples, n_classes, n_classifiers) - note the order!
  2. Data type must be float (not int)
  3. Values should sum to ~1.0 along axis 1 (probability distributions)

# Verify shape and normalization
print(f"Shape: {soft_preds.shape}")  # Should be (N, K, D)
print(f"Sum along axis 1: {soft_preds.sum(axis=1)}")  # Should be ~1.0

Out of memory on GPU

Problem: CUDA out of memory error.

Solutions:

  1. Reduce batch_size: model = DEEM(batch_size=64)
  2. Use CPU: model = DEEM(device='cpu')
  3. Use smaller dataset subset for experimentation

Predictions don't match labels (wrong alignment)

Problem: Accuracy looks random (~33% for 3 classes).

Solution: Use align_to parameter or score() method:

# Wrong - no alignment
accuracy = (predictions == labels).mean()  # ❌ Random accuracy

# Correct - with alignment
accuracy = model.score(predictions, labels)  # ✅ Proper accuracy
# Or:
consensus = model.predict(predictions, align_to=train_predictions)
accuracy = (consensus == labels).mean()

Remember: Alignment is against MAJORITY VOTE, not true labels (unsupervised).

Training is slow

Solutions:

  1. Use GPU: model = DEEM(device='cuda')
  2. Increase batch_size: model = DEEM(batch_size=512)
  3. Reduce cd_k: model = DEEM(cd_k=5) (default 10)
  4. Reduce sampler_steps: model = DEEM(sampler_steps=5) (default 10)
  5. Use fewer epochs: model = DEEM(epochs=20) (default 50)

Model not improving beyond majority vote

Check:

  1. use_weighted=True (default, gives better classifiers more influence)
  2. init_method='mv_rand' (starts near MV)
  3. Sufficient epochs (try 50-100)
  4. Dataset quality (are classifiers diverse and reasonably accurate?)

Note: DEEM is designed to match or slightly exceed majority vote. Large improvements (>5-10%) are rare and depend on dataset characteristics.
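
A quick sanity check is to compare against the majority-vote baseline directly (a sketch, assuming predictions, true_labels, n_classes, and a fitted model are in scope as in the snippets above):

import numpy as np

# Majority-vote baseline accuracy
counts = np.stack([(predictions == c).sum(axis=1) for c in range(n_classes)], axis=1)
mv_accuracy = (counts.argmax(axis=1) == true_labels).mean()

# DEEM accuracy (score() handles label alignment automatically)
deem_accuracy = model.score(predictions, true_labels)
print(f"Majority vote: {mv_accuracy:.2%} | DEEM: {deem_accuracy:.2%}")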

Requirements

Core Dependencies (installed automatically):

  • Python >= 3.8
  • PyTorch >= 1.9
  • NumPy >= 1.19
  • SciPy >= 1.7
  • entmax >= 1.0
  • scikit-learn >= 0.24 (for automatic hyperparameters)
  • pandas >= 1.3 (for automatic hyperparameters)
  • joblib >= 1.0 (for automatic hyperparameters)

Development (optional):

pip install deem[dev]  # Includes pytest, ruff

Citation

If you use DEEM in your research, please cite:

@software{deem2026,
  title={DEEM: Deep Ensemble Energy Models for Classifier Aggregation},
  author={[Your Name]},
  year={2026},
  url={https://github.com/Rem4rkable/deem}
}

License

This project is licensed under the MIT License - see the LICENSE file for details.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Acknowledgments

  • Based on research in energy-based models and crowd learning
  • Built with PyTorch and inspired by scikit-learn's API design



Made with ❤️ for the machine learning community
