Python implementation of the Multi-View Stacking algorithm.

multiviewstacking: a Python implementation of the Multi-View Stacking algorithm

Multi-view learning algorithms aim to learn from different representational views. For example, a movie can be represented by three views: the sequence of images, the audio, and the subtitles. Instead of concatenating the features of every view and training a single model, the Multi-View Stacking algorithm[1] builds an independent model for each view, and these models may be of different types. These models are called first-level-learners. The class and score predictions of the first-level-learners are then used as features to train another model called the meta-learner. This approach is based on the Stacked Generalization method proposed by Wolpert[2].
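The idea can be sketched with scikit-learn alone. This is a minimal illustration under stated assumptions, not the package's implementation: the data is synthetic, the split of columns into two views is arbitrary, only the score predictions (class probabilities) are used as meta-features, and out-of-fold predictions via cross_val_predict are one common way to build them.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict, train_test_split

# Synthetic data (assumption): columns 0-3 play the role of one view, 4-7 of another.
X, y = make_classification(n_samples=300, n_features=8, n_informative=6,
                           n_redundant=0, random_state=0)
views = [[0, 1, 2, 3], [4, 5, 6, 7]]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

# One first-level-learner per view; they may be of different types.
learners = [RandomForestClassifier(n_estimators=50, random_state=0),
            LogisticRegression(max_iter=1000)]

# Out-of-fold class probabilities from each view become the meta-features.
meta_train = np.hstack([
    cross_val_predict(m, X_train[:, idx], y_train, cv=5, method="predict_proba")
    for m, idx in zip(learners, views)
])

# Refit every first-level-learner on its full training view.
for m, idx in zip(learners, views):
    m.fit(X_train[:, idx], y_train)

# The meta-learner is trained on the stacked predictions, not on raw features.
meta_learner = LogisticRegression(max_iter=1000).fit(meta_train, y_train)

# At prediction time, each view is scored first and the scores are stacked.
meta_test = np.hstack([m.predict_proba(X_test[:, idx])
                       for m, idx in zip(learners, views)])
preds = meta_learner.predict(meta_test)
accuracy = np.mean(preds == y_test)
```

Using out-of-fold predictions keeps the meta-learner from being trained on scores that the first-level-learners produced for samples they had already seen.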

The multiviewstacking package provides the following functionalities:

  • Train Multi-View Stacking classifiers.
  • Support an arbitrary number of views. The only limit is your computer's memory.
  • Use any scikit-learn classifier as first-level-learner and meta-learner.
  • Use any custom model as long as it implements the fit(), predict(), and predict_proba() methods.
  • Combine different types of first-level-learners.
  • Test with a built-in dataset that has two views.
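As an illustration of the custom-model requirement, here is a minimal sketch of a class that exposes fit(), predict(), and predict_proba(). The CentroidClassifier name and its nearest-centroid logic are hypothetical and only NumPy is used. Whether the package expects anything beyond these three methods (for example, scikit-learn cloning support) is not stated here, so treat this as a starting point.

```python
import numpy as np


class CentroidClassifier:
    """Hypothetical custom model: nearest centroid with softmax scores."""

    def fit(self, X, y):
        X, y = np.asarray(X, dtype=float), np.asarray(y)
        self.classes_ = np.unique(y)
        # One centroid per class, in the same order as classes_.
        self.centroids_ = np.vstack(
            [X[y == c].mean(axis=0) for c in self.classes_])
        return self

    def _distances(self, X):
        # Euclidean distance from every sample to every centroid.
        X = np.asarray(X, dtype=float)
        diff = X[:, None, :] - self.centroids_[None, :, :]
        return np.linalg.norm(diff, axis=2)

    def predict_proba(self, X):
        # Softmax over negative distances: closer centroids score higher.
        scores = -self._distances(X)
        scores -= scores.max(axis=1, keepdims=True)
        exp = np.exp(scores)
        return exp / exp.sum(axis=1, keepdims=True)

    def predict(self, X):
        return self.classes_[np.argmin(self._distances(X), axis=1)]


# Tiny usage example with made-up data.
X = np.array([[0.0, 0.0], [0.2, 0.1], [5.0, 5.0], [5.1, 4.9]])
y = np.array([0, 0, 1, 1])
clf = CentroidClassifier().fit(X, y)
new_points = np.array([[0.1, 0.0], [4.8, 5.2]])
labels = clf.predict(new_points)
probs = clf.predict_proba(new_points)
```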

Requirements

  • Python 3.11.0+
  • pandas >= 3.0.0
  • numpy >= 2.0.2
  • scikit-learn >= 1.5.2

Installation

You can install the multiviewstacking package with:

pip install multiviewstacking

Quick start example

This quick start example shows you how to train a multi-view model. For more detailed tutorials, check the Jupyter notebooks in the /examples directory.

import numpy as np
from multiviewstacking import load_example_data
from multiviewstacking import MultiViewStacking
from sklearn.ensemble import RandomForestClassifier

# Load the built-in example dataset.
xtrain, ytrain, xtest, ytest, ind1, ind2, l = load_example_data()

The built-in dataset contains features for two views (audio, accelerometer) for activity recognition. The load_example_data() function returns a tuple with the train and test sets. It also returns the column indices for the two views and a LabelEncoder to convert the classes from integers back to strings.
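Converting integer classes back to strings can be sketched as follows, assuming the returned encoder behaves like scikit-learn's LabelEncoder. The encoder below is fitted on made-up activity names so the snippet runs on its own. With the quick start data you would call inverse_transform on the returned encoder instead.

```python
from sklearn.preprocessing import LabelEncoder

# Hypothetical activity names, used only for illustration.
activities = ["walk", "run", "sit", "run"]

le = LabelEncoder()
codes = le.fit_transform(activities)   # strings -> integer codes
names = le.inverse_transform(codes)    # integer codes -> original strings
```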

# Define two first-level-learners and the meta-learner.
# All of them are Random Forests but they can be any other model.
m_v1 = RandomForestClassifier(n_estimators=50, random_state=123)
m_v2 = RandomForestClassifier(n_estimators=50, random_state=123)
m_meta = RandomForestClassifier(n_estimators=50, random_state=123)

# Build the multi-view model.
model = MultiViewStacking(views_indices = [ind1, ind2],
                      first_level_learners = [m_v1, m_v2],
                      meta_learner = m_meta)

The views_indices parameter is a list of lists. Each inner list specifies the column indices of the train set that belong to one view. In this case, ind1 stores the indices of the audio features and ind2 contains the indices of the accelerometer features. The first_level_learners parameter is a list of scikit-learn models or any other custom models, one per view. The meta_learner parameter specifies the model to be used as the meta-learner.
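To make the list-of-lists layout concrete, the sketch below builds index lists for a toy feature matrix and selects the columns of each view with plain NumPy indexing. The matrix and the ind_audio and ind_accel names are hypothetical. It shows what the indices refer to, not how the package slices the data internally.

```python
import numpy as np

# Toy feature matrix: 2 samples, 6 columns (values are arbitrary).
X = np.arange(12, dtype=float).reshape(2, 6)

# Hypothetical layout: the first three columns form one view, the last three another.
ind_audio = [0, 1, 2]
ind_accel = [3, 4, 5]
views_indices = [ind_audio, ind_accel]

# Each inner list picks out the columns that belong to one view.
view_matrices = [X[:, idx] for idx in views_indices]
```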

# Train the model.
model.fit(xtrain, ytrain)

# Make predictions on the test set.
preds = model.predict(xtest)

# Compute the accuracy.
np.sum(ytest == preds) / len(ytest)
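The accuracy expression above is the fraction of matching labels. As a small sanity check on made-up label arrays (not the example dataset), it gives the same value as scikit-learn's accuracy_score:

```python
import numpy as np
from sklearn.metrics import accuracy_score

# Made-up labels: four of the five predictions are correct.
y_true = np.array([0, 1, 2, 2, 1])
y_pred = np.array([0, 1, 1, 2, 1])

manual = np.sum(y_true == y_pred) / len(y_true)
library = accuracy_score(y_true, y_pred)
```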

Citation

To cite this package use:

Enrique Garcia-Ceja (2024). multiviewstacking: A python implementation of the Multi-View Stacking algorithm.
Python package https://github.com/enriquegit/multiviewstacking

BibTeX entry for LaTeX:

@Manual{MVS,
    title = {multiviewstacking: A python implementation of the Multi-View Stacking algorithm},
    author = {Enrique Garcia-Ceja},
    year = {2024},
    note = {Python package},
    url = {https://github.com/enriquegit/multiviewstacking}
}

References

[1] Garcia-Ceja, Enrique, et al. "Multi-view stacking for activity recognition with sound and accelerometer data." Information Fusion 40 (2018): 45-56.

[2] Wolpert, David H. "Stacked generalization." Neural Networks 5.2 (1992): 241-259.
