
LinMulT

General-purpose Multimodal Transformer with Linear-Complexity Attention



LinMulT is a modular Transformer library for multimodal sequence modelling. It handles variable-length inputs across any number of modalities, supports missing-modality scenarios, and offers six attention variants ranging from O(N²) softmax to O(N·s) gated linear attention, all behind a single config file.

Features

- Multiple modalities: 1–N input sequences with independent lengths and feature dims
- Standard attention: softmax, quadratic complexity for baselines and ablations
- Efficient attention: linear, performer, flash, bigbird; sub-quadratic complexity
- Flexible heads: sequence, aggregated, upsample, downsample; mix freely
- Missing modalities: zero-mask a modality and the model handles it gracefully
- Config-driven: dict or YAML, no subclassing required
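To see why the efficient variants are sub-quadratic: kernelized linear attention replaces the N×N score matrix with a d×d summary of keys and values. The sketch below is an illustration of the general technique only, not LinMulT's actual implementation; it uses the common elu(x)+1 feature map in plain NumPy.

```python
import numpy as np

def linear_attention(Q, K, V, eps=1e-6):
    """Kernelized linear attention: O(N * d^2) instead of O(N^2 * d)."""
    phi = lambda x: np.where(x > 0, x + 1.0, np.exp(x))  # elu(x) + 1, always positive
    Qp, Kp = phi(Q), phi(K)
    KV = Kp.T @ V                       # (d, d_v) summary -- no N x N matrix is formed
    Z = Qp @ Kp.sum(axis=0) + eps       # (N,) per-query normalizer
    return (Qp @ KV) / Z[:, None]       # (N, d_v)

rng = np.random.default_rng(0)
N, d = 128, 16
Q, K, V = (rng.standard_normal((N, d)) for _ in range(3))
out = linear_attention(Q, K, V)
assert out.shape == (N, d)
```

Because keys and values are folded into a fixed-size `KV` matrix, cost grows linearly with sequence length N rather than quadratically.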

Installation

pip install linmult

For development:

git clone https://github.com/fodorad/linmult
cd linmult
pip install -e ".[dev,docs]"
make check

Quick start

LinT — single-modality transformer

import torch
from linmult import LinT

x = torch.rand(8, 1500, 25)  # (batch, time, features)

model = LinT({
    'input_feature_dim': 25,
    'heads': [{'name': 'out', 'type': 'simple', 'output_dim': 5}],
    'time_dim_reducer': 'attentionpool',  # aggregate over time
})
result = model(x)
assert result['out'].shape == (8, 5)

LinMulT — multimodal transformer

import torch
from linmult import LinMulT

x1 = torch.rand(8, 1500, 25)  # (batch, time, features)
x2 = torch.rand(8,  450, 35)
x3 = torch.rand(8,  450, 256)

model = LinMulT({
    'input_feature_dim': [25, 35, 256],
    'heads': [{'name': 'sentiment', 'type': 'simple', 'output_dim': 3}],
    'time_dim_reducer': 'gap',
})
result = model([x1, x2, x3])
assert result['sentiment'].shape == (8, 3)

Switching attention type

model = LinT({
    'input_feature_dim': 64,
    'heads': [{'name': 'out', 'type': 'simple', 'output_dim': 10}],
    'attention_type': 'flash',        # linear, performer, flash, bigbird, softmax, mha
    'flash_query_key_dim': 32,        # flash (GAU) scoring dimension
})
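Since the library is config-driven with dict or YAML input, the same settings could also live in a YAML file. This fragment assumes the YAML keys mirror the dict form shown above; the loading mechanism is covered in the config reference.

```yaml
input_feature_dim: 64
attention_type: flash        # linear | performer | flash | bigbird | softmax | mha
flash_query_key_dim: 32      # flash (GAU) scoring dimension
heads:
  - name: out
    type: simple
    output_dim: 10
```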

Documentation

API reference

Config reference

Quick-start notebook

Attention benchmark

UR-Funny training example


Similar projects using LinMulT

BlinkLinMulT (2023)

LinMulT trained for blink presence detection and eye state recognition across 7 public benchmark databases.

PersonalityLinMulT (2022)

LinMulT trained for Big Five personality trait estimation and sentiment analysis (MOSI, MOSEI, First Impressions V2).


Citation

If you find this work helpful, please cite the relevant paper:

Eye blink detection (2023)

@article{blinklinmult-fodor23,
  title   = {BlinkLinMulT: Transformer-based Eye Blink Detection},
  author  = {Fodor, {\'A}d{\'a}m and Fenech, Kristian and L{\H{o}}rincz, Andr{\'a}s},
  journal = {Journal of Imaging},
  pages   = {1--19},
  year    = {2023}
}

Personality and sentiment estimation (2022)

@InProceedings{pmlr-v173-fodor22a,
  title     = {Multimodal Sentiment and Personality Perception Under Speech:
               A Comparison of Transformer-based Architectures},
  author    = {Fodor, {\'A}d{\'a}m and Saboundji, Rachid R. and
               Jacques Junior, Julio C. S. and Escalera, Sergio and
               Gallardo-Pujol, David and L{\H{o}}rincz, Andr{\'a}s},
  booktitle = {Understanding Social Behavior in Dyadic and Small Group Interactions},
  pages     = {218--241},
  year      = {2022},
  volume    = {173},
  series    = {Proceedings of Machine Learning Research},
  publisher = {PMLR},
  url       = {https://proceedings.mlr.press/v173/fodor22a.html}
}

Contact

Ádám Fodor · adamfodor.com · fodorad201@gmail.com
