General-purpose Multimodal Transformer with Linear Complexity Attention Mechanism.
LinMulT is a modular Transformer library for multimodal sequence modelling. It handles variable-length inputs across any number of modalities, supports missing-modality scenarios, and offers six attention variants ranging from O(N²) softmax to O(N·s) gated linear attention, all selected through a single dict or YAML config.
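To make the complexity range concrete, the rough arithmetic below compares the two dominant cost terms for a few sequence lengths. It is a back-of-the-envelope sketch, not a benchmark: it counts only an N × N score matrix against an N × s term, it ignores constants, heads, and projections, and `inner_dim=32` is an assumed value for s. The function name `attention_cost` is hypothetical and not part of linmult.

```python
def attention_cost(seq_len: int, inner_dim: int) -> dict:
    """Return rough dominant-term counts for two attention families.

    Both numbers are illustrative orders of magnitude, not measured FLOPs.
    """
    return {
        "quadratic": seq_len * seq_len,  # O(N^2): one score per token pair
        "linear": seq_len * inner_dim,   # O(N*s): grows linearly with N
    }


for n in (450, 1500, 6000):
    cost = attention_cost(n, inner_dim=32)  # inner_dim=32 is an assumption
    ratio = cost["quadratic"] / cost["linear"]
    print(f"N={n:>5}: quadratic={cost['quadratic']:>10,} "
          f"linear={cost['linear']:>8,} ratio={ratio:.1f}x")
```

Under these assumptions the ratio is simply N / s, so the gap widens as sequences get longer.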
Features
| Feature | Description |
|---|---|
| Multiple modalities | 1–N input sequences with independent lengths and feature dims |
| Standard attention | softmax: quadratic complexity, for baselines and ablations |
| Efficient attention | linear, performer, flash, bigbird: sub-quadratic complexity |
| Flexible heads | sequence, aggregated, upsample, downsample; mix freely |
| Missing modalities | zero-mask a modality; the model handles it gracefully |
| Config-driven | dict or YAML; no subclassing required |
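Because a model is described by a plain dict, configs can be assembled programmatically. The helper below is a hypothetical convenience function (`make_config` is not part of linmult). It uses only the keys that appear in the quick-start examples (`input_feature_dim`, `heads`, `time_dim_reducer`) and assumes, as those examples suggest, that a single-modality model takes an int while a multimodal one takes a list.

```python
from typing import Any, Dict, List


def make_config(feature_dims: List[int], head_name: str, output_dim: int,
                time_dim_reducer: str = "gap") -> Dict[str, Any]:
    """Build a config dict in the shape used by the quick-start examples."""
    if not feature_dims:
        raise ValueError("at least one modality is required")
    # One modality: plain int (LinT-style); several: list (LinMulT-style).
    input_dim = feature_dims[0] if len(feature_dims) == 1 else list(feature_dims)
    return {
        "input_feature_dim": input_dim,
        "heads": [{"name": head_name, "type": "simple", "output_dim": output_dim}],
        "time_dim_reducer": time_dim_reducer,
    }


config = make_config([25, 35, 256], head_name="sentiment", output_dim=3)
print(config)
```

The resulting dict could then be passed to `LinMulT(config)`, or the same keys could be kept in a YAML file instead.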
Installation
pip install linmult
For development:
git clone https://github.com/fodorad/linmult
cd linmult
pip install -e ".[dev,docs]"
make check
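After either install route, a quick way to confirm that the package is visible to the current interpreter is to query its installed metadata. This is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("linmult")
print(f"linmult version: {version}" if version else "linmult is not installed")
```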
Quick start
LinT — single-modality transformer
import torch
from linmult import LinT

x = torch.rand(8, 1500, 25)  # (batch, time, features)
model = LinT({
    'input_feature_dim': 25,
    'heads': [{'name': 'out', 'type': 'simple', 'output_dim': 5}],
    'time_dim_reducer': 'attentionpool',  # aggregate over time
})
result = model(x)
assert result['out'].shape == (8, 5)
LinMulT — multimodal transformer
import torch
from linmult import LinMulT

x1 = torch.rand(8, 1500, 25)  # (batch, time, features)
x2 = torch.rand(8, 450, 35)
x3 = torch.rand(8, 450, 256)
model = LinMulT({
    'input_feature_dim': [25, 35, 256],
    'heads': [{'name': 'sentiment', 'type': 'simple', 'output_dim': 3}],
    'time_dim_reducer': 'gap',
})
result = model([x1, x2, x3])
assert result['sentiment'].shape == (8, 3)
Switching attention type
from linmult import LinT

model = LinT({
    'input_feature_dim': 64,
    'heads': [{'name': 'out', 'type': 'simple', 'output_dim': 10}],
    'attention_type': 'flash',  # linear, performer, flash, bigbird, softmax, mha
    'flash_query_key_dim': 32,  # flash (GAU) scoring dimension
})
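Since the attention variant is just a config value, an ablation over variants can be expressed as a loop over config dicts. The sketch below builds one config per name listed in the comment above. The helper name `configs_for_ablation` is hypothetical, and setting `flash_query_key_dim` only for `'flash'` is an assumption drawn from the example rather than a documented requirement.

```python
import copy

ATTENTION_TYPES = ["linear", "performer", "flash", "bigbird", "softmax", "mha"]

BASE_CONFIG = {
    "input_feature_dim": 64,
    "heads": [{"name": "out", "type": "simple", "output_dim": 10}],
}


def configs_for_ablation(base, attention_types):
    """Return {attention_type: config}, leaving the base config untouched."""
    configs = {}
    for name in attention_types:
        cfg = copy.deepcopy(base)  # avoid sharing the nested 'heads' list
        cfg["attention_type"] = name
        if name == "flash":
            cfg["flash_query_key_dim"] = 32  # value taken from the example
        configs[name] = cfg
    return configs


ablation = configs_for_ablation(BASE_CONFIG, ATTENTION_TYPES)
for name, cfg in ablation.items():
    print(name, sorted(cfg))
```

Each config could then be passed to `LinT(cfg)` in turn, which keeps everything except the attention variant identical across runs.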
Documentation
Similar projects using LinMulT
BlinkLinMulT (2023)
LinMulT trained for blink presence detection and eye state recognition across 7 public benchmark databases.
PersonalityLinMulT (2022)
LinMulT trained for Big Five personality trait estimation and sentiment analysis (MOSI, MOSEI, First Impressions V2).
- Paper: Multimodal Sentiment and Personality Perception Under Speech
- Code: github.com/fodorad/PersonalityLinMulT
Citation
If you find this work helpful, please cite the relevant paper:
Eye blink detection (2023)
@article{blinklinmult-fodor23,
  title = {BlinkLinMulT: Transformer-based Eye Blink Detection},
  author = {Fodor, {\'A}d{\'a}m and Fenech, Kristian and L{\H{o}}rincz, Andr{\'a}s},
  journal = {Journal of Imaging},
  pages = {1--19},
  year = {2023}
}
Personality and sentiment estimation (2022)
@InProceedings{pmlr-v173-fodor22a,
  title = {Multimodal Sentiment and Personality Perception Under Speech:
           A Comparison of Transformer-based Architectures},
  author = {Fodor, {\'A}d{\'a}m and Saboundji, Rachid R. and
            Jacques Junior, Julio C. S. and Escalera, Sergio and
            Gallardo-Pujol, David and L{\H{o}}rincz, Andr{\'a}s},
  booktitle = {Understanding Social Behavior in Dyadic and Small Group Interactions},
  pages = {218--241},
  year = {2022},
  volume = {173},
  series = {Proceedings of Machine Learning Research},
  publisher = {PMLR},
  url = {https://proceedings.mlr.press/v173/fodor22a.html}
}
Contact
Ádám Fodor — adamfodor.com · fodorad201@gmail.com