MMSA
A unified framework for Multimodal Sentiment Analysis tasks.
Note: We strongly recommend browsing the overall structure of our code first. Feel free to contact us if you require any further information.
Supported Models
Type | Model Name | From |
---|---|---|
Single-Task | EF_LSTM | MultimodalDNN |
Single-Task | LF_DNN | - |
Single-Task | TFN | Tensor-Fusion-Network |
Single-Task | LMF | Low-rank-Multimodal-Fusion |
Single-Task | MFN | Memory-Fusion-Network |
Single-Task | Graph-MFN | Graph-Memory-Fusion-Network |
Single-Task | MulT (without CTC) | Multimodal-Transformer |
Single-Task | BERT-MAG | MAG-BERT |
Single-Task | MFM | - |
Single-Task | MISA | MISA |
Multi-Task | MLF_DNN | MMSA |
Multi-Task | MTFN | MMSA |
Multi-Task | MLMF | MMSA |
Multi-Task | SELF_MM | Self-MM |
Results
Detailed results are shown in `results/result-stat.md`.
Usage
Clone the repository
- Clone this repo and install the requirements. Create a virtual environment if needed.
```bash
git clone https://github.com/thuiar/MMSA
cd MMSA
# conda create -n mmsa python=3.6
pip install -r requirements.txt
```
Datasets and pre-trained BERTs
Download the dataset features and pre-trained BERT models from the following links.
- Baidu Cloud Drive (code: `ctgs`)
- Google Cloud Drive
For all features, you can use the SHA-1 hash values below to check file consistency.

File | SHA-1 |
---|---|
MOSI/unaligned_50.pkl | 5da0b8440fc5a7c3a457859af27458beb993e088 |
MOSI/aligned_50.pkl | 5c62b896619a334a7104c8bef05d82b05272c71c |
MOSEI/unaligned_50.pkl | db3e2cff4d706a88ee156981c2100975513d4610 |
MOSEI/aligned_50.pkl | ef49589349bc1c2bc252ccc0d4657a755c92a056 |
SIMS/unaligned_39.pkl | a00c73e92f66896403c09dbad63e242d5af756f8 |
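As an illustration, a downloaded file can be verified with Python's standard `hashlib` module (a minimal sketch; the file path assumes you kept the layout shown in the table above):

```python
import hashlib

def sha1_of_file(path, chunk_size=1 << 20):
    """Return the SHA-1 hex digest of a file, read in 1 MiB chunks."""
    sha1 = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            sha1.update(chunk)
    return sha1.hexdigest()

# Compare against the hash listed in the table above.
print(sha1_of_file("MOSI/unaligned_50.pkl") == "5da0b8440fc5a7c3a457859af27458beb993e088")
```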
Due to size limitations, the MOSEI features and SIMS raw videos are available in the Baidu Cloud Drive only. All dataset features are organized as follows:
```python
{
    "train": {
        "raw_text": [],
        "audio": [],
        "vision": [],
        "id": [],                     # [video_id$_$clip_id, ..., ...]
        "text": [],
        "text_bert": [],
        "audio_lengths": [],
        "vision_lengths": [],
        "annotations": [],
        "classification_labels": [],  # Negative (< 0), Neutral (0), Positive (> 0)
        "regression_labels": []
    },
    "valid": {***},  # same as "train"
    "test": {***},   # same as "train"
}
```
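Since each feature file is a pickled dictionary with the structure above, it can be inspected directly; a minimal sketch (the path is illustrative):

```python
import pickle

# Load one of the downloaded feature files.
with open("MOSI/unaligned_50.pkl", "rb") as f:
    data = pickle.load(f)

train = data["train"]
print(train.keys())            # the fields listed above
print(len(train["raw_text"]))  # number of training clips
print(train["id"][0])          # formatted as video_id$_$clip_id
```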
For MOSI and MOSEI, the pre-extracted text features come from BERT, unlike the original GloVe features in the CMU-Multimodal-SDK.
For SIMS, if you want to extract features from raw videos, you need to install the OpenFace toolkit first, and then refer to our code in `data/DataPre.py`:

```bash
python data/DataPre.py --data_dir [path_to_Dataset] --language ** --openface2Path [path_to_FeatureExtraction]
```
For BERT models, you can also download BERT-Base, Chinese from Google-Bert, and then convert the TensorFlow checkpoint into PyTorch using transformers-cli.
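Alternatively to transformers-cli, the same conversion can be done in Python, since the transformers library can load TensorFlow 1.x checkpoints directly. A minimal sketch (the file names assume the layout of the unzipped Google BERT-Base, Chinese release, and TensorFlow must be installed for the conversion to work):

```python
from transformers import BertConfig, BertForPreTraining

# Paths assume the unzipped Google release: chinese_L-12_H-768_A-12/
config = BertConfig.from_json_file("chinese_L-12_H-768_A-12/bert_config.json")
model = BertForPreTraining.from_pretrained(
    "chinese_L-12_H-768_A-12/bert_model.ckpt.index",  # TF 1.x checkpoint index
    from_tf=True,
    config=config,
)
model.save_pretrained("bert_base_chinese_pytorch")  # writes pytorch_model.bin + config.json
```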
Then, modify `config/config_*.py` to update the dataset paths.
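The exact variable names inside `config/config_*.py` depend on the code version; purely as a hypothetical illustration, the update amounts to pointing each dataset at its feature file:

```python
# Hypothetical sketch -- the real key names in config/config_*.py may differ.
root_dataset_dir = "/path/to/datasets"  # <- change to your local directory

data_paths = {
    "mosi":  root_dataset_dir + "/MOSI/unaligned_50.pkl",
    "mosei": root_dataset_dir + "/MOSEI/unaligned_50.pkl",
    "sims":  root_dataset_dir + "/SIMS/unaligned_39.pkl",
}
```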
Run
```bash
python run.py --modelName *** --datasetName ***
```
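For example, `python run.py --modelName tfn --datasetName mosi` would train TFN on MOSI, assuming the lowercase spellings match the model and dataset names registered in the config files (the exact strings are defined there).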
Paper
- CH-SIMS: A Chinese Multimodal Sentiment Analysis Dataset with Fine-grained Annotation of Modality
- Learning Modality-Specific Representations with Self-Supervised Multi-Task Learning for Multimodal Sentiment Analysis
Please cite our paper if you find our work useful for your research:
```
@inproceedings{yu2020ch,
  title={CH-SIMS: A Chinese Multimodal Sentiment Analysis Dataset with Fine-grained Annotation of Modality},
  author={Yu, Wenmeng and Xu, Hua and Meng, Fanyang and Zhu, Yilin and Ma, Yixiao and Wu, Jiele and Zou, Jiyun and Yang, Kaicheng},
  booktitle={Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics},
  pages={3718--3727},
  year={2020}
}

@article{yu2021learning,
  title={Learning Modality-Specific Representations with Self-Supervised Multi-Task Learning for Multimodal Sentiment Analysis},
  author={Yu, Wenmeng and Xu, Hua and Yuan, Ziqi and Wu, Jiele},
  journal={arXiv preprint arXiv:2102.04830},
  year={2021}
}
```