MMKit-Features: Multimodal Features Extraction Toolkit
A lightweight Python library to extract, fuse and store multimodal features for deep learning.
Objectives
- To extract, store and fuse various features from multimodal datasets rapidly and efficiently;
- To provide a common multimodal information processing framework for multimodal features;
- To achieve generative adversarial network (GAN)-based multimodal knowledge representation dynamically.
Framework
Modalities
- Text/Language modality
- Image modality
- Video modality
- Speech/sound modality
- Cross-modality between the modalities above
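To illustrate how features from these modalities can line up for fusion, here is a toy sketch in plain NumPy (not part of the MMKit-Features API): each modality contributes a feature matrix plus an `intervals` array of `[start, end]` timestamps, mirroring the `features`/`intervals` items used in the library example below. Shared intervals are what make cross-modal fusion possible.

```python
import numpy as np

# Hypothetical per-modality features, all covering the same time interval.
modalities = {
    "text":   {"features": np.array([[0.1, 0.2]]),      "intervals": np.array([[0.0, 1.0]])},
    "image":  {"features": np.array([[0.5, 0.6, 0.7]]), "intervals": np.array([[0.0, 1.0]])},
    "speech": {"features": np.array([[0.9]]),           "intervals": np.array([[0.0, 1.0]])},
}

# Early fusion over one shared interval: concatenate per-modality feature rows.
fused = np.concatenate([m["features"][0] for m in modalities.values()])
print(fused.shape)  # (6,)
```

This is the simplest fusion strategy (early/feature-level concatenation); the toolkit's node structure stores each modality's features and intervals so that such alignment can be done downstream.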
Usage
A toy example showing how to build a multimodal features (MMF) library:

```python
from mmkfeatures.fusion.mm_features_lib import MMFeaturesLib
from mmkfeatures.fusion.mm_features_node import MMFeaturesNode
import numpy as np

if __name__ == "__main__":
    # 1. Create an empty multimodal features library with root and dataset names.
    feature_lib = MMFeaturesLib(root_name="test features", dataset_name="test_features")
    # 2. Set short names for each feature dimension for convenience.
    feature_lib.set_features_name(["feature1", "feature2", "feature3"])
    # 3. Set a list of content IDs.
    content_ids = ["content1", "content2", "content3"]
    # 4. For each content ID, assign a group of features with their intervals.
    features_dict = {}
    for content_id in content_ids:
        mmf_node = MMFeaturesNode(content_id)
        mmf_node.set_item("name", str(content_id))
        mmf_node.set_item("features", np.array([[1, 2, 3]]))
        mmf_node.set_item("intervals", np.array([[0, 1]]))
        features_dict[content_id] = mmf_node
    # 5. Set the library's data.
    feature_lib.set_data(features_dict)
    # 6. Save the features to disk for future use.
    feature_lib.save_data("test6_feature.csd")
    # 7. Inspect the structure of the library file (h5py/HDF5 format).
    feature_lib.show_structure("test6_feature.csd")
    # 8. Glance at the feature content within the dataset.
    feature_lib.show_sample_data("test6_feature.csd")
```
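Since step 7 notes that the saved `.csd` file uses the h5py (HDF5) format, it can also be inspected with generic h5py calls. The sketch below writes a small HDF5 file with the kind of per-content layout the example suggests (`features`/`intervals` under each content ID, which is an assumption for illustration, not the toolkit's documented schema) and then walks its structure:

```python
import h5py
import numpy as np

# Write a toy HDF5 file with an assumed MMF-like layout.
with h5py.File("toy.csd", "w") as f:
    grp = f.create_group("content1")
    grp.create_dataset("features", data=np.array([[1, 2, 3]]))
    grp.create_dataset("intervals", data=np.array([[0, 1]]))

def print_tree(name, obj):
    """Print every group/dataset path in the file."""
    kind = "group" if isinstance(obj, h5py.Group) else "dataset"
    print(f"{kind}: {name}")

with h5py.File("toy.csd", "r") as f:
    f.visititems(print_tree)
```

Walking the real `.csd` file the same way is a useful complement to `show_structure` when debugging what was actually written to disk.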
Further instructions on the toolkit can be found here.
Credits
The project includes source code from various open-source contributors, listed below.
- A2Zadeh/CMU-MultimodalSDK
- aishoot/Speech_Feature_Extraction
- antoine77340/video_feature_extractor
- jgoodman8/py-image-features-extractor
- v-iashin/Video Features
License
Please cite our project if you use it in your research.
Chen, D. (2022). MMKit-Features: Multimodal Features Extraction Toolkit (Version 0.0.1) [Computer software]
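The citation above can also be written as a BibTeX entry; this is a sketch assembled only from the fields given in the citation (the entry key and field choices are assumptions):

```bibtex
@software{chen2022mmkit,
  author  = {Chen, D.},
  title   = {MMKit-Features: Multimodal Features Extraction Toolkit},
  version = {0.0.1},
  year    = {2022}
}
```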