MMKit-Features: Multimodal Features Extraction Toolkit
A lightweight Python library to extract, fuse and store multimodal features for deep learning.
Objectives
- To extract, store and fuse various features from multimodal datasets rapidly and efficiently;
- To provide a common multimodal information processing framework for multimodal features;
- To dynamically build multimodal knowledge representations based on generative adversarial networks (GANs).
Framework
Modalities
- Text/Language modality
- Image modality
- Video modality
- Speech/sound modality
- Cross-modality combinations of the above
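Combining modalities usually means joining per-modality feature arrays into one representation. As a minimal sketch (plain NumPy, not the MMKit-Features API; the helper name `early_fusion` and the array shapes are hypothetical), early fusion by concatenation could look like this:

```python
import numpy as np


def early_fusion(*modality_features: np.ndarray) -> np.ndarray:
    """Concatenate per-modality feature matrices along the feature axis.

    Hypothetical helper for illustration only. Each input is expected to have
    shape (n_items, n_dims_for_that_modality).
    """
    n_items = {f.shape[0] for f in modality_features}
    if len(n_items) != 1:
        raise ValueError("all modalities must describe the same number of items")
    return np.concatenate(modality_features, axis=1)


# Two items, described by 4 text dimensions and 3 image dimensions (made-up data).
text_feats = np.ones((2, 4))
image_feats = np.zeros((2, 3))

fused = early_fusion(text_feats, image_feats)
print(fused.shape)  # (2, 7)
```

Concatenation is only one possible fusion strategy; it is shown here because it needs no training and makes the resulting feature layout easy to inspect.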
Usage
The following toy example shows how to build a multimodal feature (MMF) library:
from mmkfeatures.fusion.mm_features_lib import MMFeaturesLib
from mmkfeatures.fusion.mm_features_node import MMFeaturesNode
import numpy as np

if __name__ == "__main__":
    # 1. create an empty multimodal features library with root and dataset names
    feature_lib = MMFeaturesLib(root_name="test features", dataset_name="test_features")
    # 2. set short names for each dimension for convenience
    feature_lib.set_features_name(["feature1", "feature2", "feature3"])
    # 3. set a list of content IDs
    content_ids = ["content1", "content2", "content3"]
    # 4. for each content ID, create a node holding its features and their intervals
    features_dict = {}
    for content_id in content_ids:
        mmf_node = MMFeaturesNode(content_id)
        mmf_node.set_item("name", str(content_id))
        mmf_node.set_item("features", np.array([[1, 2, 3]]))
        mmf_node.set_item("intervals", np.array([[0, 1]]))
        features_dict[content_id] = mmf_node
    # 5. set the library's data
    feature_lib.set_data(features_dict)
    # 6. save the features to disk for future use
    feature_lib.save_data("test6_feature.csd")
    # 7. print the structure of the saved library file (h5py format)
    feature_lib.show_structure("test6_feature.csd")
    # 8. take a quick look at the feature content within the dataset
    feature_lib.show_sample_data("test6_feature.csd")
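In the example above, each node pairs a `features` array with an `intervals` array. Assuming that every feature row is meant to have exactly one `[start, end]` interval row (an assumption drawn from the toy data, not a documented guarantee), a small sanity check before calling `set_data` might look like the following. `validate_node` is a hypothetical helper and not part of MMKit-Features.

```python
import numpy as np


def validate_node(features: np.ndarray, intervals: np.ndarray) -> None:
    """Hypothetical helper: sanity-check the arrays stored in one node.

    Assumes one [start, end] interval row per feature row.
    """
    if features.ndim != 2:
        raise ValueError("features must be 2-D: (segments, feature_dims)")
    if intervals.ndim != 2 or intervals.shape[1] != 2:
        raise ValueError("intervals must be 2-D: (segments, 2)")
    if features.shape[0] != intervals.shape[0]:
        raise ValueError("each feature row needs exactly one interval row")
    if np.any(intervals[:, 0] > intervals[:, 1]):
        raise ValueError("interval start must not exceed interval end")


# Same arrays as in the toy example: one segment with three feature dimensions.
features = np.array([[1, 2, 3]])
intervals = np.array([[0, 1]])
validate_node(features, intervals)  # passes silently
```

Running such a check per node before saving keeps shape mistakes from being written into the library file, where they are harder to spot.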
Further instructions on the toolkit can be found here.
Credits
The project includes source code from various open-source contributors. Their projects are listed below.
- A2Zadeh/CMU-MultimodalSDK
- aishoot/Speech_Feature_Extraction
- antoine77340/video_feature_extractor
- jgoodman8/py-image-features-extractor
- v-iashin/video_features
License
Please cite our project if you use it in your research.
Chen, D. (2022). MMKit-Features: Multimodal Features Extraction Toolkit (Version 0.0.1) [Computer software]
Download files
Source Distribution
Built Distribution
File details
Details for the file mmkit-features-0.0.1a0.tar.gz.
File metadata
- Download URL: mmkit-features-0.0.1a0.tar.gz
- Upload date:
- Size: 94.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.8.0 pkginfo/1.8.2 readme-renderer/34.0 requests/2.27.1 requests-toolbelt/0.9.1 urllib3/1.26.9 tqdm/4.63.0 importlib-metadata/4.11.3 keyring/23.5.0 rfc3986/2.0.0 colorama/0.4.4 CPython/3.9.11
File hashes
Algorithm | Hash digest
---|---
SHA256 | f07771c43a8bae1947bdbfa13fc4d275bd480472de6856d3ca9389d7d1f161ac
MD5 | 67d77f9f264cf7a30c7eaee0d06236fd
BLAKE2b-256 | b663e5f6e9332cd0a48a558dcb9d7e2ba45a65f81754f26b76fbdbb001ace0c8
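To check that a downloaded archive matches the SHA256 digest listed above, the digest can be recomputed locally with Python's standard `hashlib` module. This is a minimal sketch; `sha256_of_file` is a hypothetical helper name, and the archive is assumed to sit in the current working directory.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Digest copied from the "File hashes" table for the source distribution.
EXPECTED_SHA256 = "f07771c43a8bae1947bdbfa13fc4d275bd480472de6856d3ca9389d7d1f161ac"

# Assumed location of the downloaded file; adjust the path as needed.
archive = Path("mmkit-features-0.0.1a0.tar.gz")
if archive.exists():
    actual = sha256_of_file(archive)
    print("digest matches" if actual == EXPECTED_SHA256 else "digest MISMATCH")
```

Reading in chunks keeps memory use constant regardless of file size; the same function works for the wheel file by swapping in its file name and digest.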
File details
Details for the file mmkit_features-0.0.1a0-py3-none-any.whl.
File metadata
- Download URL: mmkit_features-0.0.1a0-py3-none-any.whl
- Upload date:
- Size: 124.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.8.0 pkginfo/1.8.2 readme-renderer/34.0 requests/2.27.1 requests-toolbelt/0.9.1 urllib3/1.26.9 tqdm/4.63.0 importlib-metadata/4.11.3 keyring/23.5.0 rfc3986/2.0.0 colorama/0.4.4 CPython/3.9.11
File hashes
Algorithm | Hash digest
---|---
SHA256 | 1b9fe9e66d7fbedc28b35d9c0de0176cb1c124d6cb2d025f72dc53c45fb71ac9
MD5 | 07e26996581d625bc5511c459163b586
BLAKE2b-256 | 9b9efd333cf9249db6b8242df441033afe9f872f5a6ef488df0f61acb91990e8