ast - Pytorch

AST

Implementation of AST from the paper "AST: Audio Spectrogram Transformer" in PyTorch and Zeta. The model takes a 2D input tensor representing audio, splits it into patches, applies a linear projection to each patch, adds position embeddings, and then passes the result through a stack of attention and feedforward layers. Please join Agora and tag me if this could be improved in any capacity.
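To make the pipeline above concrete, here is a minimal sketch of the same sequence of steps (patchify, linear projection, position embeddings, then attention and feedforward in a loop) using only stock PyTorch modules. It is not the code in ast_torch: the class name `TinyASTSketch`, the pre-norm residual layout, the 4x feedforward width, and the reading of the 2D input as (batch, seqlen) are all assumptions made for illustration.

```python
import torch
from torch import nn


class TinyASTSketch(nn.Module):
    """Hypothetical illustration of the pipeline; not the ast_torch model."""

    def __init__(self, seqlen=16, patch_size=4, dim=8, heads=2, depth=2):
        super().__init__()
        assert seqlen % patch_size == 0, "seqlen must be divisible by patch_size"
        self.patch_size = patch_size
        num_patches = seqlen // patch_size

        # Linear projection: each patch of patch_size values -> dim features
        self.proj = nn.Linear(patch_size, dim)
        # Learned position embedding, one vector per patch (assumed design)
        self.pos_emb = nn.Parameter(torch.zeros(1, num_patches, dim))

        # depth x (attention + feedforward), each with a pre-norm residual
        self.layers = nn.ModuleList([
            nn.ModuleDict({
                "norm1": nn.LayerNorm(dim),
                "attn": nn.MultiheadAttention(dim, heads, batch_first=True),
                "norm2": nn.LayerNorm(dim),
                "ff": nn.Sequential(
                    nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim)
                ),
            })
            for _ in range(depth)
        ])

    def forward(self, x):
        batch, seqlen = x.shape
        # Patchify: (batch, seqlen) -> (batch, num_patches, patch_size)
        patches = x.reshape(batch, seqlen // self.patch_size, self.patch_size)
        # Project patches and add position information
        h = self.proj(patches) + self.pos_emb
        # Attention and feedforward in a loop over layers
        for layer in self.layers:
            q = layer["norm1"](h)
            attn_out, _ = layer["attn"](q, q, q)
            h = h + attn_out
            h = h + layer["ff"](layer["norm2"](h))
        return h


x = torch.randn(2, 16)
out = TinyASTSketch()(x)
print(out.shape)  # torch.Size([2, 4, 8]): 2 inputs, 16 / 4 = 4 patches, dim 8
```

The sketch returns one embedding per patch; a classifier would typically pool these and apply a final linear head, which is left out here.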

Install

pip3 install ast-torch

Usage

import torch
from ast_torch.model import ASTransformer

# Create dummy data: a 2D tensor of shape (2, 16); the second dimension matches seqlen=16 below
x = torch.randn(2, 16)

# Initialize model
model = ASTransformer(
    dim=4, seqlen=16, dim_head=4, heads=4, depth=2, patch_size=4
)

# Run model and print output shape
print(model(x).shape)

Citation

@misc{gong2021ast,
    title={AST: Audio Spectrogram Transformer}, 
    author={Yuan Gong and Yu-An Chung and James Glass},
    year={2021},
    eprint={2104.01778},
    archivePrefix={arXiv},
    primaryClass={cs.SD}
}

License

MIT
