This package is written for text-to-audio/music generation.
AudioLDM 2
This repo currently supports text-to-audio generation (including music).
Web App
- Prepare running environment
conda create -n audioldm python=3.8; conda activate audioldm
pip3 install audioldm2
git clone https://github.com/haoheliu/AudioLDM2; cd AudioLDM2
- Start the web application (powered by Gradio)
python3 app.py
- A link will be printed to the terminal. Open it in your browser to try the demo.
Command-Line Usage
Prepare running environment
# Optional
conda create -n audioldm python=3.8; conda activate audioldm
# Install AudioLDM 2
pip3 install git+https://github.com/haoheliu/AudioLDM2.git
- Generate based on a text prompt
audioldm2 -t "Musical constellations twinkling in the night sky, forming a cosmic melody."
- Generate based on a list of text prompts
audioldm2 -tl batch.lst
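The format of batch.lst is not described here. The following is a minimal sketch, assuming the file holds one text prompt per line (an assumption; check the repository for the exact format). The second and third prompts are made up for illustration.

```shell
# Write three example prompts to batch.lst, one per line.
# Assumption: "audioldm2 -tl" reads one prompt per line.
cat > batch.lst <<'EOF'
Musical constellations twinkling in the night sky, forming a cosmic melody.
A dog barking in the distance while rain falls on a tin roof.
A slow piano melody with soft strings in the background.
EOF

# Run the batch only if the CLI is installed; otherwise report the prompt count.
if command -v audioldm2 >/dev/null 2>&1; then
  audioldm2 -tl batch.lst
else
  echo "audioldm2 not found; batch.lst holds $(wc -l < batch.lst | tr -d ' ') prompts"
fi
```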
Random Seed Matters
The model may sometimes perform poorly (output that sounds weird or low quality) when you switch to different hardware. In that case, try different random seeds to find one that works well on your hardware.
audioldm2 --seed 1234 -t "Musical constellations twinkling in the night sky, forming a cosmic melody."
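Searching for a good seed can be scripted. The sketch below loops over a few arbitrary example seeds using only the --seed and -t options shown above; the seed values and the RUN fallback variable are illustrative choices, not part of the tool.

```shell
PROMPT="Musical constellations twinkling in the night sky, forming a cosmic melody."
SEEDS="0 42 1234"   # arbitrary example seeds; pick your own

# Fall back to printing the commands when audioldm2 is not installed.
if command -v audioldm2 >/dev/null 2>&1; then RUN=""; else RUN="echo"; fi

count=0
for seed in $SEEDS; do
  # One generation per seed, same prompt each time.
  $RUN audioldm2 --seed "$seed" -t "$PROMPT"
  count=$((count + 1))
done
echo "Tried $count seeds"
```

Listen to each result and keep the seed that sounds best. This sketch does not cover where the output files are written or whether later runs overwrite earlier ones, so check that for your installed version before comparing.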
Pretrained Models
You can choose a model checkpoint with the --model_name option:
audioldm2 --model_name "audioldm2-full-large-650k" -t "Musical constellations twinkling in the night sky, forming a cosmic melody."
Three checkpoints are currently available:
- audioldm2-full (default): This checkpoint can perform both sound effect and music generation.
- audioldm2-music-665k: This checkpoint is specialized for music generation.
- audioldm2-full-large-650k: This checkpoint is a larger version of audioldm2-full.
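If you switch between checkpoints often, a small wrapper can save typing. The sketch below maps a short use-case keyword to one of the three checkpoint names listed above; pick_model and its keywords (music, large) are hypothetical helpers for illustration, not part of audioldm2.

```shell
# pick_model is a hypothetical helper, not provided by audioldm2.
# It maps a short keyword to one of the checkpoint names listed above.
pick_model() {
  case "$1" in
    music) echo "audioldm2-music-665k" ;;       # music only
    large) echo "audioldm2-full-large-650k" ;;  # larger general model
    *)     echo "audioldm2-full" ;;             # default: sound effects and music
  esac
}

MODEL="$(pick_model music)"
echo "Selected checkpoint: $MODEL"

# Generate only if the CLI is installed.
if command -v audioldm2 >/dev/null 2>&1; then
  audioldm2 --model_name "$MODEL" -t "Musical constellations twinkling in the night sky, forming a cosmic melody."
fi
```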
Evaluation results on the AudioCaps and MusicCaps evaluation sets:
Coming soon.
Cite this work
If you find this tool useful, please consider citing:
The AudioLDM 2 paper is coming soon.
@article{liu2023audioldm,
title={AudioLDM: Text-to-Audio Generation with Latent Diffusion Models},
author={Liu, Haohe and Chen, Zehua and Yuan, Yi and Mei, Xinhao and Liu, Xubo and Mandic, Danilo and Wang, Wenwu and Plumbley, Mark D},
journal={arXiv preprint arXiv:2301.12503},
year={2023}
}