TokenSynth: A Token-based Neural Synthesizer for Instrument Cloning and Text-to-Instrument

This is the official implementation of "TokenSynth: A Token-based Neural Synthesizer for Instrument Cloning and Text-to-Instrument", accepted to ICASSP 2025 (in press).

Installation

Install TokenSynth from PyPI:

pip install tokensynth

Quickstart

from tokensynth import TokenSynth, CLAP, DACDecoder
import audiofile
import torch

# Set file paths
ref_audio = "media/reference_audio.wav"
midi = "media/input_midi.mid"

# Initialize models
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
synth = TokenSynth.from_pretrained(aug=True)
clap = CLAP(device=device)
decoder = DACDecoder(device=device)

with torch.no_grad():
    # Extract timbre embeddings from audio and text
    timbre_audio = clap.encode_audio(ref_audio)
    timbre_text = clap.encode_text("warm smooth electronic bass")
    timbre_audio_text = 0.5 * timbre_audio + 0.5 * timbre_text

    # Generate audio tokens
    tokens_audio = synth.synthesize(timbre_audio, midi, top_k=10)
    tokens_text = synth.synthesize(timbre_text, midi, top_p=0.6, guidance_scale=1.6)
    tokens_audio_text = synth.synthesize(timbre_audio_text, midi, top_p=0.6, guidance_scale=1.6)

    # Decode tokens into audio waveforms
    audio_audio = decoder.decode(tokens_audio)
    audio_text = decoder.decode(tokens_text)
    audio_audio_text = decoder.decode(tokens_audio_text)

# Save audio files
audiofile.write("media/output_audio.wav", audio_audio.cpu().numpy(), 16000)
audiofile.write("media/output_text.wav", audio_text.cpu().numpy(), 16000)
audiofile.write("media/output_audio_text.wav", audio_audio_text.cpu().numpy(), 16000)
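
The quickstart mixes the audio and text timbre embeddings with fixed 0.5 / 0.5 weights. A minimal sketch of the same linear blend with an adjustable weight is shown below. The helper name blend_timbres and its weight parameter are hypothetical and not part of the tokensynth package; the sketch also assumes both embeddings have the same shape and support ordinary arithmetic, as the quickstart line suggests.

```python
def blend_timbres(a, b, weight=0.5):
    """Linearly interpolate between two timbre embeddings.

    Hypothetical helper (not part of tokensynth).
    weight=0.0 returns a unchanged, weight=1.0 returns b unchanged,
    and weight=0.5 matches the 0.5 * a + 0.5 * b mix in the quickstart.
    Works on any objects that support scalar multiplication and addition
    (for example torch tensors, NumPy arrays, or plain floats).
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be in [0, 1]")
    return (1.0 - weight) * a + weight * b


# Assumed usage with the quickstart variables:
# timbre_audio_text = blend_timbres(timbre_audio, timbre_text, weight=0.3)
```

A weight closer to 0 keeps the result nearer the reference audio embedding, and a weight closer to 1 moves it toward the text embedding. How the synthesized timbre responds to intermediate weights is not documented here and would need to be checked by listening.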

You can also run python quickstart.py from the project root directory.
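
The quickstart passes top_k and top_p to synth.synthesize without describing them. The sketch below illustrates what these two options conventionally mean in token sampling, using only the standard library. It is a generic illustration, not TokenSynth's internal code: the function names are hypothetical, and whether the library filters in exactly this way is an assumption.

```python
def top_k_filter(probs, k):
    """Keep the k most likely tokens, zero out the rest, and renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep = set(order[:k])
    kept = [p if i in keep else 0.0 for i, p in enumerate(probs)]
    total = sum(kept)
    return [p / total for p in kept]


def top_p_filter(probs, p):
    """Keep the smallest set of most likely tokens whose cumulative
    probability reaches p (nucleus sampling), then renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, cumulative = set(), 0.0
    for i in order:
        keep.add(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    kept = [q if i in keep else 0.0 for i, q in enumerate(probs)]
    total = sum(kept)
    return [q / total for q in kept]


# Toy distribution over four tokens.
probs = [0.5, 0.3, 0.15, 0.05]
filtered_k = top_k_filter(probs, k=2)    # only the two most likely tokens survive
filtered_p = top_p_filter(probs, p=0.6)  # tokens are added until 60% mass is covered
```

In this convention, smaller values of top_k or top_p restrict sampling to fewer candidate tokens, which generally makes the output more deterministic.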

Citation

A formal citation (BibTeX) will be available once this work is published.

For now, please cite this repository as:

Kyungsu Kim, Junghyun Koo, Sungho Lee, Haesun Joung, Kyogu Lee.
TokenSynth: A Token-Based Neural Synthesizer for Instrument Cloning and Text-to-Instrument.
GitHub repository, 2024. Available at: https://github.com/kyungsukim42/tokensynth
