
# Tungnaá Interactive Voice Instruments

Training and GUI inference for interactive artistic text-to-voice models.

## Installation

```sh
pip install tungnaa[gui]    # to use the instrument
pip install tungnaa[train]  # if you are installing on a server to train models
```

## Usage

```sh
tungnaa --help
```

### Models from Hugging Face

```sh
git clone git@hf.co:intelligent-instruments-Lab/tungnaa
```

### Running with the Python Audio Engine

- #todo: model selection from the Tungnaá GUI, including block size and sample rate matching the system (a sample-rate mismatch is currently not supported)
- #todo: audio device selection from the Tungnaá GUI

```sh
tungnaa run --tts models/tts/rtalign_044_jvs.ckpt --vocoder models/vocoder/rave3-jvs-warm200k-lobeta_c052f53b23_streaming.ts --audio-out default
```

### Using SuperCollider, PureData or Max as the Audio Engine

If the `--latent_audio` switch is enabled, Tungnaá streams RAVE latent trajectories over a single audio-rate channel, which can be piped into another audio engine running the RAVE vocoder. On Linux the piping is straightforward with JACK; on macOS, use BlackHole.

SuperCollider: `sclang supercollider/rtvoice-demo.scd`

```sh
tungnaa run --tts models/tts/rtalign_044_jvs.ckpt --latent_audio
```

## Training Models

### Vocoder training

Vocoder training uses the victor-shepardson fork of RAVE.

Example preprocessing with joining of short files (especially useful for datasets containing many short utterances):

```sh
rave preprocess \
--input_path /path/to/audio/directory \
--output_path /path/to/tmp/storage/myravedata \
--num_signal 150000 --sampling_rate 48000 \
--join_short_files
```
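Conceptually, `--join_short_files` groups consecutive short recordings until each chunk reaches the training window length (`--num_signal` samples), so short utterances are not discarded. A hypothetical sketch of that grouping logic (the fork's actual implementation may differ):

```python
def join_short_files(lengths, num_signal):
    """Group consecutive file lengths (in samples) into chunks of at
    least num_signal samples each, as --join_short_files might do."""
    chunks, current, total = [], [], 0
    for n in lengths:
        current.append(n)
        total += n
        if total >= num_signal:  # chunk is long enough for one window
            chunks.append(current)
            current, total = [], 0
    if current:  # leftover files shorter than one window
        chunks.append(current)
    return chunks

# Three 60k-sample utterances fill one 150k-sample window; the rest spill over.
print(join_short_files([60_000, 60_000, 60_000, 40_000], 150_000))
# → [[60000, 60000, 60000], [40000]]
```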

Example transfer learning using the IIL rave-models:

```sh
rave train --name 001-my-vocoder-name \
--config rave-models/voice_multi_b2048_r48000/config.gin --config transfer \
--db_path /path/to/tmp/storage/myravedata \
--out_path /path/to/rave/runs \
--transfer_ckpt rave-models/voice_multi_b2048_r48000/version_0/checkpoints/last.ckpt \
--n_signal 150000 \
--gpu 0
```

Example export using sign normalization (so that latents correlate with louder/brighter sounds):

```sh
rave export --run /path/to/rave/runs/001-my-vocoder-name \
--streaming --normalize_sign --latent_size ...
```

### Tungnaá preprocessing

See `tungnaa prep --help`.

To use datasets other than `vctk` or `hifitts`, it may be necessary to add an adapter function in `prep.py`.

Example:

```sh
tungnaa prep \
--datasets '{kind:"vctk", path:"/path/to/VCTK"}' \
--rave-path /path/to/rave_streaming.ts \
--out-path /path/to/tmp/dataset_name
```
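An adapter's job is to map a dataset's on-disk layout to (audio file, transcript, speaker) records. A hypothetical sketch of what such a function could look like — the real interface in `prep.py` may differ, and the names here are illustrative only:

```python
from pathlib import Path

def adapt_mydataset(root):
    """Hypothetical adapter: yield (wav_path, text, speaker_id) for a
    layout like root/<speaker>/<utterance>.wav with sibling .txt
    transcripts. Not the actual prep.py interface."""
    root = Path(root)
    for wav in sorted(root.glob("*/*.wav")):
        txt = wav.with_suffix(".txt")
        if not txt.exists():
            continue  # skip audio without a transcript
        yield wav, txt.read_text().strip(), wav.parent.name
```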

### Training

See `tungnaa trainer --help`.

Example:

```sh
tungnaa trainer --experiment 001-my-tts-name \
--model-dir /path/for/checkpoints \
--log-dir /path/for/logs \
--manifest /path/to/tmp/dataset_name/manifest.json \
--rave-model /path/to/rave_streaming.ts \
--lr 3e-4 --lr-text 3e-5 --epoch-size 200 --save-epochs 20 \
--device cuda:0 \
train
```

To resume a stopped training, add `--checkpoint /path/to/checkpoint`.

For transfer learning, add `--checkpoint /path/to/checkpoint --resume False`.

### In-text annotations

`--speaker_annotate` prepends the speaker id determined during preprocessing. With `--speaker_dataset`, the dataset name is included as well.

`--csv /path/to/file.csv` accepts a CSV file in the style of `jvs_labels_encoder_k7.csv`: the first column contains the audio filename without extension, and the second column an annotation to be prepended to the text.

Using all three options together yields:

`"csvval:[dataset:speaker] original text"`
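How the annotations compose can be sketched as follows (hypothetical helper; the parameter names mirror the flags above, but the real formatting lives in the training code):

```python
def annotate(text, speaker=None, dataset=None, csv_label=None):
    """Prepend annotations the way --speaker_annotate, --speaker_dataset
    and --csv combine (illustrative sketch, not the actual code)."""
    if speaker is not None:
        # --speaker_annotate, optionally qualified by --speaker_dataset
        tag = f"{dataset}:{speaker}" if dataset else speaker
        text = f"[{tag}] {text}"
    if csv_label is not None:
        # --csv annotation goes in front of everything else
        text = f"{csv_label}:{text}"
    return text

print(annotate("original text", speaker="speaker", dataset="dataset",
               csv_label="csvval"))
# → csvval:[dataset:speaker] original text
```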

## Developing

Poetry is used for packaging and dependency management. Conda is used for environments and Python version management, and may be replaced by virtualenv or similar.

1. `cd tungnaa`
2. `conda create -n tungnaa python=3.12 ffmpeg`
3. `conda activate tungnaa`
4. `poetry install`

Note that Poetry should not be installed in the project environment; install it via the system package manager, with pipx, or in a separate environment.

To add a dependency, use `poetry add`, or edit `pyproject.toml` and then run `poetry lock; poetry install`.

To add a model, use `dvc add /path/to/model`, then `git add /path/to/model.dvc`. Tungnaá models should go in `models/tts/` and be accompanied by a `model.md` file. Vocoders should go in `models/vocoders`.

### Docs

Run `mkdocs serve` to build and view the documentation.

Run `mkdocs gh-deploy` to deploy to GitHub Pages.

## Non-Python Dependencies

ffmpeg (installed via conda in the setup above).

## Python Dependencies

See `pyproject.toml`.
