Dia-JAX: A JAX port of Dia, the text-to-speech model for generating realistic dialogue from text with emotion and tone control.

Dia-JAX

An experimental JAX port of Dia, the 1.6B-parameter text-to-speech model from Nari Labs

Dia-JAX is a work-in-progress port of the original PyTorch-based Dia model to JAX via Flax NNX.

Features

Like the original Dia model, Dia-JAX aims to offer:

  • Dialogue generation via [S1] and [S2] speaker tags
  • Non-verbal elements such as (laughs) and (coughs)
    • Supported non-verbal tags: (laughs), (clears throat), (sighs), (gasps), (coughs), (singing), (sings), (mumbles), (beep), (groans), (sniffs), (claps), (screams), (inhales), (exhales), (applause), (burps), (humming), (sneezes), (chuckle), (whistles)
  • Voice cloning from reference audio (TODO: not yet implemented)
  • Quality comparable to commercial solutions like ElevenLabs Studio
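
A script can be checked against the tag list before generation. The following is a minimal sketch and not part of diajax: `SUPPORTED_TAGS` is copied from the list above, and `find_unsupported_tags` is a hypothetical helper name. It assumes every parenthesised span in the text is meant as a tag, so ordinary parenthetical asides would also be reported.

```python
import re

# Non-verbal tags copied from the feature list above.
SUPPORTED_TAGS = {
    "laughs", "clears throat", "sighs", "gasps", "coughs", "singing",
    "sings", "mumbles", "beep", "groans", "sniffs", "claps", "screams",
    "inhales", "exhales", "applause", "burps", "humming", "sneezes",
    "chuckle", "whistles",
}


def find_unsupported_tags(text: str) -> list[str]:
    """Return parenthesised tags in `text` that are not in SUPPORTED_TAGS."""
    tags = re.findall(r"\(([^()]+)\)", text)
    return [t for t in tags if t.strip().lower() not in SUPPORTED_TAGS]


script = "[S1] Hello there. (laughs) [S2] Hi! (chuckles)"
print(find_unsupported_tags(script))  # ['chuckles']
```

Here (chuckles) is reported because the list spells the tag (chuckle). Whether the model tolerates such variants is not stated in this README.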

Quickstart

Install via pip

pip install diajax

⚙️ Usage

Note: Currently recommended only for experimental or development use because of memory issues.

Run from Command Line

# Generate audio with default settings
dia --text "[S1] Dear Jacks, to generate audio from text from any machine. (applause) [S2] Really? How! (screams) [S1] With flakes and an axe. (chuckles)"

# Or with custom parameters
dia --temperature 0.7 --cfg-filter-top-k 42

As a Python Library

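import diajax
import soundfile as sf

# Input script using the [S1]/[S2] speaker tags
text = "[S1] Hello from Dia-JAX. [S2] Nice to meet you. (laughs)"

model, config = diajax.load()  # load the model and its config
output = diajax.generate(model, config, text)

# Write the waveform at 44.1 kHz; MP3 output requires a libsndfile build with MP3 support
sf.write('dia.mp3', output, 44100)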
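
The `text` passed to `diajax.generate` presumably uses the same tagged format as the command-line example. As a rough sketch, a small helper can assemble that string from a list of turns. `build_script` is a hypothetical name and not part of diajax, and it accepts only speakers 1 and 2 because those are the only tags this README documents.

```python
def build_script(turns: list[tuple[int, str]]) -> str:
    """Join (speaker, line) pairs into one "[S1] ... [S2] ..." string."""
    parts = []
    for speaker, line in turns:
        # Only [S1] and [S2] are documented, so reject anything else.
        if speaker not in (1, 2):
            raise ValueError(f"speaker must be 1 or 2, got {speaker}")
        parts.append(f"[S{speaker}] {line.strip()}")
    return " ".join(parts)


text = build_script([
    (1, "Have you tried the JAX port yet?"),
    (2, "Not yet. (sighs) Is it fast?"),
])
print(text)
# [S1] Have you tried the JAX port yet? [S2] Not yet. (sighs) Is it fast?
```

The resulting string can be used as the `text` argument in the library example or passed to `dia --text`.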

Acknowledgments

This project is a port of the original Dia model by Nari Labs. We thank them for releasing their model and code, which made this port possible.

License

This project is licensed under the same terms as the original Dia model. See LICENSE for details.
