
Dia-JAX: A JAX port of Dia, the text-to-speech model for generating realistic dialogue from text with emotion and tone control.


Dia-JAX

An experimental JAX port of Dia, the 1.6B-parameter text-to-speech model from Nari Labs.

Dia-JAX is a work-in-progress port of the original PyTorch-based Dia model to JAX via Flax NNX.

Features

Like the original Dia model, Dia-JAX aims to offer:

  • Dialogue generation via [S1] and [S2] speaker tags
  • Non-verbal sounds such as (laughs) and (coughs)
    • Supported non-verbal tags: (laughs), (clears throat), (sighs), (gasps), (coughs), (singing), (sings), (mumbles), (beep), (groans), (sniffs), (claps), (screams), (inhales), (exhales), (applause), (burps), (humming), (sneezes), (chuckle), (whistles)
  • Voice cloning from reference audio (TODO: not yet implemented)
  • Quality comparable to commercial solutions like ElevenLabs Studio
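As a rough illustration of the tag format described above, the sketch below assembles a transcript from speaker turns and flags any parenthesised tag that is not in the supported list. The helper names `build_script` and `unknown_tags` are hypothetical and are not part of the diajax package; the check is also naive, since it treats every parenthesised phrase as a tag.

```python
import re

# Tags copied from the supported list in this README.
SUPPORTED_TAGS = {
    "laughs", "clears throat", "sighs", "gasps", "coughs", "singing",
    "sings", "mumbles", "beep", "groans", "sniffs", "claps", "screams",
    "inhales", "exhales", "applause", "burps", "humming", "sneezes",
    "chuckle", "whistles",
}

# Matches any parenthesised phrase, e.g. "(laughs)" -> "laughs".
TAG_RE = re.compile(r"\(([^()]+)\)")


def build_script(turns):
    """Join (speaker, line) pairs into one transcript with [S1]/[S2] tags."""
    return " ".join(f"[{speaker}] {line}" for speaker, line in turns)


def unknown_tags(script):
    """Return parenthesised tags that are not in SUPPORTED_TAGS, in order."""
    return [t for t in TAG_RE.findall(script) if t.lower() not in SUPPORTED_TAGS]


script = build_script([
    ("S1", "Did you hear that? (gasps)"),
    ("S2", "I did. (yawns)"),
])
print(script)
print(unknown_tags(script))
```

Running a check like this before generation can catch near-miss spellings of a tag, which the model may otherwise read as ordinary text.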

Quickstart

Install via pip

pip install diajax

⚙️ Usage

Note: Dia-JAX is currently recommended only for experimental or development use because of memory issues.

Run from Command Line

# Generate audio with default settings
dia --text "[S1] Dear Jacks, to generate audio from text from any machine. (applause) [S2] Really? How! (screams) [S1] With flakes and an axe. (chuckle)"

# Or with custom parameters
dia --temperature 0.7 --cfg-filter-top-k 42

As a Python Library

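import diajax
import soundfile as sf

# Transcript to synthesize; [S1] and [S2] mark the speakers
text = "[S1] Hello from Dia-JAX. [S2] Nice to meet you. (laughs)"
model, config = diajax.load()
output = diajax.generate(model, config, text)

# Save the generated audio at a 44.1 kHz sample rate
sf.write('dia.mp3', output, 44100)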
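If soundfile or its MP3 support is unavailable, the audio could instead be saved as a WAV file using only the standard library. The sketch below is a minimal version under two assumptions that the text above does not confirm: that the generated output is a one-dimensional sequence of mono float samples, and that the samples lie in the range [-1.0, 1.0]. The helper `write_wav` is hypothetical and not part of diajax.

```python
import struct
import wave


def write_wav(path, samples, sample_rate=44100):
    """Write mono float samples as 16-bit PCM WAV; return the frame count.

    Assumes samples are floats in [-1.0, 1.0]; values outside are clipped.
    """
    frames = bytearray()
    for s in samples:
        clipped = max(-1.0, min(1.0, float(s)))
        # Scale to the signed 16-bit range and pack as little-endian.
        frames += struct.pack("<h", int(round(clipped * 32767)))
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)   # mono (assumption)
        wf.setsampwidth(2)   # 2 bytes = 16-bit PCM
        wf.setframerate(sample_rate)
        wf.writeframes(bytes(frames))
    return len(frames) // 2
```

With the variables from the example above, a call such as `write_wav('dia.wav', output)` would apply, provided the assumptions about the output's shape and range hold.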

Acknowledgments

This project is a port of the original Dia model by Nari Labs. We thank them for releasing their model and code, which made this port possible.

License

This project is licensed under the same terms as the original Dia model. See LICENSE for details.

