Deep learning for Vietnamese Text to Speech

Project description

NViXTTS

NViXTTS is a library for advanced Text-to-Speech generation, modified from the TTS project (TTS==0.22.0).

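It is a replacement for the editable install of the TTS source tree:

pip install --use-deprecated=legacy-resolver -e ./TTS

Import the XTTS classes through their full module paths:

from NViXTTS.tts.configs.xtts_config import XttsConfig
from NViXTTS.tts.models.xtts import Xtts

Or through the top-level package: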

from NViXTTS import XttsConfig
from NViXTTS import Xtts
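
The two import forms above can be combined into a guarded import. This is a minimal sketch that assumes only the module paths shown above; the helper name load_xtts_classes is hypothetical and is not part of NViXTTS. It returns None instead of raising when the package is not installed.

```python
import importlib
import importlib.util


def load_xtts_classes():
    """Return (XttsConfig, Xtts), or None when NViXTTS is not installed.

    Hypothetical helper for illustration; it is not part of NViXTTS.
    """
    if importlib.util.find_spec("NViXTTS") is None:
        return None
    try:
        # Short form: from NViXTTS import XttsConfig, Xtts
        package = importlib.import_module("NViXTTS")
        return package.XttsConfig, package.Xtts
    except (ImportError, AttributeError):
        # Long form: import from the full module paths instead.
        config_module = importlib.import_module("NViXTTS.tts.configs.xtts_config")
        model_module = importlib.import_module("NViXTTS.tts.models.xtts")
        return config_module.XttsConfig, model_module.Xtts


classes = load_xtts_classes()
if classes is None:
    print("NViXTTS is not installed")
else:
    config_class, model_class = classes
    print(config_class.__name__, model_class.__name__)
```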

Note:

I am only a contributor who modified the library to make it easier to use in my own code, so the copyright still follows the original project: link

Directory structure

Root path: ./NViXTTS

- .models.json
- api.py
- bin/
  - collect_env_info.py
  - compute_attention_masks.py
  - compute_embeddings.py
  - compute_statistics.py
  - eval_encoder.py
  - extract_tts_spectrograms.py
  - find_unique_chars.py
  - find_unique_phonemes.py
  - remove_silence_using_vad.py
  - resample.py
  - synthesize.py
  - train_encoder.py
  - train_tts.py
  - train_vocoder.py
  - tune_wavegrad.py
  - __init__.py
- config/
  - shared_configs.py
  - __init__.py
  - __pycache__/
    - shared_configs.cpython-310.pyc
    - __init__.cpython-310.pyc
- demos/
  - xtts_ft_demo/
    - requirements.txt
    - utils/
      - formatter.py
      - gpt_train.py
    - xtts_demo.py
- encoder/
  - configs/
    - base_encoder_config.py
    - emotion_encoder_config.py
    - speaker_encoder_config.py
  - dataset.py
  - losses.py
  - models/
    - base_encoder.py
    - lstm.py
    - resnet.py
    - __init__.py
    - __pycache__/
      - base_encoder.cpython-310.pyc
      - lstm.cpython-310.pyc
      - resnet.cpython-310.pyc
  - README.md
  - requirements.txt
  - utils/
    - generic_utils.py
    - prepare_voxceleb.py
    - training.py
    - visual.py
    - __init__.py
    - __pycache__/
      - generic_utils.cpython-310.pyc
      - __init__.cpython-310.pyc
  - __init__.py
  - __pycache__/
    - losses.cpython-310.pyc
    - __init__.cpython-310.pyc
- model.py
- server/
  - conf.json
  - README.md
  - server.py
  - static/
    - coqui-log-green-TTS.png
  - templates/
    - details.html
    - index.html
  - __init__.py
- tts/
  - configs/
    - align_tts_config.py
    - bark_config.py
    - delightful_tts_config.py
    - fastspeech2_config.py
    - fast_pitch_config.py
    - fast_speech_config.py
    - glow_tts_config.py
    - neuralhmm_tts_config.py
    - overflow_config.py
    - shared_configs.py
    - speedy_speech_config.py
    - tacotron2_config.py
    - tacotron_config.py
    - tortoise_config.py
    - vits_config.py
    - xtts_config.py
    - __init__.py
    - __pycache__/
      - shared_configs.cpython-310.pyc
      - xtts_config.cpython-310.pyc
      - __init__.cpython-310.pyc
  - datasets/
    - dataset.py
    - formatters.py
    - __init__.py
    - __pycache__/
      - dataset.cpython-310.pyc
      - formatters.cpython-310.pyc
      - __init__.cpython-310.pyc
  - layers/
    - align_tts/
      - duration_predictor.py
      - mdn.py
      - __init__.py
    - bark/
      - hubert/
        - hubert_manager.py
        - kmeans_hubert.py
        - tokenizer.py
        - __init__.py
      - inference_funcs.py
      - load_model.py
      - model.py
      - model_fine.py
      - __init__.py
    - delightful_tts/
      - acoustic_model.py
      - conformer.py
      - conv_layers.py
      - encoders.py
      - energy_adaptor.py
      - kernel_predictor.py
      - networks.py
      - phoneme_prosody_predictor.py
      - pitch_adaptor.py
      - variance_predictor.py
      - __init__.py
    - feed_forward/
      - decoder.py
      - duration_predictor.py
      - encoder.py
      - __init__.py
    - generic/
      - aligner.py
      - gated_conv.py
      - normalization.py
      - pos_encoding.py
      - res_conv_bn.py
      - time_depth_sep_conv.py
      - transformer.py
      - wavenet.py
      - __init__.py
    - glow_tts/
      - decoder.py
      - duration_predictor.py
      - encoder.py
      - glow.py
      - transformer.py
      - __init__.py
    - losses.py
    - overflow/
      - common_layers.py
      - decoder.py
      - neural_hmm.py
      - plotting_utils.py
      - __init__.py
    - tacotron/
      - attentions.py
      - capacitron_layers.py
      - common_layers.py
      - gst_layers.py
      - tacotron.py
      - tacotron2.py
      - __init__.py
    - tortoise/
      - arch_utils.py
      - audio_utils.py
      - autoregressive.py
      - classifier.py
      - clvp.py
      - diffusion.py
      - diffusion_decoder.py
      - dpm_solver.py
      - random_latent_generator.py
      - tokenizer.py
      - transformer.py
      - utils.py
      - vocoder.py
      - wav2vec_alignment.py
      - xtransformers.py
    - vits/
      - discriminator.py
      - networks.py
      - stochastic_duration_predictor.py
      - transforms.py
    - xtts/
      - dvae.py
      - gpt.py
      - gpt_inference.py
      - hifigan_decoder.py
      - latent_encoder.py
      - perceiver_encoder.py
      - stream_generator.py
      - tokenizer.py
      - trainer/
        - dataset.py
        - gpt_trainer.py
      - xtts_manager.py
      - zh_num2words.py
      - __init__.py
      - __pycache__/
        - gpt.cpython-310.pyc
        - gpt_inference.cpython-310.pyc
        - hifigan_decoder.cpython-310.pyc
        - latent_encoder.cpython-310.pyc
        - perceiver_encoder.cpython-310.pyc
        - stream_generator.cpython-310.pyc
        - tokenizer.cpython-310.pyc
        - xtts_manager.cpython-310.pyc
        - zh_num2words.cpython-310.pyc
    - __init__.py
    - __pycache__/
      - losses.cpython-310.pyc
      - __init__.cpython-310.pyc
  - models/
    - align_tts.py
    - bark.py
    - base_tacotron.py
    - base_tts.py
    - delightful_tts.py
    - forward_tts.py
    - glow_tts.py
    - neuralhmm_tts.py
    - overflow.py
    - tacotron.py
    - tacotron2.py
    - tortoise.py
    - vits.py
    - xtts.py
    - __init__.py
    - __pycache__/
      - base_tts.cpython-310.pyc
      - xtts.cpython-310.pyc
      - __init__.cpython-310.pyc
  - utils/
    - assets/
      - tortoise/
        - tokenizer.json
    - data.py
    - fairseq.py
    - helpers.py
    - languages.py
    - managers.py
    - measures.py
    - monotonic_align/
      - core.c
      - core.pyx
      - setup.py
      - __init__.py
      - __pycache__/
        - __init__.cpython-310.pyc
    - speakers.py
    - ssim.py
    - synthesis.py
    - text/
      - bangla/
        - phonemizer.py
        - __init__.py
      - belarusian/
        - phonemizer.py
        - __init__.py
      - characters.py
      - chinese_mandarin/
        - numbers.py
        - phonemizer.py
        - pinyinToPhonemes.py
        - __init__.py
      - cleaners.py
      - cmudict.py
      - english/
        - abbreviations.py
        - number_norm.py
        - time_norm.py
        - __init__.py
      - french/
        - abbreviations.py
        - __init__.py
      - japanese/
        - phonemizer.py
        - __init__.py
      - korean/
        - korean.py
        - ko_dictionary.py
        - phonemizer.py
        - __init__.py
      - phonemizers/
        - bangla_phonemizer.py
        - base.py
        - belarusian_phonemizer.py
        - espeak_wrapper.py
        - gruut_wrapper.py
        - ja_jp_phonemizer.py
        - ko_kr_phonemizer.py
        - multi_phonemizer.py
        - zh_cn_phonemizer.py
        - __init__.py
      - punctuation.py
      - tokenizer.py
      - __init__.py
    - visual.py
    - __init__.py
    - __pycache__/
      - data.cpython-310.pyc
      - helpers.cpython-310.pyc
      - languages.cpython-310.pyc
      - managers.cpython-310.pyc
      - speakers.cpython-310.pyc
      - ssim.cpython-310.pyc
      - synthesis.cpython-310.pyc
      - visual.cpython-310.pyc
      - __init__.cpython-310.pyc
  - __init__.py
  - __pycache__/
    - __init__.cpython-310.pyc
- utils/
  - audio/
    - numpy_transforms.py
    - processor.py
    - torch_transforms.py
    - __init__.py
    - __pycache__/
      - numpy_transforms.cpython-310.pyc
      - processor.cpython-310.pyc
      - torch_transforms.cpython-310.pyc
      - __init__.cpython-310.pyc
  - callbacks.py
  - capacitron_optimizer.py
  - distribute.py
  - download.py
  - downloaders.py
  - generic_utils.py
  - io.py
  - manage.py
  - radam.py
  - samplers.py
  - synthesizer.py
  - training.py
  - vad.py
  - __init__.py
  - __pycache__/
    - generic_utils.cpython-310.pyc
    - io.cpython-310.pyc
    - __init__.cpython-310.pyc
- vc/
  - configs/
    - freevc_config.py
    - shared_configs.py
    - __init__.py
  - models/
    - base_vc.py
    - freevc.py
    - __init__.py
  - modules/
    - freevc/
      - commons.py
      - mel_processing.py
      - modules.py
      - speaker_encoder/
        - audio.py
        - hparams.py
        - speaker_encoder.py
        - __init__.py
      - wavlm/
        - config.json
        - modules.py
        - wavlm.py
        - __init__.py
      - __init__.py
    - __init__.py
- VERSION
- vocoder/
  - configs/
    - fullband_melgan_config.py
    - hifigan_config.py
    - melgan_config.py
    - multiband_melgan_config.py
    - parallel_wavegan_config.py
    - shared_configs.py
    - univnet_config.py
    - wavegrad_config.py
    - wavernn_config.py
    - __init__.py
  - datasets/
    - gan_dataset.py
    - preprocess.py
    - wavegrad_dataset.py
    - wavernn_dataset.py
    - __init__.py
  - layers/
    - hifigan.py
    - losses.py
    - lvc_block.py
    - melgan.py
    - parallel_wavegan.py
    - pqmf.py
    - qmf.dat
    - upsample.py
    - wavegrad.py
    - __init__.py
  - models/
    - base_vocoder.py
    - fullband_melgan_generator.py
    - gan.py
    - hifigan_discriminator.py
    - hifigan_generator.py
    - melgan_discriminator.py
    - melgan_generator.py
    - melgan_multiscale_discriminator.py
    - multiband_melgan_generator.py
    - parallel_wavegan_discriminator.py
    - parallel_wavegan_generator.py
    - random_window_discriminator.py
    - univnet_discriminator.py
    - univnet_generator.py
    - wavegrad.py
    - wavernn.py
    - __init__.py
  - pqmf_output.wav
  - README.md
  - utils/
    - distribution.py
    - generic_utils.py
    - __init__.py
  - __init__.py
- __init__.py
- __pycache__/
  - model.cpython-310.pyc
  - __init__.cpython-310.pyc
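
The dotted module names used in the import statements correspond to file paths in the tree above. The sketch below illustrates that mapping; module_to_path is a hypothetical helper written for this example, and it assumes the standard Python convention that a package is a directory containing __init__.py.

```python
from pathlib import PurePosixPath


def module_to_path(dotted_name, is_package=False):
    """Map a dotted module name to its expected path relative to the install root.

    Hypothetical helper for illustration only.
    """
    parts = dotted_name.split(".")
    if is_package:
        # A package is a directory whose code lives in __init__.py.
        return PurePosixPath(*parts, "__init__.py")
    return PurePosixPath(*parts).with_suffix(".py")


print(module_to_path("NViXTTS.tts.configs.xtts_config"))
# NViXTTS/tts/configs/xtts_config.py
print(module_to_path("NViXTTS.tts.models.xtts"))
# NViXTTS/tts/models/xtts.py
print(module_to_path("NViXTTS", is_package=True))
# NViXTTS/__init__.py
```

All three paths appear in the listing, which is why both the long and the short import forms resolve.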

Download files

Source Distributions

No source distribution files are available for this release.

Built Distribution

NViXTTS-0.0.2-cp311-cp311-win_amd64.whl (797.6 kB)

CPython 3.11, Windows x86-64

File details

Details for the file NViXTTS-0.0.2-cp311-cp311-win_amd64.whl.

File metadata

  • Download URL: NViXTTS-0.0.2-cp311-cp311-win_amd64.whl
  • Upload date:
  • Size: 797.6 kB
  • Tags: CPython 3.11, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.0 CPython/3.10.12
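
Because the only published file is a cp311 / win_amd64 wheel and there is no source distribution, pip can install this release only on CPython 3.11 under 64-bit Windows. The sketch below checks the running interpreter against those two tags. It is a simplified comparison using only the standard library, not the full tag-matching logic that pip applies, and matches_wheel is a hypothetical helper name.

```python
import sys
import sysconfig


def matches_wheel(py_version=(3, 11), platform_tag="win_amd64"):
    """Return True if this interpreter matches the wheel's cp311 / win_amd64 tags.

    Simplified check for illustration; pip's real tag resolution is more involved.
    """
    is_cpython = sys.implementation.name == "cpython"
    same_version = sys.version_info[:2] == py_version
    # sysconfig reports e.g. "win-amd64"; wheel tags use underscores instead.
    current_platform = sysconfig.get_platform().replace("-", "_").replace(".", "_")
    return is_cpython and same_version and current_platform == platform_tag


if matches_wheel():
    print("This interpreter matches the published wheel tags")
else:
    print("No matching wheel: this release targets CPython 3.11 on Windows x86-64")
```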

File hashes

Hashes for NViXTTS-0.0.2-cp311-cp311-win_amd64.whl
Algorithm     Hash digest
SHA256        0596cb1b8fd983ded05bd8b10ec92efdbbed86ce6ccd5cd651375a7b8d19912a
MD5           54e3a9f6a163c35a5b4c69cdd500a640
BLAKE2b-256   47cfc28ce77036e3f4e30361f258cef800a19df0758b661887356da20459f2c6
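
A downloaded wheel can be checked against the SHA256 digest listed above. This is a minimal sketch using hashlib from the standard library; the function names and the local file path in the usage comment are assumptions made for illustration.

```python
import hashlib

# SHA256 digest published for NViXTTS-0.0.2-cp311-cp311-win_amd64.whl
EXPECTED_SHA256 = "0596cb1b8fd983ded05bd8b10ec92efdbbed86ce6ccd5cd651375a7b8d19912a"


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_wheel(path, expected=EXPECTED_SHA256):
    """Return True when the file's SHA256 digest equals the expected digest."""
    return sha256_of(path) == expected


# Usage, assuming the wheel was saved to the current directory:
# verify_wheel("NViXTTS-0.0.2-cp311-cp311-win_amd64.whl")
```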
