A Python wrapper for YarnGPT text-to-speech model with multi-language support

YarnGPT Python Wrapper Library

Version 0.2.0

  • Added support for local speakers
  • Added support for multiple languages

Description

This library is a Python wrapper for the YarnGPT text-to-speech model, which is designed to synthesize natural Nigerian speech in multiple languages using a pure language modeling approach. It provides a simple API for converting text into audio and lets users choose a preset voice, choose a language, and adjust generation parameters.

Features

  • Supports multiple Nigerian languages: English, Yoruba, Igbo, and Hausa
  • Rich set of voices for each language:
    • English: idera, chinenye, jude, emma, umar, joke, zainab, osagie, remi, tayo
    • Yoruba: abayomi, aisha, folake
    • Igbo: chioma, obinna, adanna
    • Hausa: amina, fatima, ibrahim, yusuf
  • Utilizes Hugging Face's model caching for efficient model loading
  • Exposes a straightforward API function: generate_speech(text, speaker, language, temperature, repetition_penalty, max_length)
  • Allows customization of generation parameters
  • Includes unit tests to ensure core functionality
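
The voice list above can be kept as a small lookup table so that a speaker and language pair is checked before any audio is generated. This is only a sketch: `VOICES`, `check_voice`, and `languages_for` are hypothetical names that are not part of the yarngpt package, and whether the library itself rejects a mismatched pair is not stated here.

```python
# Voice table copied from the Features list above.
# The helper names below are illustrative, not part of yarngpt.
VOICES = {
    "english": ["idera", "chinenye", "jude", "emma", "umar",
                "joke", "zainab", "osagie", "remi", "tayo"],
    "yoruba": ["abayomi", "aisha", "folake"],
    "igbo": ["chioma", "obinna", "adanna"],
    "hausa": ["amina", "fatima", "ibrahim", "yusuf"],
}


def check_voice(language, speaker):
    """Return True if `speaker` is listed for `language` in VOICES."""
    return speaker.lower() in VOICES.get(language.lower(), [])


def languages_for(speaker):
    """Return every language whose voice list contains `speaker`."""
    return [lang for lang, names in VOICES.items() if speaker.lower() in names]
```

Calling `check_voice("yoruba", "abayomi")` before `generate_speech` gives an early, readable failure point for typos in either argument.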

Installation

  1. Create and activate a virtual environment:

    • On Linux/macOS:
    python3 -m venv env
    source env/bin/activate
    
    • On Windows:
    python -m venv env
    env\Scripts\activate
    
  2. Install the package:

    pip install yarngpt
    

Usage

Basic usage to generate and save audio:

from yarngpt import generate_speech
import torchaudio

# Generate English speech with the default speaker
english_audio = generate_speech("Hello, this is a test.", language="english")

# Generate Yoruba speech with a Yoruba voice
yoruba_audio = generate_speech("Bawo ni?", speaker="abayomi", language="yoruba")

# Save each clip to its own file so neither result is discarded
torchaudio.save("english.wav", english_audio, sample_rate=24000)
torchaudio.save("yoruba.wav", yoruba_audio, sample_rate=24000)

For Jupyter Notebook users, you can also play the audio directly:

from yarngpt import generate_speech
import torchaudio
from IPython.display import Audio

# Generate speech in different languages
english_audio = generate_speech("Hello!", speaker="idera", language="english")
yoruba_audio = generate_speech("Bawo ni?", speaker="abayomi", language="yoruba")
igbo_audio = generate_speech("Kedu?", speaker="chioma", language="igbo")
hausa_audio = generate_speech("Sannu!", speaker="amina", language="hausa")

# Save and play the audio
torchaudio.save("output.wav", english_audio, sample_rate=24000)
Audio("output.wav")
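
The notebook example generates four clips but saves only one of them. A minimal sketch of writing every clip to its own file could look like the following. The names `SAMPLES`, `output_path`, and `save_all` are hypothetical and not part of yarngpt, and the sketch assumes `generate_speech` returns a tensor that `torchaudio.save` accepts, as in the examples above.

```python
from pathlib import Path

# (language, speaker, text) triples taken from the example above.
SAMPLES = [
    ("english", "idera", "Hello!"),
    ("yoruba", "abayomi", "Bawo ni?"),
    ("igbo", "chioma", "Kedu?"),
    ("hausa", "amina", "Sannu!"),
]


def output_path(language, speaker, out_dir="."):
    """Build a distinct file name such as 'igbo_chioma.wav'."""
    return Path(out_dir) / f"{language}_{speaker}.wav"


def save_all(samples=SAMPLES, out_dir="."):
    """Generate and save one WAV file per sample; returns the paths written."""
    # Imported lazily so the helpers above work without the model installed.
    from yarngpt import generate_speech
    import torchaudio

    paths = []
    for language, speaker, text in samples:
        audio = generate_speech(text, speaker=speaker, language=language)
        path = output_path(language, speaker, out_dir)
        torchaudio.save(str(path), audio, sample_rate=24000)
        paths.append(path)
    return paths
```

In a notebook, any of the returned paths can then be passed to `Audio(...)` for playback.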

Parameter Options

  • text: The input string to convert to speech
  • speaker: Choose from available voices by language (see Features section for full list)
  • language: The language for speech generation ("english", "yoruba", "igbo", "hausa")
  • temperature: Controls the randomness of generation (default is 0.1)
  • repetition_penalty: A factor to reduce repetitive output (default is 1.1)
  • max_length: The maximum length of the generated output tokens (default is 4000)
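
To illustrate how the generation parameters fit together, the sketch below merges caller overrides with the defaults listed above and forwards them to `generate_speech` as keyword arguments. `DEFAULTS`, `build_generation_kwargs`, and `speak` are hypothetical helpers, not part of yarngpt. The keyword names follow the signature shown in the Features section.

```python
# Defaults as documented in the parameter list above.
DEFAULTS = {"temperature": 0.1, "repetition_penalty": 1.1, "max_length": 4000}


def build_generation_kwargs(**overrides):
    """Merge overrides into the documented defaults, rejecting unknown names."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"Unknown generation parameters: {sorted(unknown)}")
    return {**DEFAULTS, **overrides}


def speak(text, speaker, language, **overrides):
    """Call generate_speech with the merged generation parameters."""
    # Imported lazily so build_generation_kwargs works without the model.
    from yarngpt import generate_speech

    return generate_speech(
        text,
        speaker=speaker,
        language=language,
        **build_generation_kwargs(**overrides),
    )
```

For example, `speak("Sannu!", "amina", "hausa", temperature=0.3)` would request more varied output while keeping the other two parameters at their defaults.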

Testing

Run the unit tests to verify functionality:

python -m unittest discover -s tests

License

This project is licensed under the MIT License.

Acknowledgments

  • Built as a contribution to the YarnGPT project
  • Utilizes Hugging Face's model caching and the transformers library
  • Special thanks to the open-source community for their ongoing support

For more details and documentation, visit the GitHub repository: https://github.com/jerryola1

