Bambara Text-to-Speech system using Maliba-AI models

These details have not been verified by PyPI

Project links

Project description

MALIBA-AI Bambara TTS SDK

Welcome to the MALIBA-AI Bambara Text-to-Speech SDK! This is the first open-source TTS system specifically designed for the Bambara language. Whether you're building educational platforms, creating voice interfaces, or developing accessibility tools, this SDK provides professional grade Bambara speech synthesis with multiple authentic speakers.

About MALIBA-AI
Installation
Quick Start
Client Initialization
Basic Usage
- 5.1 Simple Text-to-Speech
- 5.2 Speaker Selection
- 5.3 Advanced Configuration
Speaker System
Generation Parameters
Audio Output Options
Examples
- 9.1 Educational Applications
- 9.2 Voice Interface Integration
- 9.3 Batch Processing
Best Practices
Performance Considerations
Troubleshooting
Contributing

About MALIBA-AI

MALIBA-AI is a community driven initiative focused on developing AI solutions for Mali while ensuring no Malian is left behind by technological advances. Our mission is to break down linguistic barriers by creating AI systems that understand and speak Mali's indigenous languages.

This Bambara TTS SDK represents a breakthrough in African language technology:

First open-source Bambara TTS: Pioneering accessible speech synthesis for Bambara
Highest accuracy available: State-of-the-art neural architecture fine-tuned specifically for Bambara
Production-ready: Built for real-world applications with optimized performance
Community-driven: Developed by and for the Bambara speaking community

Installation

Install the MALIBA-AI SDK using pip:

pip install maliba_ai

For development installations:

pip install git+https://github.com/MALIBA-AI/bambara-tts.git

Quick Start

Get started with Bambara TTS in under a minute:

from maliba_ai.tts import BambaraTTSInference
from maliba_ai.config.speakers import Bourama
import soundfile as sf

# Initialize the TTS system
tts = BambaraTTSInference()

# Generate speech from Bambara text
text = "I ni ce! N ye MALIBA-AI ye."  # "Hello! I am MALIBA-AI."
audio = tts.generate_speech(text, speaker_id=Bourama)

# Save the audio
sf.write("greeting.wav", audio, 16000)
print("Bambara speech generated successfully!")

Client Initialization

The TTS system initializes automatically with optimal settings:

from maliba_ai.tts import BambaraTTSInference

# Standard initialization
tts = BambaraTTSInference()

# With custom Hugging Face token (if needed)
tts = BambaraTTSInference(hf_token="your_token_here")

The system automatically:

Detects and uses GPU acceleration when available
Downloads required models on first use
Loads all available speaker voices
Optimizes memory usage for your hardware

Basic Usage

Simple Text-to-Speech

The most straightforward way to generate Bambara speech:

# Basic synthesis with default speaker (Adame)
text = "Bamanankan ye kan ɲuman ye"  # "Bambara is a beautiful language"
audio = tts.generate_speech(text)

Speaker Selection

Choose from our five authentic Bambara speakers:

from maliba_ai.config.speakers import Adame, Moussa, Bourama, Modibo, Seydou

# Use different speakers
speakers_demo = [
    (Adame, "I ni ce, n tɔgɔ ye Adame ye"),      # "Hello, my name is Adame"
    (Moussa, "N bɛ baara kɛ sɛnɛkɛla la"),      # "I work in agriculture"
    (Bourama, "Aw ye ɲɛnɛmako"),                # "Welcome"
    (Modibo, "Kalan ka di kosɛbɛ"),             # "Learning is very good"
    (Seydou, "Aw ka se ka baara in kɛ"),        # "You can do this work"
]

for speaker, text in speakers_demo:
    audio = tts.generate_speech(text, speaker_id=speaker)
    print(f"Generated speech for {speaker.id}")

Note: Bourama is our most stable and accurate speaker, recommended for production applications.

Advanced Configuration

Fine-tune synthesis with custom parameters (though defaults are optimized):

audio = tts.generate_speech(
    text="An ka baara kɛ ɲɔgɔn fɛ",           # "Let's work together"
    speaker_id=Bourama,
    temperature=0.8,                          # Keep at 0.8 for optimal results
    top_k=50,                                # Vocabulary sampling
    top_p=1.0,                               # Nucleus sampling
    max_new_audio_tokens=2048,               # Maximum audio length
    output_filename="collaboration.wav"       # Auto-save option
)

Speaker System

Our speaker system features five distinct Bambara voices:

from maliba_ai.config.speakers import Speakers

# View all available speakers
all_speakers = Speakers.get_all_speakers()
for speaker in all_speakers:
    print(f"Speaker: {speaker.id}")

# Get specific speaker by name
bourama = Speakers.get_speaker_by_name("Bourama")

# Speaker recommendations:
# - Bourama: Most stable and accurate (recommended for production)
# - Adame: Natural conversational tone
# - Moussa: Clear pronunciation
# - Modibo: Expressive delivery
# - Seydou: Balanced characteristics

Generation Parameters

The SDK comes with optimized default parameters that provide the best quality:

Default Configuration

# These defaults are optimized for Bambara - avoid changing unless necessary
temperature=0.8        # Optimal balance between consistency and naturalness
top_k=50              # Vocabulary selection size
top_p=1.0             # Nucleus sampling threshold
max_new_audio_tokens=2048  # Maximum audio sequence length

Parameter Guidelines

Temperature: Keep at 0.8. Lower values may sound robotic, higher values can introduce artifacts
Top-k/Top-p: Defaults provide best quality. Experimentation often reduces accuracy
Max tokens: Increase only for very long texts (splits recommended instead)

When to Adjust Parameters

# For very short texts
audio = tts.generate_speech(
    "Ce!",  # "Hello!"
    speaker_id=Bourama,
    max_new_audio_tokens=1024  # Shorter for efficiency
)

# For experimental/creative applications only
audio = tts.generate_speech(
    text,
    speaker_id=Bourama,
    temperature=0.7,  # Slightly more conservative
    top_k=40         # More focused vocabulary
)

Audio Output Options

Handle generated audio efficiently:

import numpy as np
import soundfile as sf

# Generate audio
audio = tts.generate_speech("Baara ka nɔgɔya", speaker_id=Bourama)

# Method 1: Direct file saving during generation
audio = tts.generate_speech(
    "Baara ka nɔgɔya", 
    speaker_id=Bourama,
    output_filename="work_improves.wav"
)

# Method 2: Save after generation
sf.write("output.wav", audio, 16000)

# Method 3: Audio analysis
print(f"Duration: {len(audio)/16000:.2f} seconds")
print(f"Sample rate: 16000 Hz")
print(f"Channels: 1 (mono)")

# Method 4: Audio processing
normalized_audio = audio / np.max(np.abs(audio))

Examples

Educational Applications

Create learning materials with consistent, high-quality pronunciation:

# Bambara language lessons
lessons = [
    "Dɔrɔn kelen: I ni ce - Musow kɛlɛ",        # "Lesson one: Hello - Greetings"
    "Dɔrɔn fila: Tɔgɔ - N tɔgɔ ye ... ye",     # "Lesson two: Names - My name is ..."
    "Dɔrɔn saba: Jamu - I jamu ye mun ye?",     # "Lesson three: Family - What is your family name?"
]

for i, lesson in enumerate(lessons, 1):
    audio = tts.generate_speech(
        lesson,
        speaker_id=Bourama,  # Consistent speaker for lessons
        output_filename=f"lesson_{i:02d}.wav"
    )
    print(f"Created lesson {i}")

Voice Interface Integration

Build Bambara-speaking applications:

# Voice assistant responses
responses = {
    "greeting": "I ni ce! I bɛ se ka n wele min na?",      # "Hello! What can I call you?"
    "confirmation": "Awɔ, n ye a faamu",                   # "Yes, I understand"
    "error": "Hakɛto, segin ka a fɔ",                     # "Sorry, please repeat"
    "goodbye": "Kan bɛn na! I ni che!",                   # "See you later! Goodbye!"
}

def generate_assistant_voices():
    for intent, response in responses.items():
        audio = tts.generate_speech(
            response,
            speaker_id=Bourama,
            output_filename=f"assistant_{intent}.wav"
        )
        print(f"Generated response for: {intent}")

generate_assistant_voices()

Batch Processing

Process multiple texts efficiently:

# News headlines in Bambara
news_items = [
    "Kalan yɔrɔ kura da Bamakɔ",                         # "New school opens in Bamako"
    "Sɛnɛkɛlaw ka baara ka ɲɛ",                          # "Farmers' work improves"
    "Fɛn kura bɛ na bamanankan na",                      # "New things come to Bambara"
]

def batch_process(texts, speaker=Bourama):
    results = []
    for i, text in enumerate(texts):
        try:
            audio = tts.generate_speech(
                text,
                speaker_id=speaker,
                output_filename=f"news_{i+1:02d}.wav"
            )
            results.append(f"news_{i+1:02d}.wav")
            print(f"Processed: {text}")
        except Exception as e:
            print(f"Error processing '{text}': {e}")
    return results

batch_process(news_items)

Best Practices

Speaker Selection

Use Bourama for production: Most stable and accurate results
Consistent speakers: Use the same speaker for related content
Test speakers: Try different voices for your specific use case

Text Preparation

# Good practices for input text
def prepare_bambara_text(text):
    # Remove extra whitespace
    text = " ".join(text.split())
    
    # Ensure proper sentence ending
    if not text.endswith(('.', '!', '?')):
        text += '.'
    
    return text

# Example usage
raw_text = "  I ni ce   "
clean_text = prepare_bambara_text(raw_text)  # "I ni ce."
audio = tts.generate_speech(clean_text, speaker_id=Bourama)

Long Text Handling

def synthesize_long_text(text, speaker_id=Bourama, max_length=100):
    """Split long texts into manageable chunks"""
    sentences = text.split('. ')
    audio_chunks = []
    
    for sentence in sentences:
        if sentence.strip():
            audio = tts.generate_speech(
                sentence.strip() + '.',
                speaker_id=speaker_id
            )
            audio_chunks.append(audio)
    
    # Combine all audio
    return np.concatenate(audio_chunks) if audio_chunks else np.array([])

# Example with long text
long_text = "Bamanankan ye kan ye min bɛ kuma Mali la. A bɛ kuma miliyɔn caman fɛ. A ka fɔcogo ka nɔgɔya don."
full_audio = synthesize_long_text(long_text)
sf.write("long_speech.wav", full_audio, 16000)

Performance Considerations

GPU Usage

import torch

# Check GPU availability
if torch.cuda.is_available():
    print(f"GPU: {torch.cuda.get_device_name()}")
    print(f"Memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.1f} GB")
else:
    print("Running on CPU (slower but functional)")

Memory Optimization

# For memory-constrained environments
def memory_efficient_synthesis(texts, speaker_id=Bourama):
    # Process one text at a time instead of batching
    for i, text in enumerate(texts):
        audio = tts.generate_speech(
            text,
            speaker_id=speaker_id,
            max_new_audio_tokens=1024,  # Shorter sequences
            output_filename=f"output_{i}.wav"
        )
        # Audio is saved, memory can be freed
        print(f"Processed text {i+1}/{len(texts)}")

Batch Efficiency

# Reuse the same TTS instance
tts = BambaraTTSInference()  # Initialize once

# Process multiple texts
texts = ["Text 1", "Text 2", "Text 3"]
for text in texts:
    audio = tts.generate_speech(text, speaker_id=Bourama)
    
# Don't create new instances unnecessarily

Troubleshooting

Common Issues and Solutions

Empty audio output:

# Check text content and length
text = "I ni ce"
if not text.strip():
    print("Error: Empty text provided")
elif len(text) < 3:
    print("Warning: Very short text may not generate audio")
else:
    audio = tts.generate_speech(text, speaker_id=Bourama)

CUDA memory errors:

# Reduce token limit for GPU memory issues
audio = tts.generate_speech(
    text,
    speaker_id=Bourama,
    max_new_audio_tokens=1024  # Reduced from default 2048
)

Speaker validation errors:

from maliba_ai.config.settings import SPEAKER_IDS

# Verify speaker availability
def validate_speaker(speaker):
    if speaker.id not in SPEAKER_IDS:
        print(f"Invalid speaker. Available: {SPEAKER_IDS}")
        return False
    return True

# Safe speaker usage
if validate_speaker(Bourama):
    audio = tts.generate_speech(text, speaker_id=Bourama)

Performance Issues

# Monitor generation time
import time

start_time = time.time()
audio = tts.generate_speech(text, speaker_id=Bourama)
generation_time = time.time() - start_time

print(f"Generated {len(audio)/16000:.2f}s audio in {generation_time:.2f}s")
print(f"Real-time factor: {generation_time/(len(audio)/16000):.2f}x")

Contributing

We welcome contributions to improve Bambara TTS! Here's how to get involved:

# Development setup
git clone https://github.com/MALIBA-AI/bambara-tts.git
cd bambara-tts
pip install -e ".[dev]"

# Run tests
pytest tests/

# Code formatting
ruff format .
isort .

Ways to contribute:

Training data: Help collect diverse Bambara speech samples
Bug reports: Report issues with detailed reproduction steps
Documentation: Improve guides and examples
Testing: Test on different hardware and use cases
Speaker diversity: Contribute additional voice recordings

Contribution guidelines:

Follow existing code style
Add tests for new features
Update documentation
Respect speaker privacy and consent

Support and Community

GitHub Issues: Report bugs and request features
Email: contact@maliba-ai.com
Website: maliba-ai.com

License

This project is released under the MIT License. See LICENSE for details.

Acknowledgments

Spark-TTS: Foundation architecture for neural speech synthesis
Bambara speakers: Community members who contributed voice data
MALIBA-AI team: Dedicated developers and researchers
Mali: Our inspiration for building inclusive technology

MALIBA-AI ka baara kɛ ka bamanankan lakana diɲɛ kɔnɔ! (MALIBA-AI works to preserve Bambara language in the world!)

Project details

These details have not been verified by PyPI

Project links

Release history Release notifications | RSS feed

2.0.0b0 pre-release

Jan 20, 2026

1.1.1b0 pre-release

Jul 3, 2025

1.0.1b0 pre-release

Jun 28, 2025

1.0.0b1 pre-release

Jun 28, 2025

This version

0.1.0

Jun 11, 2025

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

maliba_ai-0.1.0.tar.gz (47.1 kB view details)

Uploaded Jun 11, 2025 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

maliba_ai-0.1.0-py3-none-any.whl (55.3 kB view details)

Uploaded Jun 11, 2025 Python 3

File details

Details for the file maliba_ai-0.1.0.tar.gz.

File metadata

Download URL: maliba_ai-0.1.0.tar.gz
Upload date: Jun 11, 2025
Size: 47.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for maliba_ai-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`29f025f1f1979faa73452bf2ad73329b22037d0de8c57286f0c4acc049fb8346`
MD5	`a39005215de3ee7c26f7b8f746140c76`
BLAKE2b-256	`497ee47c988a519689a579d785f4d1fd559d6dd341ce6e10d718778ae4b1ed3a`

See more details on using hashes here.

File details

Details for the file maliba_ai-0.1.0-py3-none-any.whl.

File metadata

Download URL: maliba_ai-0.1.0-py3-none-any.whl
Upload date: Jun 11, 2025
Size: 55.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for maliba_ai-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`a45f858eb97efedac46f7daa2dac9937ad26d954573f26c545f5002d22037815`
MD5	`66f953b61aab7d88c27a83cba1abbce1`
BLAKE2b-256	`10ca72315c6002e5d92881cdb3af04ec6e62f945b778231c03df6d99232d37a6`

See more details on using hashes here.

maliba-ai 0.1.0

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

MALIBA-AI Bambara TTS SDK

Table of Contents

About MALIBA-AI

Installation

Quick Start

Client Initialization

Basic Usage

Simple Text-to-Speech

Speaker Selection

Advanced Configuration

Speaker System

Generation Parameters

Default Configuration

Parameter Guidelines

When to Adjust Parameters

Audio Output Options

Examples

Educational Applications

Voice Interface Integration

Batch Processing

Best Practices

Speaker Selection

Text Preparation

Long Text Handling

Performance Considerations

GPU Usage

Memory Optimization

Batch Efficiency

Troubleshooting

Common Issues and Solutions

Performance Issues

Contributing

Support and Community

License

Acknowledgments

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes