Advanced multi-agent voice conversation system with customizable personas and modes
DuoTalk
Advanced Multi-Agent Voice Conversation System
DuoTalk is a comprehensive Python package for creating engaging multi-agent voice conversations with customizable personas, conversation modes, and easy integration capabilities. Built on top of LiveKit and Google Gemini, it provides a powerful yet simple API for generating dynamic conversations between AI agents with distinct personalities.
Features
- Rich Persona Library: 14+ pre-defined personas (Optimist, Skeptic, Pragmatist, etc.)
- Multiple Conversation Modes: Debate, Roundtable, Interview, Panel, Socratic, and more
- Voice Integration: Full voice synthesis using Google Gemini's native audio
- Easy Setup: Simple pip installation and intuitive API
- Highly Customizable: Create custom personas, modes, and conversation flows
- Analytics: Built-in conversation metrics and performance tracking
- CLI Interface: Command-line tool for quick conversations
- Conversation Logging: Automatic conversation transcription and analysis
- Multiple Use Cases: Education, brainstorming, testing, entertainment
Quick Start
Installation
pip install duotalk
Or using uv:
uv add duotalk
Environment Setup
Create a .env file with your API keys:
GEMINI_API_KEY=your_gemini_api_key_here
# Optional for production LiveKit usage
LIVEKIT_API_KEY=your_livekit_key
LIVEKIT_API_SECRET=your_livekit_secret
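Before starting a conversation it can help to confirm that the keys are actually visible to the process. The following is a minimal sketch using only the standard library; the variable names come from the .env example above, while `missing_keys` is a hypothetical helper and not part of the duotalk package.

```python
import os

# Names taken from the .env example; the required/optional split follows
# the "Optional for production LiveKit usage" comment.
REQUIRED_KEYS = ("GEMINI_API_KEY",)
OPTIONAL_KEYS = ("LIVEKIT_API_KEY", "LIVEKIT_API_SECRET")


def missing_keys(env=None, required=REQUIRED_KEYS):
    """Return the required variable names that are unset or empty.

    Hypothetical helper for illustration; pass a dict to check something
    other than the real process environment.
    """
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    missing = missing_keys()
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    else:
        print("All required keys are set.")
```

Passing a plain dictionary instead of `os.environ` makes the check easy to exercise in tests without touching real credentials.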
Basic Usage
import asyncio

from duotalk import create_debate


async def main():
    # Create a debate conversation
    runner = await create_debate(
        topic="Should AI replace human teachers?",
        max_turns=10
    )
    # Start the conversation
    await runner.start()


asyncio.run(main())
Using the CLI
# Start a debate conversation
duotalk debate -t "Pineapple on pizza"
# Friendly chat with custom personas
duotalk chat -t "Weekend plans" -p optimist,enthusiast
# Roundtable discussion with 4 participants
duotalk roundtable -t "Climate change solutions" -a 4
# Interview format
duotalk interview -t "AI ethics" --interviewer journalist --interviewee expert
# Expert panel discussion
duotalk panel -t "Space exploration" -a 5
# List available personas
duotalk personas
# Use different LiveKit modes
duotalk debate -t "Remote work" --mode dev # Development mode
duotalk debate -t "Remote work" --mode console # Console mode (default)
duotalk debate -t "Remote work" --mode start # Production mode
# Get help
duotalk --help
duotalk debate --help
Overview
DuoTalk brings AI conversations to life with two distinct AI audio agents powered by Google Gemini-realtime and LiveKit. Watch as they engage in real-time voice conversations, whether collaborating in friendly discussions or debating opposing viewpoints on any topic you choose.
Features
| Feature | Description |
|---|---|
| Dual & Quad AI Voice Agents | Two or four agents with unique voices and personas |
| Conversation Modes | Choose between friendly discussion, debate, or roundtable format |
| Roundtable Feature | Four agents share diverse perspectives in a dynamic roundtable |
| Custom Topics | Specify any topic for dynamic conversations |
| Real-Time Audio | Natural spoken dialogue using Gemini's latest models |
| Robust Error Handling | Smart retry logic and graceful error recovery |
| Voice Personas | Distinct voices: Puck (optimist, pragmatist) & Charon (skeptic, theorist) |
| YouTube Integration | AI-powered YouTube video summarization with voice playback |
| Rate Limiting Resilience | Exponential backoff retry logic for API rate limits |
| Enhanced TTS | Robust text-to-speech with multiple fallback mechanisms |
Recent Improvements (v1.0.5)
AI-Powered YouTube Summarization
- Google Gemini 2.5 Flash Lite integration for intelligent video analysis
- Automatic transcript extraction with robust error handling
- Customizable summary depth (short/detailed) based on user preference
- Voice narration of summaries using enhanced TTS system
Enhanced Voice System
- Robust TTS engine with comprehensive error handling
- pygame audio initialization checks and fallback mechanisms
- Windows-specific audio optimizations for better compatibility
- Proper resource cleanup to prevent audio system conflicts
Rate Limiting & Resilience
- Exponential backoff retry logic for YouTube API rate limits (429 errors)
- Progressive wait times with intelligent retry strategies
- Comprehensive logging for debugging and monitoring
- Graceful degradation when services are temporarily unavailable
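The retry behaviour listed above can be illustrated with a generic exponential backoff loop. This is a sketch of the general technique, not DuoTalk's actual implementation: `RateLimitError`, `backoff_delays`, and `call_with_backoff` are hypothetical names, and the base delay, cap, and retry count are assumed values.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for an HTTP 429 (rate limited) response."""


def backoff_delays(max_retries=5, base=1.0, cap=30.0):
    """Progressive wait times: base, 2*base, 4*base, ... limited to cap."""
    return [min(cap, base * 2 ** attempt) for attempt in range(max_retries)]


def call_with_backoff(func, max_retries=5, base=1.0, cap=30.0,
                      jitter=0.0, sleep=time.sleep):
    """Call func(), retrying on RateLimitError with exponential backoff.

    sleep is injectable so the loop can be tested without real waiting.
    """
    for attempt in range(max_retries + 1):
        try:
            return func()
        except RateLimitError:
            if attempt == max_retries:
                raise  # out of retries: surface the error to the caller
            delay = min(cap, base * 2 ** attempt)
            # Optional random jitter spreads out retries from many clients.
            sleep(delay + random.uniform(0, jitter))
```

With the defaults, the waits are 1, 2, 4, 8, and 16 seconds before the error is finally re-raised; a non-zero `jitter` is a common addition when several clients may retry at the same moment.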
Bug Fixes
- Fixed single-agent chat mode so it no longer starts unwanted multi-agent conversations
- Improved error handling across all conversation types
- Enhanced session management for better stability
Code Architecture
Requirements
Prerequisites for running DuoTalk
- Python 3.8+
- LiveKit Agents SDK
- Google Gemini API
Quick Setup
1. Clone & Navigate
git clone https://github.com/AbhyudayPatel/DuoTalk.git
cd DuoTalk
2. Install Dependencies
python -m venv venv
# Windows activation; on macOS/Linux use: source venv/bin/activate
venv\Scripts\activate
pip install -r requirements.txt
3. Environment Configuration
Create a .env file in your project root:
# Add your Google Gemini API key
GOOGLE_API_KEY=your_gemini_api_key_here
Tip: Get your API key from Google AI Studio.
Usage
Starting DuoTalk
# For 2 agents (friendly/discussion/debate):
python dual_voice_agents.py console
# For 4 agents (roundtable/friendly/debate):
python four_agents_duotalk.py console
Interactive Setup
Step 1: Choose Your Topic
Enter the topic for the conversation: _
Examples:
- The future of AI and robotics
- Climate change solutions
- Space exploration and Mars colonization
- The ethics of genetic engineering
Step 2: Select Conversation Mode
Select conversation mode:
1. Friendly discussion (2 agents)
2. Debate format (2 agents)
3. Roundtable discussion (4 agents)
Enter your choice (1, 2, or 3): _
| Mode | Friendly Discussion | Debate Format | Roundtable |
|---|---|---|---|
| Style | Collaborative & supportive | Opposing viewpoints | Diverse perspectives |
| Tone | Encouraging dialogue | Direct & contrary | Dynamic & engaging |
| Personas | Agent1 & Agent2 | Optimist vs Skeptic | Optimist, Skeptic, Pragmatist, Theorist |
| Voices | Puck & Charon | Puck & Charon | Puck & Charon (multiple roles) |
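The mode table can also be read as a small lookup structure. The sketch below encodes the personas per mode and the persona-to-voice pairing described in this README. The names `MODES`, `VOICES`, `lineup`, and `speaker_for_turn` are hypothetical, the Agent1/Agent2 voice assignment is assumed from the order in the table, and the round-robin turn order is an assumption rather than documented behaviour.

```python
# Personas per conversation mode, as listed in the table.
MODES = {
    "friendly": ["Agent1", "Agent2"],
    "debate": ["Optimist", "Skeptic"],
    "roundtable": ["Optimist", "Skeptic", "Pragmatist", "Theorist"],
}

# Voice per persona: Puck (optimist, pragmatist), Charon (skeptic, theorist).
# The Agent1/Agent2 pairing is an assumption based on the table order.
VOICES = {
    "Agent1": "Puck",
    "Agent2": "Charon",
    "Optimist": "Puck",
    "Pragmatist": "Puck",
    "Skeptic": "Charon",
    "Theorist": "Charon",
}


def lineup(mode):
    """Return (persona, voice) pairs for a mode."""
    return [(persona, VOICES[persona]) for persona in MODES[mode]]


def speaker_for_turn(mode, turn):
    """Pick the speaker for a zero-based turn, assuming round-robin order."""
    personas = MODES[mode]
    return personas[turn % len(personas)]


if __name__ == "__main__":
    for turn in range(4):
        print(turn, speaker_for_turn("roundtable", turn))
```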
Configuration
Customization Options
| Setting | Default | How to Change |
|---|---|---|
| Max Turns | 12 turns | Modify max_turns in ConversationState |
| Agent Voices | Puck & Charon | Update voice parameters in code |
| AI Model | gemini-2.5-flash-preview-native-audio-dialog | Change model string |
| Response Length | One-line responses | Modify instructions in DualPersonaAgent |
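To make the Max Turns setting concrete, here is a minimal sketch of how a turn limit could be tracked. `ConversationStateSketch` is a hypothetical stand-in written for illustration; it is not DuoTalk's actual ConversationState class, and only the `max_turns` name and its default of 12 come from the table.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ConversationStateSketch:
    """Hypothetical stand-in; not DuoTalk's actual ConversationState."""

    topic: str
    max_turns: int = 12  # default taken from the table
    transcript: List[str] = field(default_factory=list)

    def is_finished(self) -> bool:
        """True once the number of recorded turns reaches max_turns."""
        return len(self.transcript) >= self.max_turns

    def record(self, line: str) -> None:
        """Store one spoken turn, refusing turns past the limit."""
        if self.is_finished():
            raise RuntimeError("turn limit reached")
        self.transcript.append(line)


if __name__ == "__main__":
    state = ConversationStateSketch(topic="Remote work", max_turns=2)
    state.record("Optimist: Remote work widens the talent pool.")
    state.record("Skeptic: It also weakens team cohesion.")
    print(state.is_finished())
```

Under this sketch, changing the limit is a matter of passing a different `max_turns` when the state is created.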
Core Components
| Component | Purpose |
|---|---|
| ConversationState | Manages conversation state and settings |
| DualPersonaAgent | Main agent class with dual persona support |
| get_conversation_mode() | Handles user input for conversation mode |
| run_friendly_conversation() | Manages friendly discussion flow |
| run_debate_conversation() | Manages debate flow with optimist/skeptic roles |
| safe_generate_reply() | Handles responses with error handling and retries |
Error Handling & Reliability
DuoTalk includes the following reliability features:
Comprehensive Error Management
| Feature | Description |
|---|---|
| Session Health Monitoring | Real-time health checks |
| Automatic Retries | Smart retry logic for failed responses |
| Graceful Cleanup | Proper resource management |
| Detailed Logging | Comprehensive debugging information |
| Timeout Protection | Prevents hanging operations |
| Recovery Mechanisms | Automatic error recovery |
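Timeout protection combined with automatic retries can be sketched with `asyncio.wait_for`. This illustrates the general pattern only: `generate_with_timeout` is a hypothetical helper, not DuoTalk's safe_generate_reply(), and the timeout and retry count are assumed values.

```python
import asyncio


async def generate_with_timeout(make_reply, timeout=10.0, retries=2):
    """Await make_reply() with a per-attempt timeout and limited retries.

    make_reply is a zero-argument callable returning a fresh awaitable,
    so every retry starts a new attempt. Hypothetical helper.
    """
    for attempt in range(retries + 1):
        try:
            # wait_for cancels the attempt if it exceeds the timeout.
            return await asyncio.wait_for(make_reply(), timeout=timeout)
        except (asyncio.TimeoutError, ConnectionError):
            if attempt == retries:
                raise  # give up after the final attempt


async def _demo():
    async def quick_reply():
        await asyncio.sleep(0.01)
        return "Hello from the agent"

    print(await generate_with_timeout(quick_reply, timeout=1.0))


if __name__ == "__main__":
    asyncio.run(_demo())
```

Taking a callable rather than a coroutine object matters here, because a coroutine can only be awaited once and each retry needs a new one.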
License
MIT License - See LICENSE file for details
Experience the future of AI conversation today!