Advanced multi-agent voice conversation system with customizable personas and modes
DuoTalk 🎭
Advanced Multi-Agent Voice Conversation System
DuoTalk is a comprehensive Python package for creating engaging multi-agent voice conversations with customizable personas, conversation modes, and easy integration capabilities. Built on top of LiveKit and Google Gemini, it provides a powerful yet simple API for generating dynamic conversations between AI agents with distinct personalities.
🌟 Features
- 🎭 Rich Persona Library: 14+ pre-defined personas (Optimist, Skeptic, Pragmatist, etc.)
- 🗣️ Multiple Conversation Modes: Debate, Roundtable, Interview, Panel, Socratic, and more
- 🎙️ Voice Integration: Full voice synthesis using Google Gemini's native audio
- ⚡ Easy Setup: Simple pip installation and intuitive API
- 🔧 Highly Customizable: Create custom personas, modes, and conversation flows
- 📊 Analytics: Built-in conversation metrics and performance tracking
- 🖥️ CLI Interface: Command-line tool for quick conversations
- 📝 Conversation Logging: Automatic conversation transcription and analysis
- 🎯 Multiple Use Cases: Education, brainstorming, testing, entertainment
🚀 Quick Start
Installation
pip install duotalk
Or using uv:
uv add duotalk
DuoTalk lets you create engaging conversations between AI agents with distinct personas across multiple modes (debate, roundtable, interview, and more). Use the Python API for full control or the CLI to get started in seconds. Optional YouTube summarization turns any video into a spoken, natural summary.
🌟 Highlights
- 🎭 Personas library: 14+ ready-to-use personas (optimist, skeptic, pragmatist, theorist, educator, scientist, artist, and more)
- 💬 Conversation modes: friendly, debate, roundtable, interview, panel, socratic
- 🧪 Quick start helpers: one‑liners like quick_debate() and quick_roundtable()
- 🧱 Builder API: fluent, composable setup with ConversationBuilder
- 🖥️ CLI: start demos, list personas/modes, summarize YouTube videos
- 🎬 YouTube summarizer: AI‑powered short or detailed summaries, optional voice
- 🔊 Voice ready: integrates with DuoTalk’s voice runner (LiveKit optional)
- 🧩 Typed APIs: shipped with py.typed for great editor/IDE support
🚀 Installation
pip install duotalk
Optional extras for YouTube summaries:
pip install yt-dlp requests google-generativeai
🔑 Environment
Create a .env (or export env vars) for optional features:
# For YouTube summarization with Google Gemini
GOOGLE_API_KEY=your_google_api_key # or GEMINI_API_KEY
# For real-time voice via LiveKit (optional)
LIVEKIT_API_KEY=your_key
LIVEKIT_API_SECRET=your_secret
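Before starting a run, it can help to check which optional features the current environment enables. The sketch below is a standalone illustration and not part of DuoTalk's API: the helper name `available_features` and the feature labels it returns are hypothetical, and it only inspects the variables named above.

```python
import os
from typing import Dict, Mapping, Optional


def available_features(env: Optional[Mapping[str, str]] = None) -> Dict[str, bool]:
    """Report which optional features the given environment enables.

    Hypothetical helper for illustration; DuoTalk may perform its own checks.
    """
    env = os.environ if env is None else env
    # Either key name is accepted for Gemini-backed YouTube summaries.
    has_gemini = bool(env.get("GOOGLE_API_KEY") or env.get("GEMINI_API_KEY"))
    # LiveKit voice needs both the key and the secret.
    has_livekit = bool(env.get("LIVEKIT_API_KEY") and env.get("LIVEKIT_API_SECRET"))
    return {"youtube_summaries": has_gemini, "livekit_voice": has_livekit}


if __name__ == "__main__":
    print(available_features())
```

Values kept in a .env file are only visible here once they have been loaded into the process environment, for example by exporting them in your shell or by using a loader such as python-dotenv.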
🧭 Quick Start (Python)
Use quick helpers for the fastest path:
import asyncio
from duotalk import quick_debate, quick_roundtable

async def main():
    # Debate mode (optimist vs skeptic by default)
    runner = quick_debate("Should AI replace human creativity?", max_turns=12, voice=False)
    await runner.start()
    # Roundtable with four personas
    runner = quick_roundtable("Future of renewable energy", max_turns=10, voice=False)
    await runner.start()

asyncio.run(main())
Prefer a fluent Builder:
import asyncio
from duotalk import conversation

runner = (
    conversation()
    .with_topic("Climate change solutions")
    .with_mode("roundtable")
    .with_personas("pragmatist", "theorist", "skeptic")
    .with_max_turns(10)
    .with_voice_enabled(False)  # set True when LiveKit voice is configured
    .build_and_start()
)

# Start the conversation
asyncio.run(runner.start())
🖥️ CLI Usage
The CLI bundles common workflows. Run duotalk --help for all options.
# Demo a conversation in the terminal
duotalk demo "Pineapple on pizza" --mode debate --max-turns 6
# Start (create config) for any mode
duotalk start "AI ethics in hiring" --mode roundtable --personas optimist,skeptic,analyst
# Presets
duotalk preset business "Quarterly planning"
duotalk preset academic "The role of peer review"
duotalk preset creative "Designing for delight"
duotalk preset policy "AI regulation roadmap"
# Explore available options
duotalk list-personas
duotalk list-modes
# Interactive builder
duotalk interactive
# YouTube summarization (short or detailed via prompt)
duotalk summarize "https://www.youtube.com/watch?v=VIDEO_ID" --voice --save
Commands provided by the CLI:
- start – build a conversation config for a mode/personas
- demo – run a text‑mode demo in the terminal
- preset – business, academic, creative, policy, debate, roundtable, interview, panel
- summarize – summarize a YouTube video (optional voice)
- list-personas – list all persona names
- list-modes – list all conversation modes
- interactive – step‑by‑step guided setup
🎬 YouTube Summaries (Python)
Two options are available:
- High‑level convenience
import asyncio
from duotalk.core.youtube_summarizer import summarize_youtube_video

async def main():
    result = await summarize_youtube_video(
        url="https://www.youtube.com/watch?v=VIDEO_ID",
        use_voice=False,
        summary_mode="detailed"  # or "short"
    )
    if result["success"]:
        print(result["summary"])  # natural, speech‑friendly text

asyncio.run(main())
- CLI
duotalk summarize "https://www.youtube.com/watch?v=VIDEO_ID" --voice --save
Notes:
- Requires yt-dlp and requests for transcript fetching.
- Provide GOOGLE_API_KEY (or GEMINI_API_KEY) to enable AI summaries.
- Voice playback is optional and depends on your voice setup.
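If you want to sanity-check a link before handing it to the summarizer, a small standard-library check is enough for the common URL shapes. The sketch below is independent of the package: `video_id_from_url` is a hypothetical name, it is not the package's own `extract_video_id`, and it deliberately handles only watch links and youtu.be short links.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def video_id_from_url(url: str) -> Optional[str]:
    """Return the video ID from a common YouTube URL shape, or None."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        # Short links carry the ID as the first path segment.
        candidate = parsed.path.lstrip("/").split("/")[0]
        return candidate or None
    if host in ("youtube.com", "www.youtube.com", "m.youtube.com"):
        if parsed.path == "/watch":
            # Standard watch links carry the ID in the "v" query parameter.
            values = parse_qs(parsed.query).get("v")
            return values[0] if values else None
    return None
```

Returning None for anything unrecognized lets the caller decide whether to reject the input or pass it through and let the downstream tools report the error.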
Personas
Available persona names include:
optimist, pessimist, pragmatist, theorist, skeptic, enthusiast, mediator, analyst, creative, logical thinker, educator, entrepreneur, scientist, artist
Pick any by name in the Builder, quick helpers, or CLI.
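When persona names arrive as a comma-separated string (as with the CLI's --personas option), it can be useful to validate them early. The following is a minimal sketch, assuming the names are exactly those listed above; `KNOWN_PERSONAS` and `parse_personas` are hypothetical, and `duotalk list-personas` remains the authoritative source for the installed version.

```python
from typing import List

# Persona names as listed in this README; the installed package may differ.
KNOWN_PERSONAS = frozenset({
    "optimist", "pessimist", "pragmatist", "theorist", "skeptic",
    "enthusiast", "mediator", "analyst", "creative", "logical thinker",
    "educator", "entrepreneur", "scientist", "artist",
})


def parse_personas(raw: str) -> List[str]:
    """Split a comma-separated persona string and reject unknown names."""
    names = [part.strip().lower() for part in raw.split(",") if part.strip()]
    unknown = [name for name in names if name not in KNOWN_PERSONAS]
    if unknown:
        raise ValueError(f"Unknown persona(s): {', '.join(unknown)}")
    return names
```

For example, `parse_personas("optimist,skeptic,analyst")` yields a list of three names, while a misspelled name raises a `ValueError` that names the offending entry.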
🧩 Conversation Modes
- friendly – collaborative discussion
- debate – structured argument with opposing viewpoints
- roundtable – multi‑participant exchange
- interview – interviewer with one or more interviewees
- panel – moderator plus subject‑matter experts
- socratic – question‑driven exploration
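To make turn-taking concrete, here is a sketch of one plausible scheduling rule: speakers rotate in a fixed order until a turn limit is reached. This is an assumption for illustration only. The README does not describe how DuoTalk actually orders speakers in each mode, and `turn_order` is a hypothetical helper.

```python
from itertools import cycle, islice
from typing import List, Sequence


def turn_order(personas: Sequence[str], max_turns: int) -> List[str]:
    """Return the speaker for each turn, rotating through personas in order."""
    if not personas:
        raise ValueError("at least one persona is required")
    if max_turns < 0:
        raise ValueError("max_turns must be non-negative")
    # cycle() repeats the personas endlessly; islice() stops at the turn limit.
    return list(islice(cycle(personas), max_turns))
```

With two personas this alternates speakers, which matches the intuition for a debate; with three or more it behaves like a simple roundtable rotation.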
🔊 Voice
The package supports voice‑enabled runs via DuoTalk’s voice runner. You can work in demo (text) mode without any voice setup. To enable voice, configure your audio stack (e.g., LiveKit credentials) and set .with_voice_enabled(True) or pass voice=True to quick helpers. The CLI will indicate when a voice session is required.
Python API Surface (at a glance)
- Quick helpers: quick_debate, quick_roundtable, quick_friendly, quick_interview, quick_panel, quick_socratic, quick_start
- Builder: ConversationBuilder and conversation()
- Convenience creators: create_debate, create_roundtable, create_friendly_chat, create_interview, create_panel, create_socratic, create_random_conversation, presets (business/academic/creative/policy)
- YouTube: summarize_youtube_video, validate_youtube_url, extract_video_id
🐍 Requirements
- Python 3.8+
- Optional: yt-dlp, requests, google-generativeai for YouTube summaries
- Optional: voice runtime (e.g., LiveKit) if you enable audio
📄 License
MIT – see LICENSE.
—
Build dynamic agent conversations, fast. If you have ideas for new personas or modes, PRs and issues are welcome.
- Exponential backoff retry logic for YouTube API rate limits (429 errors)
- Progressive wait times with intelligent retry strategies
- Comprehensive logging for debugging and monitoring
- Graceful degradation when services are temporarily unavailable
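The retry behaviour described above can be sketched in a few lines. This is a generic illustration of exponential backoff with jitter, not DuoTalk's actual code: `RateLimitError` and `retry_with_backoff` are hypothetical names, and the default delays are assumptions chosen for readability.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical error standing in for an HTTP 429 response."""


def retry_with_backoff(
    call: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `call`, retrying on RateLimitError with exponentially growing waits."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # Delay doubles each attempt (1s, 2s, 4s, ...), capped at max_delay.
            delay = min(max_delay, base_delay * (2 ** attempt))
            # A little random jitter keeps many clients from retrying in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))
    raise RuntimeError("max_attempts must be at least 1")
```

Passing `sleep` as a parameter keeps the function easy to test, since a test can record the requested waits instead of actually pausing.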
🐛 Bug Fixes
- Fixed single agent chat mode - no more unwanted multi-agent conversations
- Improved error handling across all conversation types
- Enhanced session management for better stability
🏗️ Code Architecture
📋 Requirements
Prerequisites for running DuoTalk
- 🐍 Python 3.8+
- 🔗 LiveKit Agents SDK
- 🧠 Google Gemini API
🚀 Quick Setup
1️⃣ Clone & Navigate
git clone https://github.com/AbhyudayPatel/DuoTalk.git
cd DuoTalk
2️⃣ Install Dependencies
python -m venv venv
# Windows:
venv\Scripts\activate
# macOS/Linux:
# source venv/bin/activate
pip install -r requirements.txt
3️⃣ Environment Configuration
Create a .env file in your project root:
# Add your Google Gemini API key
GOOGLE_API_KEY=your_gemini_api_key_here
💡 Tip: Get your API key from Google AI Studio
🎮 Usage
🏃 Starting DuoTalk
# For 2 agents (friendly/discussion/debate):
python dual_voice_agents.py console
# For 4 agents (roundtable/friendly/debate):
python four_agents_duotalk.py console
📝 Interactive Setup
Step 1: 🎯 Choose Your Topic
Enter the topic for the conversation: _
Examples:
- The future of AI and robotics
- Climate change solutions
- Space exploration and Mars colonization
- The ethics of genetic engineering
Step 2: 🎭 Select Conversation Mode
Select conversation mode:
1. Friendly discussion (2 agents)
2. Debate format (2 agents)
3. Roundtable discussion (4 agents)
Enter your choice (1, 2, or 3): _
| Mode | 🤝 Friendly Discussion | ⚔️ Debate Format | 🌀 Roundtable |
|---|---|---|---|
| Style | Collaborative & supportive | Opposing viewpoints | Diverse perspectives |
| Tone | Encouraging dialogue | Direct & contrary | Dynamic & engaging |
| Personas | Agent1 & Agent2 | Optimist vs Skeptic | Optimist, Skeptic, Pragmatist, Theorist |
| Voices | Puck & Charon | Puck & Charon | Puck & Charon (multiple roles) |
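The menu above maps a typed number to a mode and an agent count. A minimal sketch of that mapping follows; it is not the script's own `get_conversation_mode()`, and the names `ModeChoice`, `MENU`, and `resolve_choice` are hypothetical.

```python
from typing import NamedTuple


class ModeChoice(NamedTuple):
    mode: str
    agents: int


# Menu entries as shown in the prompt above.
MENU = {
    "1": ModeChoice("friendly", 2),
    "2": ModeChoice("debate", 2),
    "3": ModeChoice("roundtable", 4),
}


def resolve_choice(raw: str) -> ModeChoice:
    """Map the text a user typed at the prompt to a mode and agent count."""
    choice = MENU.get(raw.strip())
    if choice is None:
        raise ValueError("Enter 1, 2, or 3")
    return choice
```

Keeping the menu in a dictionary means adding a mode is a one-line change, and invalid input is rejected in a single place.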
⚙️ Configuration
🔧 Customization Options
| Setting | Default | How to Change |
|---|---|---|
| 🔄 Max Turns | 12 turns | Modify max_turns in ConversationState |
| 🎤 Agent Voices | Puck & Charon | Update voice parameters in code |
| 🤖 AI Model | gemini-2.5-flash-preview-native-audio-dialog | Change model string |
| 💬 Response Length | One-line responses | Modify instructions in DualPersonaAgent |
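To show how the defaults in the table might be grouped and how a turn limit is typically enforced, here is a hypothetical stand-in. It is not the project's `ConversationState`: the class name `ConversationSettings`, its fields, and its methods are assumptions, with default values copied from the table.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class ConversationSettings:
    """Hypothetical stand-in for the settings in the table above."""

    max_turns: int = 12
    voices: Tuple[str, str] = ("Puck", "Charon")
    model: str = "gemini-2.5-flash-preview-native-audio-dialog"
    turns_taken: int = 0

    def can_continue(self) -> bool:
        """True while the conversation is still under its turn limit."""
        return self.turns_taken < self.max_turns

    def record_turn(self) -> None:
        """Count one completed turn."""
        self.turns_taken += 1
```

A conversation loop would call `record_turn()` after each reply and stop once `can_continue()` returns False.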
🧩 Core Components
| Component | 🎯 Purpose |
|---|---|
| ConversationState | 📊 Manages conversation state and settings |
| DualPersonaAgent | 🎭 Main agent class with dual persona support |
| get_conversation_mode() | 📝 Handles user input for conversation mode |
| run_friendly_conversation() | 🤝 Manages friendly discussion flow |
| run_debate_conversation() | ⚔️ Manages debate flow with optimist/skeptic roles |
| safe_generate_reply() | 🛡️ Handles responses with error handling and retries |
🛡️ Error Handling & Reliability
DuoTalk includes several features aimed at keeping sessions stable:
🔍 Comprehensive Error Management
| Feature | Description |
|---|---|
| 📊 Session Health Monitoring | Real-time health checks |
| 🔄 Automatic Retries | Smart retry logic for failed responses |
| 🧹 Graceful Cleanup | Proper resource management |
| 📝 Detailed Logging | Comprehensive debugging information |
| ⏱️ Timeout Protection | Prevents hanging operations |
| 🔧 Recovery Mechanisms | Automatic error recovery |
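Timeout protection and automatic retries are often combined in a single wrapper. The sketch below shows one way to do that with asyncio. It is an illustration under assumptions, not the project's `safe_generate_reply()`: the name `reply_with_timeout`, its parameters, and its defaults are hypothetical.

```python
import asyncio
from typing import Awaitable, Callable, Optional


async def reply_with_timeout(
    generate: Callable[[], Awaitable[str]],
    timeout: float = 10.0,
    retries: int = 2,
) -> Optional[str]:
    """Await `generate`, retrying on timeout or error; return None if all attempts fail."""
    for _ in range(retries + 1):
        try:
            # wait_for cancels the pending call once the timeout elapses.
            return await asyncio.wait_for(generate(), timeout=timeout)
        except asyncio.TimeoutError:
            continue  # too slow: try again
        except Exception:
            continue  # failed: try again (a real system would log this)
    return None
```

Returning None instead of raising lets the caller skip a turn and keep the session alive, which is one reading of "graceful" recovery.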
📄 License
MIT License - See LICENSE file for details
Experience the future of AI conversation today!