Advanced multi-agent voice conversation system with customizable personas and modes

DuoTalk 🎭

Advanced Multi-Agent Voice Conversation System

DuoTalk is a comprehensive Python package for creating engaging multi-agent voice conversations with customizable personas, conversation modes, and easy integration capabilities. Built on top of LiveKit and Google Gemini, it provides a powerful yet simple API for generating dynamic conversations between AI agents with distinct personalities.

🌟 Features

🎭 Rich Persona Library: 14+ pre-defined personas (Optimist, Skeptic, Pragmatist, etc.)
🗣️ Multiple Conversation Modes: Debate, Roundtable, Interview, Panel, Socratic, and more
🎙️ Voice Integration: Full voice synthesis using Google Gemini's native audio
⚡ Easy Setup: Simple pip installation and intuitive API
🔧 Highly Customizable: Create custom personas, modes, and conversation flows
📊 Analytics: Built-in conversation metrics and performance tracking
🖥️ CLI Interface: Command-line tool for quick conversations
📝 Conversation Logging: Automatic conversation transcription and analysis
🎯 Multiple Use Cases: Education, brainstorming, testing, entertainment

🚀 Quick Start

Installation

pip install duotalk

Or using uv:

uv add duotalk

DuoTalk 🎭

Advanced Multi‑Agent Voice Conversation System

DuoTalk lets you create engaging conversations between AI agents with distinct personas across multiple modes (debate, roundtable, interview, and more). Use the Python API for full control or the CLI to get started in seconds. Optional YouTube summarization turns any video into a spoken, natural summary.

🌟 Highlights

🎭 Personas library: 14+ ready-to-use personas (optimist, skeptic, pragmatist, theorist, educator, scientist, artist, and more)
💬 Conversation modes: friendly, debate, roundtable, interview, panel, socratic
🧪 Quick start helpers: one‑liners like quick_debate() and quick_roundtable()
🧱 Builder API: fluent, composable setup with ConversationBuilder
🖥️ CLI: start demos, list personas/modes, summarize YouTube videos
🎬 YouTube summarizer: AI‑powered short or detailed summaries, optional voice
🔊 Voice ready: integrates with DuoTalk’s voice runner (LiveKit optional)
🧩 Typed APIs: shipped with py.typed for great editor/IDE support

🚀 Installation

pip install duotalk

Optional extras for YouTube summaries:

pip install yt-dlp requests google-generativeai

🔑 Environment

Create a .env file (or export environment variables) for optional features:

# For YouTube summarization with Google Gemini
GOOGLE_API_KEY=your_google_api_key   # or GEMINI_API_KEY

# For real-time voice via LiveKit (optional)
LIVEKIT_API_KEY=your_key
LIVEKIT_API_SECRET=your_secret
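
As a rough preflight check, the variables above can be inspected before a run. This is a minimal sketch using only the standard library. The function name `detect_features` and the feature labels are illustrative assumptions, not part of DuoTalk, and it does not show how DuoTalk itself reads its configuration.

```python
import os
from typing import Dict, Mapping


def detect_features(env: Mapping[str, str] = os.environ) -> Dict[str, bool]:
    """Report which optional features have their variables set.

    Hypothetical helper: the variable names come from the README,
    the feature labels are made up for this sketch.
    """
    # Either key name is accepted for Gemini, per the README comment.
    has_gemini = bool(env.get("GOOGLE_API_KEY") or env.get("GEMINI_API_KEY"))
    # LiveKit needs both the key and the secret.
    has_livekit = bool(env.get("LIVEKIT_API_KEY") and env.get("LIVEKIT_API_SECRET"))
    return {"youtube_summaries": has_gemini, "livekit_voice": has_livekit}


if __name__ == "__main__":
    for feature, ready in detect_features().items():
        print(f"{feature}: {'configured' if ready else 'missing variables'}")
```

Passing a plain dict instead of `os.environ` makes the check easy to exercise without touching the real environment.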

🧭 Quick Start (Python)

Use quick helpers for the fastest path:

import asyncio
from duotalk import quick_debate, quick_roundtable

async def main():
    # Debate mode (optimist vs skeptic by default)
    runner = quick_debate("Should AI replace human creativity?", max_turns=12, voice=False)
    await runner.start()

    # Roundtable with four personas
    runner = quick_roundtable("Future of renewable energy", max_turns=10, voice=False)
    await runner.start()

asyncio.run(main())

Prefer a fluent Builder:

import asyncio
from duotalk import conversation

# Configure the conversation step by step, then build the runner
runner = (conversation()
    .with_topic("Climate change solutions")
    .with_mode("roundtable")
    .with_personas("pragmatist", "theorist", "skeptic")
    .with_max_turns(10)
    .with_voice_enabled(False)  # set True when LiveKit voice is configured
    .build_and_start())

# Start the conversation
asyncio.run(runner.start())

🖥️ CLI Usage

The CLI bundles common workflows. Run duotalk --help for all options.

# Demo a conversation in the terminal
duotalk demo "Pineapple on pizza" --mode debate --max-turns 6

# Start (create config) for any mode
duotalk start "AI ethics in hiring" --mode roundtable --personas optimist,skeptic,analyst

# Presets
duotalk preset business "Quarterly planning"
duotalk preset academic "The role of peer review"
duotalk preset creative "Designing for delight"
duotalk preset policy "AI regulation roadmap"

# Explore available options
duotalk list-personas
duotalk list-modes

# Interactive builder
duotalk interactive

# YouTube summarization (short or detailed via prompt)
duotalk summarize "https://www.youtube.com/watch?v=VIDEO_ID" --voice --save

Commands provided by the CLI:

start – build a conversation config for a mode/personas
demo – run a text‑mode demo in the terminal
preset – apply a named preset: business, academic, creative, policy, debate, roundtable, interview, panel
summarize – summarize a YouTube video (optional voice)
list-personas – list all persona names
list-modes – list all conversation modes
interactive – step‑by‑step guided setup
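
The demo command can also be driven from a script. The sketch below only assembles the argument list shown in the CLI examples (`demo`, `--mode`, `--max-turns`) and hands it to `subprocess`. The helper names `build_demo_command` and `run_demo` are hypothetical, and the sketch assumes the `duotalk` executable is on `PATH`.

```python
import shutil
import subprocess
from typing import List, Optional


def build_demo_command(topic: str, mode: str = "debate", max_turns: int = 6) -> List[str]:
    """Assemble the argv for `duotalk demo`, using only flags shown in the README."""
    return ["duotalk", "demo", topic, "--mode", mode, "--max-turns", str(max_turns)]


def run_demo(topic: str, mode: str = "debate", max_turns: int = 6) -> Optional[int]:
    """Run the demo and return its exit code, or None if the CLI is not installed."""
    if shutil.which("duotalk") is None:
        return None
    completed = subprocess.run(build_demo_command(topic, mode, max_turns), check=False)
    return completed.returncode
```

Passing the topic as a separate list element avoids shell quoting problems for topics that contain spaces or punctuation.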

🎬 YouTube Summaries (Python)

Two options are available:

High‑level convenience

import asyncio
from duotalk.core.youtube_summarizer import summarize_youtube_video

async def main():
    result = await summarize_youtube_video(
        url="https://www.youtube.com/watch?v=VIDEO_ID",
        use_voice=False,
        summary_mode="detailed"  # or "short"
    )
    if result["success"]:
        print(result["summary"])  # natural, speech‑friendly text

asyncio.run(main())

Command line

duotalk summarize "https://www.youtube.com/watch?v=VIDEO_ID" --voice --save

Notes:

Requires yt-dlp and requests for transcript fetching.
Provide GOOGLE_API_KEY (or GEMINI_API_KEY) to enable AI summaries.
Voice playback is optional and depends on your voice setup.
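
Before requesting a summary it can help to check that a URL actually points at a video. The package lists its own `extract_video_id` and `validate_youtube_url` helpers, whose behavior is not documented here. The function below is an independent, hypothetical sketch built on `urllib.parse`, and it covers only the two most common URL shapes.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Hosts accepted by this sketch; the real package may accept more.
_WATCH_HOSTS = {"www.youtube.com", "youtube.com", "m.youtube.com"}


def sketch_extract_video_id(url: str) -> Optional[str]:
    """Return the video ID from a watch or youtu.be URL, else None."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host in _WATCH_HOSTS and parsed.path == "/watch":
        # The ID is carried in the `v` query parameter.
        values = parse_qs(parsed.query).get("v", [])
        return values[0] if values else None
    if host == "youtu.be":
        # Short links carry the ID as the path.
        return parsed.path.lstrip("/") or None
    return None
```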

Personas

Available persona names include:

optimist, pessimist, pragmatist, theorist, skeptic, enthusiast, mediator, analyst, creative, logical thinker, educator, entrepreneur, scientist, artist

Pick any by name in the Builder, quick helpers, or CLI.
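
Since personas are selected by name, a typo is an easy mistake to make. The sketch below checks names against the list above and suggests the closest match. `check_personas` is a hypothetical helper, and the names are copied as printed in this README, so the exact identifiers the package accepts (for example for "logical thinker") are an assumption.

```python
import difflib
from typing import Iterable, List

# Names as listed in the README; the package's internal identifiers may differ.
PERSONAS = (
    "optimist", "pessimist", "pragmatist", "theorist", "skeptic",
    "enthusiast", "mediator", "analyst", "creative", "logical thinker",
    "educator", "entrepreneur", "scientist", "artist",
)


def check_personas(names: Iterable[str]) -> List[str]:
    """Normalize persona names and raise ValueError on unknown ones."""
    normalized = []
    for name in names:
        key = name.strip().lower()
        if key not in PERSONAS:
            # Offer the closest known name, if any is similar enough.
            hint = difflib.get_close_matches(key, PERSONAS, n=1)
            suffix = f" (did you mean {hint[0]!r}?)" if hint else ""
            raise ValueError(f"unknown persona {name!r}{suffix}")
        normalized.append(key)
    return normalized
```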

🧩 Conversation Modes

friendly – collaborative discussion
debate – structured argument with opposing viewpoints
roundtable – multi‑participant exchange
interview – interviewer with one or more interviewees
panel – moderator plus subject‑matter experts
socratic – question‑driven exploration

🔊 Voice

The package supports voice‑enabled runs via DuoTalk’s voice runner. You can work in demo (text) mode without any voice setup. To enable voice, configure your audio stack (e.g., LiveKit credentials) and set .with_voice_enabled(True) or pass voice=True to quick helpers. The CLI will indicate when a voice session is required.

Python API Surface (at a glance)

Quick helpers: quick_debate, quick_roundtable, quick_friendly, quick_interview, quick_panel, quick_socratic, quick_start
Builder: ConversationBuilder and conversation()
Convenience creators: create_debate, create_roundtable, create_friendly_chat, create_interview, create_panel, create_socratic, create_random_conversation, presets (business/academic/creative/policy)
YouTube: summarize_youtube_video, validate_youtube_url, extract_video_id

🐍 Requirements

Python 3.8+
Optional: yt-dlp, requests, google-generativeai for YouTube summaries
Optional: voice runtime (e.g., LiveKit) if you enable audio
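
The requirements above can be checked programmatically. This is a small sketch, not something DuoTalk ships: `module_available` and `preflight` are hypothetical names, and the import names (`yt_dlp`, `requests`, `google.generativeai`) are the modules those PyPI packages install.

```python
import importlib.util
import sys
from typing import Dict

# PyPI package name -> importable module name.
OPTIONAL_MODULES = {
    "yt-dlp": "yt_dlp",
    "requests": "requests",
    "google-generativeai": "google.generativeai",
}


def module_available(module_name: str) -> bool:
    """True if the module can be located without fully importing it."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # Raised when a parent package (e.g. `google`) is missing or malformed.
        return False


def preflight() -> Dict[str, bool]:
    """Check the Python version and each optional dependency."""
    report = {"python>=3.8": sys.version_info >= (3, 8)}
    for package, module in OPTIONAL_MODULES.items():
        report[package] = module_available(module)
    return report


if __name__ == "__main__":
    for item, ok in preflight().items():
        print(f"{item}: {'ok' if ok else 'missing'}")
```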

📄 License

MIT – see LICENSE.

—

Build dynamic agent conversations, fast. If you have ideas for new personas or modes, PRs and issues are welcome.

Exponential backoff retry logic for YouTube API rate limits (429 errors)
Progressive wait times with intelligent retry strategies
Comprehensive logging for debugging and monitoring
Graceful degradation when services are temporarily unavailable
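
The retry behavior listed above is not shown in this README, so the following is only a generic sketch of exponential backoff, not DuoTalk's implementation. `RateLimited` is a hypothetical stand-in for an HTTP 429 error, and the delays, cap, and jitter are illustrative choices.

```python
import random
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical stand-in for an HTTP 429 (rate limit) response."""


def backoff_delays(retries: int, base: float = 1.0, cap: float = 30.0) -> List[float]:
    """Delays that double on each attempt and never exceed `cap`."""
    return [min(cap, base * 2 ** attempt) for attempt in range(retries)]


def retry_with_backoff(
    call: Callable[[], T],
    retries: int = 4,
    base: float = 1.0,
    cap: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
    jitter: bool = True,
) -> T:
    """Call `call`, waiting progressively longer after each RateLimited error."""
    for delay in backoff_delays(retries, base, cap):
        try:
            return call()
        except RateLimited:
            # Optional random jitter spreads out clients that retry together.
            extra = random.uniform(0, delay / 2) if jitter else 0.0
            sleep(delay + extra)
    # Final attempt: any exception now propagates to the caller.
    return call()
```

The `sleep` parameter is injectable so the logic can be exercised without real waiting.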

🐛 Bug Fixes

Fixed single agent chat mode - no more unwanted multi-agent conversations
Improved error handling across all conversation types
Enhanced session management for better stability

🏗️ Code Architecture

📋 Requirements

Prerequisites for running DuoTalk

🐍 Python 3.8+
🔗 LiveKit Agents SDK
🧠 Google Gemini API

🚀 Quick Setup

1️⃣ Clone & Navigate

git clone https://github.com/AbhyudayPatel/DuoTalk.git
cd DuoTalk

2️⃣ Install Dependencies

python -m venv venv
venv\Scripts\activate        # Windows; on macOS/Linux use: source venv/bin/activate
pip install -r requirements.txt

3️⃣ Environment Configuration

Create a .env file in your project root:

# Add your Google Gemini API key
GOOGLE_API_KEY=your_gemini_api_key_here

💡 Tip: Get your API key from Google AI Studio
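
To illustrate the .env format, here is a minimal parser sketch that uses only the standard library. `parse_env_text` is a hypothetical helper. How the project itself loads the file is not stated here, and this sketch deliberately ignores edge cases such as quoted values that contain ` #`.

```python
from typing import Dict


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse simple KEY=value lines, skipping blanks and `#` comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop a trailing inline comment such as "  # or GEMINI_API_KEY".
        value = value.split(" #", 1)[0].strip()
        # Remove surrounding single or double quotes, if present.
        values[key.strip()] = value.strip("'\"")
    return values
```

The resulting dictionary can be merged into `os.environ` with `os.environ.update(...)` before starting a conversation.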

🎮 Usage

🏃‍♂️ Starting DuoTalk

# For 2 agents (friendly/discussion/debate):
python dual_voice_agents.py console
# For 4 agents (roundtable/friendly/debate):
python four_agents_duotalk.py console

📝 Interactive Setup

Step 1: 🎯 Choose Your Topic

Enter the topic for the conversation: _

Examples:

The future of AI and robotics
Climate change solutions
Space exploration and Mars colonization
The ethics of genetic engineering

Step 2: 🎭 Select Conversation Mode

Select conversation mode:
1. Friendly discussion (2 agents)
2. Debate format (2 agents)
3. Roundtable discussion (4 agents)
Enter your choice (1, 2, or 3): _

Mode	🤝 Friendly Discussion	⚔️ Debate Format	🌀 Roundtable
Style	Collaborative & supportive	Opposing viewpoints	Diverse perspectives
Tone	Encouraging dialogue	Direct & contrary	Dynamic & engaging
Personas	Agent1 & Agent2	Optimist vs Skeptic	Optimist, Skeptic, Pragmatist, Theorist
Voices	Puck & Charon	Puck & Charon	Puck & Charon (multiple roles)
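
The menu in Step 2 maps a numeric choice to a mode and an agent count. A minimal sketch of that mapping follows. The names `ModeChoice`, `MENU`, and `parse_mode_choice` are hypothetical, and the script's real `get_conversation_mode()` may validate input differently.

```python
from typing import NamedTuple


class ModeChoice(NamedTuple):
    mode: str
    agents: int


# Choices and agent counts as shown in the interactive prompt.
MENU = {
    "1": ModeChoice("friendly", 2),
    "2": ModeChoice("debate", 2),
    "3": ModeChoice("roundtable", 4),
}


def parse_mode_choice(raw: str) -> ModeChoice:
    """Translate the user's menu input into a mode and agent count."""
    choice = raw.strip()
    if choice not in MENU:
        raise ValueError(f"expected 1, 2, or 3, got {raw!r}")
    return MENU[choice]
```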

⚙️ Configuration

🔧 Customization Options

Setting	Default	How to Change
🔄 Max Turns	12 turns	Modify `max_turns` in `ConversationState`
🎤 Agent Voices	Puck & Charon	Update voice parameters in code
🤖 AI Model	`gemini-2.5-flash-preview-native-audio-dialog`	Change model string
💬 Response Length	One-line responses	Modify instructions in `DualPersonaAgent`

🧩 Core Components

Component	🎯 Purpose
`ConversationState`	📊 Manages conversation state and settings
`DualPersonaAgent`	🎭 Main agent class with dual persona support
`get_conversation_mode()`	📝 Handles user input for conversation mode
`run_friendly_conversation()`	🤝 Manages friendly discussion flow
`run_debate_conversation()`	⚔️ Manages debate flow with optimist/skeptic roles
`safe_generate_reply()`	🛡️ Handles responses with error handling and retries
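
To make the role of a state object with a `max_turns` limit concrete, here is a hypothetical sketch of turn tracking. It is not the project's `ConversationState`, whose fields are not documented here. Only the default of 12 turns and the Optimist/Skeptic roles come from this README.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class SketchConversationState:
    """Illustrative stand-in: tracks turns and alternates speakers."""

    topic: str
    speakers: Tuple[str, ...] = ("Optimist", "Skeptic")
    max_turns: int = 12
    transcript: List[Tuple[str, str]] = field(default_factory=list)

    @property
    def finished(self) -> bool:
        """True once the turn limit has been reached."""
        return len(self.transcript) >= self.max_turns

    def next_speaker(self) -> str:
        """Speakers take turns in round-robin order."""
        return self.speakers[len(self.transcript) % len(self.speakers)]

    def record(self, text: str) -> None:
        """Store one reply for the current speaker."""
        if self.finished:
            raise RuntimeError("conversation already reached max_turns")
        self.transcript.append((self.next_speaker(), text))
```

With four speakers in the tuple, the same round-robin logic would cover a roundtable.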

🛡️ Error Handling & Reliability

DuoTalk includes the following reliability features:

🔍 Comprehensive Error Management

Feature	Description
📊 Session Health Monitoring	Real-time health checks
🔄 Automatic Retries	Smart retry logic for failed responses
🧹 Graceful Cleanup	Proper resource management
📝 Detailed Logging	Comprehensive debugging information
⏱️ Timeout Protection	Prevents hanging operations
🔧 Recovery Mechanisms	Automatic error recovery
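
Timeout protection and automatic retries can be combined in a few lines of `asyncio`. The sketch below shows the general pattern only. `reply_with_timeout` is a hypothetical name, the default timeout and retry count are arbitrary, and the project's `safe_generate_reply()` may work differently.

```python
import asyncio
import logging
from typing import Awaitable, Callable, Optional

logger = logging.getLogger("duotalk.sketch")


async def reply_with_timeout(
    generate: Callable[[], Awaitable[str]],
    timeout: float = 10.0,
    retries: int = 2,
) -> Optional[str]:
    """Await `generate()` with a time limit, retrying on timeout or error.

    Returns None if every attempt fails, so the caller can skip the turn
    instead of crashing the whole session.
    """
    for attempt in range(1, retries + 2):
        try:
            # wait_for cancels the call if it exceeds the time limit.
            return await asyncio.wait_for(generate(), timeout=timeout)
        except asyncio.TimeoutError:
            logger.warning("attempt %d timed out after %.1fs", attempt, timeout)
        except Exception as exc:
            logger.warning("attempt %d failed: %s", attempt, exc)
    return None
```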

📄 License

MIT License - See LICENSE file for details

Experience the future of AI conversation today!


Release history

1.0.7 – Sep 18, 2025
1.0.6 – Sep 16, 2025 (this version)
1.0.5 – Sep 15, 2025
1.0.4 – Aug 22, 2025
1.0.3 – Aug 22, 2025
1.0.2 – Aug 22, 2025
1.0.1 – Aug 21, 2025
1.0.0 – Aug 21, 2025
