LiveKit Agents Plugin for services from AWS
AWS Plugin for LiveKit Agents
Complete AWS AI integration for LiveKit Agents, including Bedrock, Polly, Transcribe, and realtime speech-to-speech support for Amazon Nova Sonic
What's included:
- RealtimeModel - Amazon Nova 2 Sonic and Nova Sonic 1.0 for speech-to-speech
- LLM - Powered by Amazon Bedrock, defaults to Nova 2 Lite
- STT - Powered by Amazon Transcribe
- TTS - Powered by Amazon Polly
See https://docs.livekit.io/agents/integrations/aws/ for more information.
⚠️ Breaking Change
Default model changed to Nova 2 Sonic: RealtimeModel() now defaults to amazon.nova-2-sonic-v1:0 with modalities="mixed" (was amazon.nova-sonic-v1:0 with modalities="audio").
If you need the previous behavior, explicitly specify Nova Sonic 1.0:
model = aws.realtime.RealtimeModel.with_nova_sonic_1()
# or
model = aws.realtime.RealtimeModel(
    model="amazon.nova-sonic-v1:0",
    modalities="audio",
)
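One way to avoid being surprised by a future default change is to pin the model explicitly in your own code. The sketch below only builds the keyword arguments quoted in the note above; `realtime_kwargs` and the two constant names are hypothetical helpers, not part of the plugin.

```python
# Keyword arguments copied from the breaking-change note above.
NOVA_SONIC_1 = {"model": "amazon.nova-sonic-v1:0", "modalities": "audio"}
NOVA_SONIC_2 = {"model": "amazon.nova-2-sonic-v1:0", "modalities": "mixed"}


def realtime_kwargs(use_legacy: bool) -> dict:
    """Return explicit RealtimeModel keyword arguments (hypothetical helper)."""
    # Copy so callers can modify the result without touching the constants.
    return dict(NOVA_SONIC_1 if use_legacy else NOVA_SONIC_2)


print(realtime_kwargs(use_legacy=True))
```

With the plugin installed, the result could be passed along as `aws.realtime.RealtimeModel(**realtime_kwargs(use_legacy=True))`.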
Installation
pip install livekit-plugins-aws
# For Nova Sonic realtime models
pip install livekit-plugins-aws[realtime]
Prerequisites
AWS Credentials
You'll need AWS credentials with access to Amazon Bedrock. Set them as environment variables:
export AWS_ACCESS_KEY_ID=<your-access-key>
export AWS_SECRET_ACCESS_KEY=<your-secret-key>
export AWS_DEFAULT_REGION=us-east-1 # or your preferred region
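A small startup check can catch missing variables before the agent tries to reach AWS. This is a minimal sketch that covers only the environment-variable approach shown above; `missing_aws_vars` is a hypothetical helper, and credentials supplied by other means (profiles, instance roles) would not be detected by it.

```python
import os
from collections.abc import Mapping

# The three variables exported in the snippet above.
REQUIRED_AWS_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_DEFAULT_REGION")


def missing_aws_vars(env: Mapping[str, str] = os.environ) -> list:
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED_AWS_VARS if not env.get(name)]


missing = missing_aws_vars()
if missing:
    print("Missing AWS environment variables:", ", ".join(missing))
else:
    print("AWS environment variables are set.")
```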
Getting Temporary Credentials from SSO (Local Testing)
If you use AWS SSO for authentication, get temporary credentials for local testing:
# Login to your SSO profile
aws sso login --profile your-profile-name
# Export credentials from your SSO session
eval $(aws configure export-credentials --profile your-profile-name --format env)
# Verify credentials are set
aws sts get-caller-identity
Alternatively, add this to your shell profile for automatic credential export:
# Add to ~/.bashrc or ~/.zshrc
function aws-creds() {
  # Quote the profile name and the command output so values with spaces survive
  eval "$(aws configure export-credentials --profile "$1" --format env)"
}
# Usage: aws-creds your-profile-name
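If you would rather load the exported credentials from Python than `eval` them in a shell, the output can be parsed directly. The sketch below assumes the `--format env` output consists of `export NAME=value` lines; `parse_env_exports` is a hypothetical helper and the sample values are placeholders.

```python
import shlex


def parse_env_exports(text: str) -> dict:
    """Parse `export NAME=value` lines into a dict, ignoring other lines."""
    result = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("export "):
            continue
        # shlex.split removes optional shell quoting around the value.
        for token in shlex.split(line[len("export "):]):
            name, sep, value = token.partition("=")
            if sep:
                result[name] = value
    return result


sample = "export AWS_ACCESS_KEY_ID=EXAMPLEKEY\nexport AWS_SECRET_ACCESS_KEY='abc/def'\n"
print(sorted(parse_env_exports(sample)))
```

In practice the text could come from `subprocess.run(["aws", "configure", "export-credentials", "--profile", name, "--format", "env"], capture_output=True, text=True).stdout`, with the parsed values then copied into `os.environ`.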
Quick Start Example
The realtime_joke_teller.py example demonstrates both realtime and pipeline modes:
Demonstrates Both Modes
- Realtime mode: Nova 2 Sonic for end-to-end speech-to-speech
- Pipeline mode: Amazon Transcribe + Nova 2 Lite + Amazon Polly
Demonstrates Nova 2 Sonic Capabilities
- Text prompting: Agent greets users first using generate_reply()
- Multilingual support: Automatic language detection and response in 7 languages
- Multiple voices: 18 expressive voices across languages
- Function calling: Weather lookup, web search, and joke telling
Setup
- Install dependencies:
  pip install livekit-plugins-aws[realtime] \
    livekit-plugins-silero \
    jokeapi \
    duckduckgo-search \
    python-weather \
    python-dotenv
- Copy the example locally:
  curl -O https://raw.githubusercontent.com/livekit/agents/main/examples/voice_agents/realtime_joke_teller.py
- Set up environment variables:
  # Create .env file
  echo "AWS_DEFAULT_REGION=us-east-1" > .env
  # Add your AWS credentials (see Prerequisites above)
- (Optional) Run local LiveKit server:
  For testing without LiveKit Cloud, run a local server:
  # Install LiveKit server
  brew install livekit # macOS
  # or download from https://github.com/livekit/livekit/releases
  # Run in dev mode
  livekit-server --dev
  Add to your .env file:
  LIVEKIT_URL=wss://127.0.0.1:7880
  LIVEKIT_API_KEY=devkey
  LIVEKIT_API_SECRET=secret
  See self-hosting documentation for more details.
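Before running the example, it can help to confirm that the LiveKit settings from the .env file are actually present. The following is a rough sketch, assuming the variables have already been loaded into a mapping (for instance by python-dotenv into `os.environ`); `livekit_env_problems` is a hypothetical helper.

```python
from collections.abc import Mapping
from urllib.parse import urlparse

# Variable names taken from the .env snippet above.
REQUIRED_LIVEKIT_VARS = ("LIVEKIT_URL", "LIVEKIT_API_KEY", "LIVEKIT_API_SECRET")


def livekit_env_problems(env: Mapping[str, str]) -> list:
    """Return human-readable problems; an empty list means the settings look usable."""
    problems = [f"{name} is not set" for name in REQUIRED_LIVEKIT_VARS if not env.get(name)]
    url = env.get("LIVEKIT_URL", "")
    if url:
        parsed = urlparse(url)
        # A usable URL needs both a scheme and a host part.
        if not parsed.scheme or not parsed.hostname:
            problems.append(f"LIVEKIT_URL does not look like a URL: {url!r}")
    return problems


dev_env = {
    "LIVEKIT_URL": "wss://127.0.0.1:7880",
    "LIVEKIT_API_KEY": "devkey",
    "LIVEKIT_API_SECRET": "secret",
}
print(livekit_env_problems(dev_env))
```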
Running the Example
Realtime Mode (Nova 2 Sonic) - Recommended for testing:
python realtime_joke_teller.py console
This runs locally using your computer's speakers and microphone. Use a headset to prevent echo.
Multilingual Support: Nova 2 Sonic automatically detects and responds in your language. Just start speaking in your preferred language (English, French, Italian, German, Spanish, Portuguese, or Hindi) and Nova 2 Sonic will respond in the same language!
Pipeline Mode (Transcribe + Nova 2 Lite + Polly):
python realtime_joke_teller.py console --mode pipeline
Dev Mode (connect to LiveKit room for remote testing):
python realtime_joke_teller.py dev
# or
python realtime_joke_teller.py dev --mode pipeline
Try asking:
- "What's the weather in Seattle?"
- "Tell me a programming joke"
- "Search for information about my favorite movie, Short Circuit"
Features
Nova 2 Sonic Capabilities
Amazon Nova 2 Sonic is a unified speech-to-speech foundation model that delivers:
- Realtime bidirectional streaming - Low-latency, natural conversations
- Multilingual support - English, French, Italian, German, Spanish, Portuguese, and Hindi
- Automatic language mirroring - Responds in the user's spoken language
- Polyglot voices - Matthew and Tiffany can seamlessly switch between languages within a single conversation, ideal for multilingual applications
- 18 expressive voices - Multiple voices per language with natural prosody
- Function calling - Built-in tool use and agentic workflows
- Interruption handling - Graceful handling without losing context
- Noise robustness - Works in real-world environments
- Text input support - Programmatic text prompting
Model Selection
from livekit.plugins import aws
# Nova 2 Sonic (audio + text input, latest)
model = aws.realtime.RealtimeModel.with_nova_sonic_2()
# Nova Sonic 1.0 (audio-only, original model)
model = aws.realtime.RealtimeModel.with_nova_sonic_1()
Voice Selection
Voices are specified as lowercase strings. Import SONIC1_VOICES or SONIC2_VOICES type hints for IDE autocomplete.
from livekit.plugins import aws
from livekit.plugins.aws.experimental.realtime import SONIC2_VOICES
model = aws.realtime.RealtimeModel.with_nova_sonic_2(
    voice="carolina"  # Portuguese, feminine
)
Nova 2 Sonic Voice IDs (18 voices)
See the official documentation for the most up-to-date list and IDs.
- English (US): tiffany (polyglot), matthew (polyglot)
- English (UK): amy
- English (Australia): olivia
- English (India): kiara, arjun
- French: ambre, florian
- Italian: beatrice, lorenzo
- German: tina, lennart
- Spanish (US): lupe, carlos
- Portuguese (Brazil): carolina, leo
- Hindi: kiara, arjun
Note: Tiffany and Matthew in Nova 2 Sonic support polyglot mode, seamlessly switching between languages within a single conversation.
Nova Sonic 1.0 Voice IDs (11 voices)
See the official documentation for the most up-to-date list and IDs.
- English (US): tiffany, matthew
- English (UK): amy
- French: ambre, florian
- Italian: beatrice, lorenzo
- German: greta, lennart
- Spanish: lupe, carlos
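Because the two models accept different voice IDs, a lookup table can catch a mismatched voice before a session starts. The sketch below copies the lists above into plain dictionaries; the names `NOVA_SONIC_2_BY_LANGUAGE`, `NOVA_SONIC_1_BY_LANGUAGE`, `all_voices`, and `supports_voice` are hypothetical and are not the plugin's `SONIC1_VOICES` or `SONIC2_VOICES` type hints.

```python
# Voice IDs copied from the lists above; the official documentation is authoritative.
NOVA_SONIC_2_BY_LANGUAGE = {
    "English (US)": ["tiffany", "matthew"],
    "English (UK)": ["amy"],
    "English (Australia)": ["olivia"],
    "English (India)": ["kiara", "arjun"],
    "French": ["ambre", "florian"],
    "Italian": ["beatrice", "lorenzo"],
    "German": ["tina", "lennart"],
    "Spanish (US)": ["lupe", "carlos"],
    "Portuguese (Brazil)": ["carolina", "leo"],
    "Hindi": ["kiara", "arjun"],
}
NOVA_SONIC_1_BY_LANGUAGE = {
    "English (US)": ["tiffany", "matthew"],
    "English (UK)": ["amy"],
    "French": ["ambre", "florian"],
    "Italian": ["beatrice", "lorenzo"],
    "German": ["greta", "lennart"],
    "Spanish": ["lupe", "carlos"],
}


def all_voices(table: dict) -> set:
    """Flatten a language-to-voices table into a set of unique voice IDs."""
    return {voice for voices in table.values() for voice in voices}


def supports_voice(table: dict, voice: str) -> bool:
    """Check a voice ID against a table; IDs are lowercase strings."""
    return voice.lower() in all_voices(table)


print(supports_voice(NOVA_SONIC_2_BY_LANGUAGE, "carolina"))
print(supports_voice(NOVA_SONIC_1_BY_LANGUAGE, "carolina"))
```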
Text Prompting with generate_reply()
Nova 2 Sonic supports programmatic text input. This can be used to trigger agent responses or to mix speech and text input within a UI in the same conversation:
class Assistant(Agent):
    async def on_enter(self):
        # Make the agent speak first with a greeting
        await self.session.generate_reply(
            instructions="Greet the user and introduce your capabilities"
        )
instructions vs user_input
The generate_reply() method accepts two parameters with different behaviors:
instructions - System-level commands (recommended):
await session.generate_reply(
    instructions="Greet the user warmly and ask how you can help"
)
- Sent as a system prompt/command to the model
- Triggers immediate generation
- Does not appear in conversation history as user message
- Use for: Agent-initiated speech, prompting specific behaviors
user_input - Simulated user messages:
await session.generate_reply(
    user_input="Hello, I need help with my account"
)
- Sent as interactive USER role content
- Added to Nova's conversation context
- Triggers generation as if user spoke
- Use for: Testing, simulating user input, programmatic conversations
When to use each:
- Agent greetings: Use instructions - agent should speak without user input
- Guided responses: Use instructions - direct the agent's next action
- Simulated conversations: Use user_input - test multi-turn dialogs
- Programmatic user input: Use user_input - inject text as if user spoke
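The difference between the two parameters can be exercised offline with a stand-in session that simply records calls. This is a sketch for local testing only: `RecordingSession` and `greet_then_simulate` are hypothetical, and the stand-in mimics nothing beyond the two keyword arguments described above.

```python
import asyncio
from typing import Optional


class RecordingSession:
    """Stand-in for an agent session that records generate_reply() calls."""

    def __init__(self) -> None:
        self.calls = []

    async def generate_reply(
        self,
        *,
        instructions: Optional[str] = None,
        user_input: Optional[str] = None,
    ) -> None:
        # Keep only the arguments that were actually supplied.
        supplied = {"instructions": instructions, "user_input": user_input}
        self.calls.append({k: v for k, v in supplied.items() if v is not None})


async def greet_then_simulate(session) -> None:
    # Agent-initiated speech: a system-level instruction.
    await session.generate_reply(instructions="Greet the user warmly and ask how you can help")
    # Simulated user turn: text injected as if the user had spoken.
    await session.generate_reply(user_input="Hello, I need help with my account")


demo_session = RecordingSession()
asyncio.run(greet_then_simulate(demo_session))
print(demo_session.calls)
```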
Turn-Taking Sensitivity
Control how quickly the agent responds to pauses:
model = aws.realtime.RealtimeModel.with_nova_sonic_2(
    turn_detection="MEDIUM"  # HIGH, MEDIUM (default), LOW
)
- HIGH: Fastest response time, optimized for latency. May interrupt slower speakers
- MEDIUM: Balanced approach with moderate response time. Reduces false positives while maintaining responsiveness (recommended)
- LOW: Slowest response time with maximum patience, better for hesitant speakers
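If the sensitivity comes from configuration rather than a literal, normalizing it first avoids passing an unexpected string to the model. The sketch below assumes only the three values listed above are valid; `normalize_turn_detection` is a hypothetical helper, not a plugin function.

```python
from typing import Optional

# The three levels listed above; MEDIUM is the documented default.
TURN_DETECTION_LEVELS = ("HIGH", "MEDIUM", "LOW")


def normalize_turn_detection(value: Optional[str], default: str = "MEDIUM") -> str:
    """Return an upper-case level, falling back to the default when unset."""
    if value is None or not value.strip():
        return default
    level = value.strip().upper()
    if level not in TURN_DETECTION_LEVELS:
        raise ValueError(
            f"turn_detection must be one of {TURN_DETECTION_LEVELS}, got {value!r}"
        )
    return level


print(normalize_turn_detection("high"))
```

The result could then be passed as the `turn_detection` argument shown above.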
Complete Example
from livekit import agents
from livekit.agents import Agent, AgentSession
from livekit.plugins import aws
from dotenv import load_dotenv
load_dotenv()
class Assistant(Agent):
    def __init__(self):
        super().__init__(
            instructions="You are a helpful voice assistant powered by Amazon Nova 2 Sonic."
        )
    async def on_enter(self):
        await self.session.generate_reply(
            instructions="Greet the user and offer assistance"
        )
server = agents.AgentServer()
@server.rtc_session()
async def entrypoint(ctx: agents.JobContext):
    await ctx.connect()
    session = AgentSession(
        llm=aws.realtime.RealtimeModel.with_nova_sonic_2(
            voice="matthew",
            turn_detection="MEDIUM",
            tool_choice="auto"
        )
    )
    await session.start(room=ctx.room, agent=Assistant())
if __name__ == "__main__":
    agents.cli.run_app(server)
Pipeline Mode (STT + LLM + TTS)
For more control over individual components, use pipeline mode:
from livekit.agents import AgentSession
from livekit.plugins import aws, silero
session = AgentSession(
    stt=aws.STT(),  # Amazon Transcribe
    llm=aws.LLM(),  # Nova 2 Lite (default)
    tts=aws.TTS(),  # Amazon Polly
    vad=silero.VAD.load(),
)
Nova 2 Lite
Amazon Nova 2 Lite is a fast, cost-effective reasoning model optimized for everyday AI workloads:
- Lightning-fast processing - Very low latency for real-time conversations
- Cost-effective - Industry-leading price-performance
- Multimodal inputs - Text, image, and video
- 1 million token context window - Handle long conversations and complex context
- Agentic workflows - RAG systems, function calling, tool use
- Fine-tuning support - Customize for your specific use case
Ideal for pipeline mode where you need fast, accurate LLM responses in voice applications.