
fastrtc-voice-agent

A modular voice agent built on FastRTC with swappable STT/TTS/LLM backends.

Installation

Recommended: Using uv

uv is the recommended way to manage your Python environment and dependencies.

# Create a virtual environment with Python 3.12+
uv venv --python 3.12

# Activate the environment
source .venv/bin/activate

# Install the package
uv pip install fastrtc-voice-agent

# Install with optional dependencies (e.g., ollama)
uv pip install "fastrtc-voice-agent[ollama]"

# Or install all optional dependencies
uv pip install "fastrtc-voice-agent[all]"

Using pip

pip install fastrtc-voice-agent

# Install with your desired STT and LLM backends, for example:
pip install "fastrtc-voice-agent[ollama]"

# Or for all optional dependencies:
pip install "fastrtc-voice-agent[all]"

CLI Usage Example

To run with the default configuration:

fastrtc-voice-agent --run

For custom configuration, refer to the built-in help:

fastrtc-voice-agent --help

Python Usage Example

from fastrtc import ReplyOnPause, Stream
from voice_agent import create_agent, AgentConfig, STTConfig, TTSConfig, LLMConfig

config = AgentConfig(
    system_prompt="You are a helpful voice assistant.",
    stt=STTConfig(backend="faster_whisper", model_size="small"),
    tts=TTSConfig(backend="edge", voice="en-US-AvaMultilingualNeural"),
    llm=LLMConfig(backend="ollama", model="llama3.2:3b"),
)

agent = create_agent(config)

stream = Stream(
    ReplyOnPause(agent.create_fastrtc_handler()),
    modality="audio",
    mode="send-receive",
)

stream.ui.launch()
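Because the STT, TTS, and LLM backends are pluggable, switching providers is purely a configuration change. A minimal sketch of the same agent pointed at a different LLM backend — note the "anthropic" backend name and the model string are assumptions for illustration, not confirmed identifiers; check the package documentation for the supported values:

```python
from voice_agent import AgentConfig, STTConfig, TTSConfig, LLMConfig

# Same agent as above, different LLM backend: only the config changes.
# "anthropic" and the model name below are illustrative assumptions.
config = AgentConfig(
    system_prompt="You are a helpful voice assistant.",
    stt=STTConfig(backend="faster_whisper", model_size="small"),
    tts=TTSConfig(backend="edge", voice="en-US-AvaMultilingualNeural"),
    llm=LLMConfig(backend="anthropic", model="claude-3-5-haiku-latest"),
)
```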

Custom Frontend Integration

If you want to use your own frontend (React, Vue, etc.) instead of the built-in Gradio UI, you can run the agent as an API server.

CLI - API Mode

# Install with API support
pip install "fastrtc-voice-agent[api]"

# Run as API server (no Gradio UI)
fastrtc-voice-agent --run --api --port 8000

This exposes WebRTC endpoints:

  • POST /webrtc/offer - WebRTC signaling
  • WS /websocket/offer - WebSocket alternative
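From a custom frontend, signaling works by POSTing the browser's SDP offer to /webrtc/offer. The sketch below shows one plausible shape for the JSON request body; the field names are assumptions based on typical FastRTC clients, so verify them against the examples/react-client code:

```python
import json
import uuid

# Sketch of a JSON body for POST /webrtc/offer.
# Field names are assumed, not confirmed by this package's docs.
def build_offer_payload(sdp: str) -> dict:
    return {
        "sdp": sdp,                      # local SDP offer from RTCPeerConnection
        "type": "offer",                 # SDP message type
        "webrtc_id": str(uuid.uuid4()),  # id correlating this session
    }

payload = build_offer_payload("v=0\r\n...")  # truncated SDP for illustration
print(json.dumps(payload, indent=2))
```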

Python - API Server

from voice_agent import create_api_server, AgentConfig, STTConfig, TTSConfig, LLMConfig

# Create a FastAPI app with the voice agent
app = create_api_server(
    config=AgentConfig(
        system_prompt="You are a helpful assistant.",
        stt=STTConfig(backend="faster_whisper"),
        tts=TTSConfig(backend="edge"),
        llm=LLMConfig(backend="ollama"),
    )
)

# Run with: uvicorn main:app --host 0.0.0.0 --port 8000

You can also mount it in an existing FastAPI app:

from fastapi import FastAPI
from voice_agent import create_api_server

main_app = FastAPI()
voice_app = create_api_server()
main_app.mount("/voice", voice_app)

React Example

See the examples/react-client directory for a complete React example with a useVoiceAgent hook.

Quick example:

import { useVoiceAgent } from "./useVoiceAgent";

function App() {
  const { isConnected, connect, disconnect } = useVoiceAgent({
    serverUrl: "http://localhost:8000",
  });

  return (
    <button onClick={isConnected ? disconnect : connect}>
      {isConnected ? "Stop" : "Start"}
    </button>
  );
}

Note

To use the Anthropic API (other providers such as OpenAI may be supported later), copy .env.example to .env and fill in your API key and the desired model.
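A hypothetical .env sketch — the variable names below are illustrative only; the authoritative names are whatever .env.example defines:

```shell
# Illustrative only: copy .env.example and keep its actual variable names.
ANTHROPIC_API_KEY=sk-ant-...   # your Anthropic API key
ANTHROPIC_MODEL=claude-...     # the model the agent should use
```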
