A modular voice agent with swappable STT/TTS/LLM backends

fastrtc-voice-agent

A modular voice agent built on FastRTC

Installation

Recommended: Using uv

uv is the recommended way to manage your Python environment and dependencies.

# Create a virtual environment with Python 3.12+
uv venv --python 3.12

# Activate the environment
source .venv/bin/activate

# Install the package
uv add fastrtc-voice-agent

# Install with optional dependencies (e.g., ollama)
uv add "fastrtc-voice-agent[ollama]"

# Or install all optional dependencies
uv add "fastrtc-voice-agent[all]"

Using pip

pip install fastrtc-voice-agent

# Install with the optional backend you need (for example, ollama):
pip install "fastrtc-voice-agent[ollama]"

# Or for all optional dependencies:
pip install "fastrtc-voice-agent[all]"

CLI Usage Example

To run with the default configuration:

fastrtc-voice-agent --run

For custom configuration, refer to the help output:

fastrtc-voice-agent --help
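
If the agent has to be started from a Python script (for instance from a test harness), the CLI can be wrapped with subprocess. The following is a minimal sketch that uses only the invocations shown in this README (--run on its own, or --run --api --port for the API mode described further down). build_command and launch are hypothetical helper names, not part of the package.

```python
import subprocess
from typing import List, Optional


def build_command(api_port: Optional[int] = None) -> List[str]:
    """Build the argv list for the fastrtc-voice-agent CLI.

    With no port, this mirrors `fastrtc-voice-agent --run`.
    With a port, it mirrors `fastrtc-voice-agent --run --api --port <port>`.
    Other options exist; see `fastrtc-voice-agent --help`.
    """
    cmd = ["fastrtc-voice-agent", "--run"]
    if api_port is not None:
        cmd += ["--api", "--port", str(api_port)]
    return cmd


def launch(api_port: Optional[int] = None) -> subprocess.Popen:
    """Start the agent as a child process (the package must be installed)."""
    return subprocess.Popen(build_command(api_port))
```

Calling launch(8000) would start the API server as a child process; keep the returned Popen object so the process can be terminated later.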

Python Usage Example

from fastrtc import ReplyOnPause, Stream
from voice_agent import create_agent, AgentConfig, STTConfig, TTSConfig, LLMConfig

# Pick one backend per stage: speech-to-text, text-to-speech, and LLM
config = AgentConfig(
    system_prompt="You are a helpful voice assistant.",
    stt=STTConfig(backend="faster_whisper", model_size="small"),
    tts=TTSConfig(backend="edge", voice="en-US-AvaMultilingualNeural"),
    llm=LLMConfig(backend="ollama", model="llama3.2:3b"),
)

agent = create_agent(config)

# Wrap the agent's handler so it replies whenever the speaker pauses
stream = Stream(
    ReplyOnPause(agent.create_fastrtc_handler()),
    modality="audio",
    mode="send-receive",
)

# Launch the built-in Gradio UI
stream.ui.launch()

Custom Frontend Integration

If you want to use your own frontend (React, Vue, etc.) instead of the built-in Gradio UI, you can run the agent as an API server.

CLI - API Mode

# Install with API support
pip install "fastrtc-voice-agent[api]"

# Run as API server (no Gradio UI)
fastrtc-voice-agent --run --api --port 8000

This exposes WebRTC endpoints:

  • POST /webrtc/offer - WebRTC signaling
  • WS /websocket/offer - WebSocket alternative
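
To make the signaling step concrete, here is a minimal sketch of a client that posts an SDP offer to POST /webrtc/offer using only the standard library. The JSON body (sdp, type, webrtc_id) follows the shape FastRTC documents for its own /webrtc/offer route. Whether this server accepts exactly that body, and whether it replies with a JSON SDP answer, are assumptions to verify against the running server. build_offer_payload and post_offer are hypothetical helper names, and the SDP string has to come from a real WebRTC peer connection (a browser or a library such as aiortc).

```python
import json
import uuid
from typing import Optional
from urllib import request


def build_offer_payload(sdp: str, webrtc_id: Optional[str] = None) -> dict:
    """Assemble the JSON body for the offer request.

    Assumption: the server expects the FastRTC-style fields
    `sdp`, `type`, and `webrtc_id`.
    """
    return {
        "sdp": sdp,
        "type": "offer",
        # A random connection id is generated when none is supplied
        "webrtc_id": webrtc_id or uuid.uuid4().hex,
    }


def post_offer(server_url: str, sdp: str) -> dict:
    """POST the offer and return the decoded JSON reply (assumed to be the SDP answer)."""
    body = json.dumps(build_offer_payload(sdp)).encode("utf-8")
    req = request.Request(
        f"{server_url.rstrip('/')}/webrtc/offer",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req) as resp:
        return json.load(resp)
```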

Python - API Server

from voice_agent import create_api_server, AgentConfig, STTConfig, TTSConfig, LLMConfig

# Create a FastAPI app with the voice agent
app = create_api_server(
    config=AgentConfig(
        system_prompt="You are a helpful assistant.",
        stt=STTConfig(backend="faster_whisper"),
        tts=TTSConfig(backend="edge"),
        llm=LLMConfig(backend="ollama"),
    )
)

# Run with: uvicorn main:app --host 0.0.0.0 --port 8000

You can also mount it in an existing FastAPI app:

from fastapi import FastAPI
from voice_agent import create_api_server

main_app = FastAPI()
voice_app = create_api_server()
main_app.mount("/voice", voice_app)
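
Because a mounted sub-application has all of its routes prefixed by the mount path, mounting at /voice moves the signaling endpoints to /voice/webrtc/offer and /voice/websocket/offer. The sketch below shows how a client might derive those URLs; endpoint_urls is a hypothetical helper, and the host and port are placeholders.

```python
from urllib.parse import urlsplit, urlunsplit


def endpoint_urls(server_url: str, mount_path: str = "") -> dict:
    """Return the HTTP and WebSocket signaling URLs for a given mount path."""
    parts = urlsplit(server_url)
    cleaned = mount_path.strip("/")
    prefix = f"/{cleaned}" if cleaned else ""
    # WebSocket URLs use ws:// for plain HTTP servers and wss:// for HTTPS ones
    ws_scheme = "wss" if parts.scheme == "https" else "ws"
    return {
        "offer": urlunsplit(
            (parts.scheme, parts.netloc, f"{prefix}/webrtc/offer", "", "")
        ),
        "websocket": urlunsplit(
            (ws_scheme, parts.netloc, f"{prefix}/websocket/offer", "", "")
        ),
    }


urls = endpoint_urls("http://localhost:8000", "/voice")
print(urls["offer"])      # http://localhost:8000/voice/webrtc/offer
print(urls["websocket"])  # ws://localhost:8000/voice/websocket/offer
```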

React Example

See the examples/react-client directory for a complete React example with a useVoiceAgent hook.

Quick example:

import { useVoiceAgent } from "./useVoiceAgent";

function App() {
  const { isConnected, connect, disconnect } = useVoiceAgent({
    serverUrl: "http://localhost:8000",
  });

  return (
    <button onClick={isConnected ? disconnect : connect}>
      {isConnected ? "Stop" : "Start"}
    </button>
  );
}

Note

To use the Anthropic API (support for OpenAI and other providers may be added later), copy .env.example to .env and fill in your API key and the desired model.
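
As a rough illustration of that setup, the sketch below parses a .env-style file and reports which required entries are still empty before the agent is started. The variable names ANTHROPIC_API_KEY and ANTHROPIC_MODEL are assumptions; use whatever names .env.example actually defines. parse_env and missing_keys are hypothetical helpers, not part of the package.

```python
def parse_env(text: str) -> dict:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value
        values[key.strip()] = value.strip().strip("'\"")
    return values


def missing_keys(env: dict, required: tuple) -> list:
    """Return the required keys that are absent or empty."""
    return [key for key in required if not env.get(key)]


# Assumed variable names; check .env.example for the real ones.
REQUIRED = ("ANTHROPIC_API_KEY", "ANTHROPIC_MODEL")

sample = """
# copied from .env.example, key not filled in yet
ANTHROPIC_API_KEY=
ANTHROPIC_MODEL="your-model-name"
"""
print(missing_keys(parse_env(sample), REQUIRED))  # ['ANTHROPIC_API_KEY']
```

In a real project, parse_env would be given the contents of the .env file, for example via pathlib.Path(".env").read_text().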
