
fastrtc-voice-agent

A modular voice agent built on FastRTC, with swappable STT/TTS/LLM backends.

Installation

Recommended: Using uv

uv is the recommended way to manage your Python environment and dependencies.

# Create a virtual environment with Python 3.12+
uv venv --python 3.12

# Activate the environment
source .venv/bin/activate

# Install the package into the active environment
uv pip install fastrtc-voice-agent

# Install with optional dependencies (e.g., ollama)
uv pip install "fastrtc-voice-agent[ollama]"

# Or install all optional dependencies
uv pip install "fastrtc-voice-agent[all]"

Using pip

pip install fastrtc-voice-agent

# Install with your desired STT/LLM backends, for example:
pip install "fastrtc-voice-agent[ollama]"

# Or for all optional dependencies:
pip install "fastrtc-voice-agent[all]"

CLI Usage Example

To run with the default configuration:

fastrtc-voice-agent --run

For custom configuration options, refer to the help:

fastrtc-voice-agent --help

Python Usage Example

from fastrtc import ReplyOnPause, Stream
from voice_agent import create_agent, AgentConfig, STTConfig, TTSConfig, LLMConfig

config = AgentConfig(
    system_prompt="You are a helpful voice assistant.",
    stt=STTConfig(backend="faster_whisper", model_size="small"),
    tts=TTSConfig(backend="edge", voice="en-US-AvaMultilingualNeural"),
    llm=LLMConfig(backend="ollama", model="llama3.2:3b"),
)

agent = create_agent(config)

stream = Stream(
    ReplyOnPause(agent.create_fastrtc_handler()),
    modality="audio",
    mode="send-receive",
)

stream.ui.launch()
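Because the backends are swappable, changing providers is a matter of editing the config objects. As an illustrative sketch only: the `anthropic` backend name and model string below are assumptions (the note further down mentions Anthropic support, but does not document the exact identifiers):

```python
from voice_agent import AgentConfig, STTConfig, TTSConfig, LLMConfig

# Hypothetical alternative configuration: same STT/TTS, different LLM.
# backend="anthropic" and the model name are illustrative, not confirmed.
config = AgentConfig(
    system_prompt="You are a helpful voice assistant.",
    stt=STTConfig(backend="faster_whisper", model_size="small"),
    tts=TTSConfig(backend="edge", voice="en-US-AvaMultilingualNeural"),
    llm=LLMConfig(backend="anthropic", model="claude-3-5-haiku-latest"),
)
```

Only the `LLMConfig` line changes; the rest of the agent setup stays identical.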

Custom Frontend Integration

If you want to use your own frontend (React, Vue, etc.) instead of the built-in Gradio UI, you can run the agent as an API server.

CLI - API Mode

# Install with API support
pip install "fastrtc-voice-agent[api]"

# Run as API server (no Gradio UI)
fastrtc-voice-agent --run --api --port 8000

This exposes WebRTC endpoints:

  • POST /webrtc/offer - WebRTC signaling
  • WS /websocket/offer - WebSocket alternative
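As a rough sketch of the signaling handshake from a custom client: the JSON field names below (`sdp`, `type`, `webrtc_id`) are assumptions based on common WebRTC signaling conventions, not confirmed by this package — check the FastRTC documentation for the actual payload shape:

```python
import json
import urllib.request


def build_offer_payload(sdp: str, webrtc_id: str) -> dict:
    """Assumed JSON body for POST /webrtc/offer (field names hypothetical)."""
    return {"sdp": sdp, "type": "offer", "webrtc_id": webrtc_id}


def send_offer(server_url: str, payload: dict) -> dict:
    """POST the SDP offer and return the server's SDP answer as a dict."""
    req = urllib.request.Request(
        f"{server_url}/webrtc/offer",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

In a real client, the SDP string would come from a WebRTC implementation (e.g. the browser's `RTCPeerConnection`), and the returned answer would be applied as the remote description.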

Python - API Server

from voice_agent import create_api_server, AgentConfig, STTConfig, TTSConfig, LLMConfig

# Create a FastAPI app with the voice agent
app = create_api_server(
    config=AgentConfig(
        system_prompt="You are a helpful assistant.",
        stt=STTConfig(backend="faster_whisper"),
        tts=TTSConfig(backend="edge"),
        llm=LLMConfig(backend="ollama"),
    )
)

# Run with: uvicorn main:app --host 0.0.0.0 --port 8000

You can also mount it in an existing FastAPI app:

from fastapi import FastAPI
from voice_agent import create_api_server

main_app = FastAPI()
voice_app = create_api_server()
# The voice agent's endpoints are now served under the /voice prefix,
# e.g. POST /voice/webrtc/offer
main_app.mount("/voice", voice_app)

React Example

See the examples/react-client directory for a complete React example with a useVoiceAgent hook.

Quick example:

import { useVoiceAgent } from "./useVoiceAgent";

function App() {
  const { isConnected, connect, disconnect } = useVoiceAgent({
    serverUrl: "http://localhost:8000",
  });

  return (
    <button onClick={isConnected ? disconnect : connect}>
      {isConnected ? "Stop" : "Start"}
    </button>
  );
}

Note

To use the Anthropic API (support for OpenAI or other providers may be added later), copy .env.example to .env and fill in your API key and the desired model.


Download files

Source Distribution

fastrtc_voice_agent-0.1.6.tar.gz (11.7 kB)

Built Distribution

fastrtc_voice_agent-0.1.6-py3-none-any.whl (17.1 kB)

File details

Details for the file fastrtc_voice_agent-0.1.6.tar.gz:

  • Size: 11.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.11

Hashes for fastrtc_voice_agent-0.1.6.tar.gz:

  Algorithm    Hash digest
  SHA256       560ea77c665c733c53ad352476169954b18e697e5d913984833ff660100e0bc7
  MD5          9c1d2a9acd28e503cdb7755605428d81
  BLAKE2b-256  0bd30a0c4964eb44a66921b9ce5c857bc465078f1eea324e4a903a2f809b1f62

File details

Details for the file fastrtc_voice_agent-0.1.6-py3-none-any.whl:

Hashes for fastrtc_voice_agent-0.1.6-py3-none-any.whl:

  Algorithm    Hash digest
  SHA256       b0d09979270b76778f4aa39ee3ba06f0e35d3d2459a0e40cc49c7200e029eb8e
  MD5          ccb32c7e679f52ed73d968fa1ad6da3a
  BLAKE2b-256  c1a531ed12dfa28820f5e068d14805fce77d89d490ce70b795d8d2e0d6bc8bac
