# voice-rag

Provider-agnostic voice RAG pipeline: plug in your voice provider, LLM, vector store, and document parsers. Ingest your docs. Answer questions by voice. Deploy in minutes.
voice-rag is a Python library and CLI for building voice-powered RAG pipelines. Point it at a folder of documents, choose your LLM and voice provider, and get an OpenAI-compatible webhook ready to wire into ElevenLabs, Deepgram, or any voice platform.
```bash
pip install "voice-rag[elevenlabs]"
export OPENAI_API_KEY=sk-...
voice-rag init && voice-rag ingest ./docs --recreate && voice-rag serve
# → serving at http://localhost:8000/v1
```
## What it does

```
your docs → Qdrant (hybrid dense + BM25) → retrieved chunks
                                                 ↓
voice platform → speech-to-text → /v1/chat/completions → LLM → TTS
```
Each turn from your voice platform hits the webhook, embeds the user utterance, retrieves the most relevant chunks, injects them into the system prompt, and streams the LLM response back as SSE — all in one pip install.
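The "inject them into the system prompt" step is the core of any RAG turn. A generic sketch of that step (illustrative only, not voice-rag's actual internals; the prompt wording and chunk numbering are assumptions):

```python
def build_messages(user_text: str, chunks: list[str]) -> list[dict]:
    # Number each retrieved chunk and fold them into a single system prompt.
    context = "\n\n".join(f"[{i + 1}] {c}" for i, c in enumerate(chunks))
    system = f"Answer using only the context below.\n\nContext:\n{context}"
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user_text},
    ]

msgs = build_messages(
    "How do I reset SSO?",
    ["SSO resets live under Settings → Security."],
)
```

The resulting list is exactly the OpenAI-style `messages` array that the webhook forwards to the configured LLM.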
## Install

```bash
# ElevenLabs voice + OpenAI LLM (most common)
pip install "voice-rag[elevenlabs]"

# All providers
pip install "voice-rag[all]"

# Pick only what you need
pip install "voice-rag[anthropic,pdf]"
```
| Extra | Adds |
|---|---|
| `elevenlabs` | ElevenLabs voice adapter |
| `deepgram` | Deepgram voice adapter |
| `anthropic` | Anthropic (Claude) LLM client |
| `gemini` | Google Gemini LLM client |
| `pdf` | PDF parser (PyMuPDF) |
| `docx` | Word document parser |
| `all` | Everything above |
## Quickstart

```bash
# 1. Create a config file
voice-rag init

# 2. Ingest your documents (supports .md, .txt, .pdf, .docx)
voice-rag ingest ./docs --recreate

# 3. Start the webhook server
voice-rag serve
```
Point your ElevenLabs agent's Custom LLM URL to `http://localhost:8000/v1`.

By default, vectors are stored locally in `.qdrant`; no separate Qdrant server is needed. Set `vector_store.url` to connect to a remote instance.
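Before wiring up a voice platform, you can smoke-test the endpoint with any OpenAI-compatible client. A minimal sketch using only the standard library (the payload shape follows the OpenAI chat-completions protocol; it assumes `voice-rag serve` is running on the default port before you actually send it):

```python
import json
import urllib.request

url = "http://localhost:8000/v1/chat/completions"
payload = {
    "model": "gpt-4o-mini",
    "stream": False,  # set True to receive SSE chunks instead
    "messages": [{"role": "user", "content": "ping"}],
}
req = urllib.request.Request(
    url,
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# With the server running, uncomment to send the request:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```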
## CLI reference

```bash
voice-rag init [--dir PATH]                   # create voice-rag.yaml
voice-rag ingest <path> [--recreate]          # ingest a file or directory
voice-rag serve [--host] [--port] [--reload]
voice-rag query <text> [--limit N]            # test retrieval without a server
voice-rag inspect                             # show collection stats
voice-rag doctor                              # check API keys and Qdrant connectivity
```
## Python API

```python
from voice_rag import KnowledgeAgent, VoiceRagConfig

config = VoiceRagConfig()  # reads from voice-rag.yaml or env vars
agent = KnowledgeAgent(config=config)
agent.ingest("./docs", recreate=True)

app = agent.create_app()  # returns a FastAPI app
# run with: uvicorn app:app --port 8000
```
## Configuration

Config is loaded from `voice-rag.yaml` (run `voice-rag init` to generate one) or environment variables. Environment variables override the YAML file.
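The override order follows the usual env-over-file pattern. A generic sketch of how such precedence resolves (illustrative only, not voice-rag's actual loader; the dict literal stands in for parsed YAML):

```python
import os

# Values as if parsed from voice-rag.yaml (illustrative):
yaml_config = {"llm": {"model": "gpt-4o-mini"}}

def resolve(key_path: str, env_var: str, default=None):
    # An environment variable wins; otherwise fall back to the YAML value.
    if env_var in os.environ:
        return os.environ[env_var]
    node = yaml_config
    for part in key_path.split("."):
        node = node.get(part, {}) if isinstance(node, dict) else {}
    return node or default

os.environ["LLM_MODEL"] = "gpt-4o"
print(resolve("llm.model", "LLM_MODEL"))  # env var overrides the YAML value
```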
| Key | Env var | Default |
|---|---|---|
| `llm.provider` | `LLM_PROVIDER` | `openai` |
| `llm.model` | `LLM_MODEL` | `gpt-4o-mini` |
| `llm.api_key` / `embedding.api_key` | `OPENAI_API_KEY` | — |
| `llm.base_url` | `LLM_BASE_URL` | `https://api.openai.com/v1` |
| `embedding.model` | `EMBEDDING_MODEL` | `text-embedding-3-small` |
| `vector_store.url` | `VECTOR_STORE_URL` | empty → local `.qdrant` |
| `vector_store.collection_name` | `VECTOR_STORE_COLLECTION_NAME` | `knowledge_base` |
| `server.port` | `SERVER_PORT` | `8000` |
| `server.enable_debug_retrieval` | `SERVER_ENABLE_DEBUG_RETRIEVAL` | `false` |
See `voice-rag.yaml` for the full annotated schema.
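The dotted keys above suggest a nested layout; a sketch of what a minimal `voice-rag.yaml` might look like (the nesting and comments are assumptions; the file generated by `voice-rag init` is authoritative):

```yaml
llm:
  provider: openai
  model: gpt-4o-mini
  base_url: https://api.openai.com/v1
embedding:
  model: text-embedding-3-small
vector_store:
  url: ""                      # empty → local .qdrant
  collection_name: knowledge_base
server:
  port: 8000
  enable_debug_retrieval: false
```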
## Providers
| Category | Supported |
|---|---|
| LLM | OpenAI, Anthropic, Gemini (any OpenAI-compatible URL via llm.base_url) |
| Voice | ElevenLabs, Deepgram |
| Embeddings | OpenAI |
| Vector store | Qdrant (local embedded or remote) |
| Parsers | .txt, .md, .pdf, .docx |
## Starter kit

Want a full working demo with a Next.js frontend and a Railway deploy button? See kytona/elevenlabs-knowledge-agent, a thin wrapper around voice-rag with an ElevenLabs voice UI.
## Development

```bash
git clone https://github.com/kytona/voice-rag
cd voice-rag
pip install -e ".[all,dev]"
pytest tests/ -v
```

See CONTRIBUTING.md for how to add new LLM, voice, embedding, or vector store connectors.
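Conceptually, a connector is a small adapter that satisfies a provider-agnostic interface. A purely illustrative sketch of what such a contract could look like (the `LLMClient` protocol, `stream_chat` method, and `EchoLLM` class are hypothetical; the real interfaces live in the voice-rag source and CONTRIBUTING.md):

```python
from typing import Iterator, Protocol

class LLMClient(Protocol):
    # Hypothetical contract: stream text deltas for an OpenAI-style message list.
    def stream_chat(self, messages: list[dict]) -> Iterator[str]:
        ...

class EchoLLM:
    # Toy connector that satisfies the illustrative contract by echoing input.
    def stream_chat(self, messages: list[dict]) -> Iterator[str]:
        yield messages[-1]["content"]

client: LLMClient = EchoLLM()
reply = "".join(client.stream_chat([{"role": "user", "content": "hi"}]))
```

Structural typing (`Protocol`) lets third-party connectors plug in without inheriting from a library base class.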