Quelvio for LlamaIndex — your company's brain as a LlamaIndex retriever.
llama-index-retrievers-quelvio
llama-index-retrievers-quelvio is the official Python integration that plugs
Quelvio's enterprise knowledge API into LlamaIndex.
It ships a first-class BaseRetriever wired to your organization's connected
sources (Google Drive, SharePoint, Confluence, Slack, Notion, and the rest of
your content fabric) and scoped to the running user's individual permissions.
Why Quelvio (and not vanilla RAG)?
A naive RAG pipeline embeds every chunk it can find and ranks by cosine similarity. That's why most internal copilots confidently quote a three-year-old draft. Quelvio is a managed company-brain that does the work a generic vector store can't:
- Authority scoring. Every chunk is ranked by who authored it, how fresh it is, and how many downstream documents reference it — not just semantic similarity to the question.
- Lifecycle awareness. Drafts, deprecated docs, and superseded decisions are demoted automatically; chunks return a lifecycle_state the LLM can quote when hedging.
- Per-employee permissioning. Every query is scoped to the running user's identity. Results never include documents the user can't already read in the source system (Drive ACLs, Confluence space restrictions, SharePoint groups).
- Synthesized answers with citations. The API returns a final answer plus the chunks that informed it, so your agent can hand the user a link to the source of truth, not a hallucination.
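To make the contrast with similarity-only ranking concrete, here is a toy sketch. It is purely illustrative: the `Chunk` fields, the 0 to 1 scales, the 50/50 blend, and the lifecycle weights are all assumptions made for this example, not Quelvio's actual scoring.

```python
from dataclasses import dataclass

# Hypothetical demotion factors; the real lifecycle handling is not documented here.
LIFECYCLE_WEIGHT = {"current": 1.0, "draft": 0.5, "deprecated": 0.3, "superseded": 0.2}


@dataclass
class Chunk:
    title: str
    similarity: float  # cosine similarity to the question, assumed 0..1
    authority: float  # assumed 0..1 blend of author, freshness, inbound references
    lifecycle_state: str


def blended_score(chunk: Chunk, authority_weight: float = 0.5) -> float:
    """Mix similarity with authority, then demote non-current chunks."""
    base = (1 - authority_weight) * chunk.similarity + authority_weight * chunk.authority
    return base * LIFECYCLE_WEIGHT.get(chunk.lifecycle_state, 1.0)


chunks = [
    Chunk("Refund policy (old draft)", similarity=0.92, authority=0.20, lifecycle_state="draft"),
    Chunk("Refund policy (approved)", similarity=0.85, authority=0.90, lifecycle_state="current"),
]

by_similarity = sorted(chunks, key=lambda c: c.similarity, reverse=True)
by_blend = sorted(chunks, key=blended_score, reverse=True)

print(by_similarity[0].title)  # the draft wins on similarity alone
print(by_blend[0].title)  # the approved document wins once authority and lifecycle count
```

The point of the sketch is only that a slightly less similar but authoritative, current document can outrank a closely matching draft.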
Install
pip install llama-index-retrievers-quelvio llama-index-core
Requires Python 3.10+ and llama-index-core>=0.11,<0.13.
Quickstart
from llama_index_retrievers_quelvio import QuelvioRetriever

retriever = QuelvioRetriever(api_key="qlv_pat_...")  # or set QUELVIO_API_KEY
nodes = retriever.retrieve("what's our refund policy?")
for n in nodes:
    # source_url is the metadata key that carries the link to the original document
    print(n.score, n.metadata.get("source_url"), n.text)
Each returned NodeWithScore carries the chunk's source_url,
authority_score, taxonomy_domain, chunk_id, and (when present) the
author's name, email, and department on the underlying
TextNode.metadata. The score is the chunk's authority score when
available, so a downstream node post-processor or SimilarityPostprocessor
can filter on it directly.
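As a rough sketch of filtering on the score, the helper below keeps only nodes at or above a threshold. `filter_by_score` is a hypothetical name, the 0 to 1 score scale is an assumption, and `SimpleNamespace` objects stand in for `NodeWithScore` so the snippet runs without a Quelvio account.

```python
from types import SimpleNamespace


def filter_by_score(nodes, min_score):
    """Keep nodes whose score is set and at least min_score, best first."""
    kept = [n for n in nodes if n.score is not None and n.score >= min_score]
    return sorted(kept, key=lambda n: n.score, reverse=True)


# Stand-ins for NodeWithScore; only the attributes used above are modelled.
nodes = [
    SimpleNamespace(score=0.91, text="Approved policy text.", metadata={"source_url": "https://example.com/a"}),
    SimpleNamespace(score=0.42, text="Older draft text.", metadata={"source_url": "https://example.com/b"}),
    SimpleNamespace(score=None, text="Unscored chunk.", metadata={}),
]

trusted = filter_by_score(nodes, min_score=0.6)
for n in trusted:
    print(n.score, n.metadata.get("source_url"), n.text)
```

Inside a query engine, `SimilarityPostprocessor(similarity_cutoff=...)` from `llama_index.core.postprocessor` applies the same kind of cutoff to `NodeWithScore.score`.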
Authentication
llama-index-retrievers-quelvio resolves a bearer token from the first
non-empty source, in order:
| Precedence | Source | Notes |
|---|---|---|
| 1 | api_key=... constructor arg | Highest priority; never persisted, never logged. |
| 2 | QUELVIO_API_KEY env var | Best for CI, notebooks, and one-off scripts. |
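The precedence above can be pictured with a small sketch. `resolve_token` is a hypothetical helper written for illustration; it mirrors the documented order but is not the library's internal code, and the error message is an assumption.

```python
import os


def resolve_token(api_key=None, env=None):
    """Return the first non-empty token: constructor arg, then QUELVIO_API_KEY."""
    env = os.environ if env is None else env
    for candidate in (api_key, env.get("QUELVIO_API_KEY")):
        if candidate:  # None and "" both fall through to the next source
            return candidate
    raise ValueError("No Quelvio token found: pass api_key=... or set QUELVIO_API_KEY")


fake_env = {"QUELVIO_API_KEY": "qlv_pat_from_env"}
print(resolve_token(api_key="qlv_pat_from_arg", env=fake_env))  # constructor arg wins
print(resolve_token(api_key="", env=fake_env))  # empty arg falls back to the env var
```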
Three token types are accepted — the wire format is identical, so the library does not need to know which kind you provided:
- Personal Access Token (PAT). Long-lived bearer tied to a human user. Generate at https://enterprise.quelvio.com/account → Personal API Keys → Create token. Best for ad-hoc use and CI.
- OAuth access token. Short-lived token from the device-code flow (quelvio login in the CLI).
- Service Account key. Long-lived, machine-scoped. Generate at Settings → Service Accounts. Best for production agents.
The token is held privately on the client; it never appears in
repr(), exception messages, or any log line emitted by this library.
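One common way to get this behaviour is to keep the token in a private attribute and override `__repr__`. The `TokenHolder` class below is a hypothetical sketch of that pattern, not the class this library ships.

```python
class TokenHolder:
    """Hypothetical holder that keeps a bearer token out of repr() and str()."""

    __slots__ = ("_token",)

    def __init__(self, token: str) -> None:
        self._token = token

    def auth_header(self) -> dict:
        # The only place the raw token is exposed: the outgoing request header.
        return {"Authorization": f"Bearer {self._token}"}

    def __repr__(self) -> str:
        # str() falls back to __repr__, so f-strings and log calls are covered too.
        return "TokenHolder(token='***')"


holder = TokenHolder("qlv_pat_example")
print(repr(holder))
print(f"client configured with {holder}")
```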
Configuration
| Constructor arg / env var | Default | Purpose |
|---|---|---|
| api_key / QUELVIO_API_KEY | (required) | Bearer token (PAT, OAuth, or Service Account). |
| base_url / QUELVIO_API_BASE | https://api.quelvio.com | API base; point at api-dev for staging. |
| timeout | 30.0 seconds | Per-request HTTP timeout. |
| max_sources | 5 | Max chunks returned per query (1–50). |
| mode | "standard" | One of fast, standard, or deep. |
| domain | None | Restrict to one taxonomy domain. |
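If you want to validate these options before constructing a retriever, for example when they come from a config file, a sketch like the following could work. `RetrieverSettings` and `settings_from_env` are hypothetical names; the defaults and the 1 to 50 range come from the table above, while the positive-timeout check and whether the library performs the same validation are assumptions.

```python
import os
from dataclasses import dataclass
from typing import Optional

VALID_MODES = ("fast", "standard", "deep")


@dataclass(frozen=True)
class RetrieverSettings:
    """Hypothetical mirror of the documented options, with their defaults."""

    base_url: str = "https://api.quelvio.com"
    timeout: float = 30.0
    max_sources: int = 5
    mode: str = "standard"
    domain: Optional[str] = None

    def __post_init__(self) -> None:
        if not 1 <= self.max_sources <= 50:
            raise ValueError("max_sources must be between 1 and 50")
        if self.mode not in VALID_MODES:
            raise ValueError(f"mode must be one of {VALID_MODES}")
        if self.timeout <= 0:
            raise ValueError("timeout must be positive")


def settings_from_env(env=None, **overrides) -> RetrieverSettings:
    """Explicit keyword arguments win; QUELVIO_API_BASE fills in base_url otherwise."""
    env = os.environ if env is None else env
    if "base_url" not in overrides and env.get("QUELVIO_API_BASE"):
        overrides["base_url"] = env["QUELVIO_API_BASE"]
    return RetrieverSettings(**overrides)


settings = settings_from_env(env={}, mode="deep", max_sources=8)
print(settings)
```

The validated values can then be passed on as keyword arguments, for example `QuelvioRetriever(mode=settings.mode, max_sources=settings.max_sources)`.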
Examples
1. Simple Q&A with citations
from llama_index_retrievers_quelvio import QuelvioRetriever

retriever = QuelvioRetriever()  # reads QUELVIO_API_KEY
nodes = retriever.retrieve("how do we handle on-call escalations?")
for n in nodes:
    # .get() keeps the loop running when a chunk lacks an optional field
    title = n.metadata.get("title", "(untitled)")
    url = n.metadata.get("source_url", "(no link)")
    authority = n.metadata.get("authority_score", "n/a")
    print(f"[authority {authority}] {title}\n {url}\n {n.text[:160]}\n")
2. RAG QueryEngine using QuelvioRetriever + LLM
# Requires the Anthropic LLM integration: pip install llama-index-llms-anthropic
from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.core.response_synthesizers import get_response_synthesizer
from llama_index.llms.anthropic import Anthropic
from llama_index_retrievers_quelvio import QuelvioRetriever

retriever = QuelvioRetriever(mode="deep", max_sources=8)
llm = Anthropic(model="claude-sonnet-4-6")
synthesizer = get_response_synthesizer(llm=llm, response_mode="compact")
query_engine = RetrieverQueryEngine(
    retriever=retriever,
    response_synthesizer=synthesizer,
)

response = query_engine.query("Summarize our Q4 OKR review decisions.")
print(response)
# List the chunks the answer was built from
for src in response.source_nodes:
    print(f" • {src.metadata.get('title', '(untitled)')} - {src.metadata.get('source_url', '(no url)')}")
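When several chunks come from the same document, the source list above repeats its link. A small helper can collapse them; `format_citations` is a hypothetical name, and `SimpleNamespace` objects stand in for the entries of `response.source_nodes` so the sketch runs on its own.

```python
from types import SimpleNamespace


def format_citations(source_nodes):
    """Return one numbered line per distinct source_url, in first-seen order."""
    seen = set()
    lines = []
    for node in source_nodes:
        url = node.metadata.get("source_url")
        if not url or url in seen:
            continue  # skip chunks without a link and repeat chunks from one document
        seen.add(url)
        title = node.metadata.get("title", "(untitled)")
        lines.append(f"[{len(lines) + 1}] {title} - {url}")
    return lines


# Stand-ins for NodeWithScore objects; only .metadata is used.
sources = [
    SimpleNamespace(metadata={"title": "Q4 OKR review", "source_url": "https://example.com/okr"}),
    SimpleNamespace(metadata={"title": "Q4 OKR review", "source_url": "https://example.com/okr"}),
    SimpleNamespace(metadata={"title": "Planning notes", "source_url": "https://example.com/plan"}),
]

for line in format_citations(sources):
    print(line)
```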
3. Multi-step LlamaIndex agent using QuelvioRetriever as a knowledge tool
import asyncio

from llama_index.core.agent.workflow import FunctionAgent
from llama_index.core.tools import RetrieverTool, ToolMetadata
from llama_index.llms.anthropic import Anthropic
from llama_index_retrievers_quelvio import QuelvioRetriever

retriever = QuelvioRetriever(mode="standard", max_sources=6)

# Expose the retriever to the agent as a named tool
quelvio_tool = RetrieverTool(
    retriever=retriever,
    metadata=ToolMetadata(
        name="quelvio_query",
        description=(
            "Look up factual information from THIS company's internal knowledge "
            "base — policies, decisions, on-call runbooks, OKRs, customer "
            "playbooks. Use whenever the user asks about internal company info."
        ),
    ),
)

agent = FunctionAgent(
    tools=[quelvio_tool],
    llm=Anthropic(model="claude-sonnet-4-6"),
    system_prompt=(
        "You are an internal assistant. Use the quelvio_query tool whenever "
        "the user asks about anything company-specific. Always cite source URLs."
    ),
)


async def main() -> None:
    # In a notebook you can await agent.run(...) directly; a script needs an event loop
    print(await agent.run("What's our parental leave policy?"))


asyncio.run(main())
Authority and lifecycle awareness
Quelvio ranks every chunk by who authored it, how fresh it is, and how
many downstream documents reference it, not only by semantic similarity
to the question. Drafts, deprecated docs, and superseded decisions are
demoted automatically, and each chunk returns a lifecycle_state the LLM
can quote when hedging. Every query is scoped to the running user's
identity, so results never include documents the user can't already read
in the source system (Drive ACLs, Confluence space restrictions,
SharePoint groups).
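One way to let the LLM hedge is to place a caveat next to each non-current chunk before it reaches the prompt. The sketch below assumes that lifecycle_state is exposed in node metadata and that its values include "draft", "deprecated", and "superseded"; both are assumptions drawn from the wording above, and `hedge_for` and `render_context` are hypothetical helpers.

```python
from types import SimpleNamespace

# Assumed state names, taken from the prose above (drafts, deprecated, superseded).
HEDGES = {
    "draft": "This comes from a draft and may change.",
    "deprecated": "This document is deprecated; verify before relying on it.",
    "superseded": "A newer decision has superseded this one.",
}


def hedge_for(node):
    """Return a caveat for non-current chunks, or None when no caveat applies."""
    state = node.metadata.get("lifecycle_state")
    return HEDGES.get(state)


def render_context(nodes):
    """Build prompt context that puts each caveat next to the chunk it applies to."""
    blocks = []
    for node in nodes:
        caveat = hedge_for(node)
        prefix = f"[Caveat: {caveat}]\n" if caveat else ""
        blocks.append(prefix + node.text)
    return "\n\n".join(blocks)


# Stand-ins for retrieved nodes; only .text and .metadata are used.
nodes = [
    SimpleNamespace(text="Example policy text from a draft.", metadata={"lifecycle_state": "draft"}),
    SimpleNamespace(text="Example policy text from a current document.", metadata={}),
]

context = render_context(nodes)
print(context)
```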
Related packages
- quelvio-langchain: the LangChain Python integration (sibling package).
- @quelvio/langchain: the LangChain.js integration.
- @quelvio/cli: query the brain from your terminal, scriptable in CI, JSON output.
- @quelvio/mcp-server: use Quelvio from any Model Context Protocol client (Claude Desktop, Cursor, VS Code, etc.).
- Quelvio docs: concepts, API reference, source connectors.
Development
git clone https://github.com/Quelvio/quelvio-llama-index
cd quelvio-llama-index
python -m venv .venv && source .venv/bin/activate
pip install -e ".[dev]"
pytest
Linting and type-checking:
ruff check src tests
ruff format --check src tests
mypy src
Contributing
Issues and pull requests welcome at
https://github.com/Quelvio/quelvio-llama-index. Please run ruff check,
ruff format, mypy, and pytest before opening a PR.
License
MIT — see LICENSE.