Evidence-first, multilingual, S3-native RAG for domains where a wrong answer is worse than no answer.
CiteNexus
Multilingual RAG that answers only when the evidence is strong.
Evidence-first, multilingual, S3-native RAG for domains where a wrong answer is worse than no answer (legal, medical, finance/compliance, enterprise search). CiteNexus answers only from retrieved evidence — every claim is grounded in a bbox-cited source passage, and it refuses or states uncertainty when evidence is weak, missing, or conflicting. The guarantee is "no ungrounded claim," not "zero hallucination."
The library bundles no models — embedding, LLM, reranker, and vision are injected endpoints. CiteNexus owns orchestration, storage, retrieval, fusion, grounding, and evaluation.
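The cite-or-abstain behaviour described above can be illustrated with a small, self-contained sketch. It is not the library's implementation: the `Evidence` and `Answer` shapes, the `answer_or_abstain` helper, and the `min_score` threshold are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    """A retrieved passage plus its citation (hypothetical shape)."""
    text: str
    source: str                               # e.g. "policy.pdf"
    page: int
    bbox: tuple[float, float, float, float]   # x0, y0, x1, y1 on the page
    score: float                              # assumed retrieval confidence in [0, 1]


@dataclass(frozen=True)
class Answer:
    text: str
    sources: tuple[Evidence, ...]
    abstained: bool


def answer_or_abstain(draft: str, evidence: list[Evidence],
                      min_score: float = 0.5) -> Answer:
    """Return the draft only if at least one passage clears the threshold."""
    strong = tuple(e for e in evidence if e.score >= min_score)
    if not strong:
        # Weak or missing evidence: refuse rather than guess.
        return Answer("Not enough evidence to answer.", (), abstained=True)
    return Answer(draft, strong, abstained=False)


weak = [Evidence("unrelated text", "policy.pdf", 1, (0.0, 0.0, 1.0, 1.0), 0.2)]
print(answer_or_abstain("Yes, with consent.", weak).abstained)
```

A real gate would check each claim against its passage rather than a single score; the point here is only that the refusal path is a normal return value, not an error.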
CiteNexus supports pluggable vector databases. Storage is two protocols —
VectorStore (dense) and TextSearch (lexical) — and each backend is a named
(vector, text) pair:
| Backend | Vector | Text | When |
|---|---|---|---|
| Lance (recommended) | LanceVectorStore | LanceTextSearch (BM25-lite) | Zero infra, S3-native: point at a bucket and go |
| Postgres | PostgresVectorStore (pgvector) | PostgresTextSearch (native tsvector) | You already run Postgres: pip install 'citenexus[postgres]', set vector_store.backend: "postgres" |
| Yours | implement VectorStore | implement TextSearch | Qdrant, Weaviate, Elasticsearch, Tantivy, … |
The seams are independent: mix LanceDB vectors with an Elasticsearch
text_search=, or let one Postgres serve both.
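To make the "Yours" row concrete, here is a minimal sketch of what a custom (vector, text) pair might look like. The method names `add` and `search`, the `VectorStoreLike` / `TextSearchLike` protocol classes, and both in-memory implementations are assumptions for illustration; check the real VectorStore and TextSearch protocols for the actual signatures.

```python
import math
from typing import Protocol, Sequence


class VectorStoreLike(Protocol):
    # Hypothetical stand-in for the library's VectorStore protocol.
    def add(self, doc_id: str, vector: Sequence[float]) -> None: ...
    def search(self, query: Sequence[float], k: int) -> list[tuple[str, float]]: ...


class TextSearchLike(Protocol):
    # Hypothetical stand-in for the library's TextSearch protocol.
    def add(self, doc_id: str, text: str) -> None: ...
    def search(self, query: str, k: int) -> list[tuple[str, float]]: ...


class InMemoryVectorStore:
    """Dense side: brute-force cosine similarity over stored vectors."""

    def __init__(self) -> None:
        self._rows: dict[str, list[float]] = {}

    def add(self, doc_id: str, vector: Sequence[float]) -> None:
        self._rows[doc_id] = list(vector)

    def search(self, query: Sequence[float], k: int) -> list[tuple[str, float]]:
        def cosine(a: Sequence[float], b: Sequence[float]) -> float:
            dot = sum(x * y for x, y in zip(a, b))
            norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
            return dot / norm if norm else 0.0

        scored = [(doc_id, cosine(query, vec)) for doc_id, vec in self._rows.items()]
        return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


class InMemoryTextSearch:
    """Lexical side: plain term-count scoring (not BM25)."""

    def __init__(self) -> None:
        self._rows: dict[str, list[str]] = {}

    def add(self, doc_id: str, text: str) -> None:
        self._rows[doc_id] = text.lower().split()

    def search(self, query: str, k: int) -> list[tuple[str, float]]:
        terms = query.lower().split()
        scored = [(doc_id, float(sum(tokens.count(t) for t in terms)))
                  for doc_id, tokens in self._rows.items()]
        hits = [pair for pair in scored if pair[1] > 0]
        return sorted(hits, key=lambda pair: pair[1], reverse=True)[:k]


# The two seams are independent, so either class could be swapped on its own.
vectors: VectorStoreLike = InMemoryVectorStore()
lexical: TextSearchLike = InMemoryTextSearch()
```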
from citenexus import CiteNexus, S3
rag = CiteNexus(
    S3(bucket="my-bucket"),
    embedder=my_embedding_endpoint,
    generator=my_llm_endpoint,
)
rag.ingest("policy.pdf") # any supported input type
answer = rag.ask("Can the employee disclose this information?")
print(answer) # grounded answer; .sources are bbox-cited
The client scales to exactly what you give it — every model is optional, and every rung below is additive (nothing above it changes):
rag = CiteNexus("./data")  # a folder…
# …or real S3/MinIO/R2: ONE object carries endpoint + credential
# env-var names for BOTH stores
rag = CiteNexus(S3(bucket="docs",
                   endpoint_url="https://<r2>.cloudflarestorage.com"))
# ZERO models — already FOUR retrieval signals, fused with RRF:
# text (BM25) · structure (heading tree) · graph (co-mention) · wiki (page nav)
rag.ingest("handbook.pdf")
rag.retrieve("termination notice") # works immediately, cited rows
rag = CiteNexus("./data", embedder=e) # + vector signal (5-way hybrid RRF)
rag = CiteNexus("./data", ..., generator=g) # + ask()/stream()/evaluate() — cite-or-abstain
rag = CiteNexus("./data", ..., reranker=r) # + cross-encoder ordering of the fused pool
rag = CiteNexus("./data", ..., wiki_distiller=w) # wiki pages become LLM-distilled,
# cross-linked concept pages (+ Markdown tree in S3)
rag = CiteNexus("./data", ..., contextualizer=c) # + Anthropic-style contextual chunk prefixes
rag = CiteNexus("./data", ..., reformulator=q) # + EN dual-query RRF (cross-lingual recall)
rag = CiteNexus("./data", ..., vision=v) # + images in PDFs/docs become described, cited evidence
rag = CiteNexus("./data", ..., detector=d) # + real lid.176 language detection
rag = CiteNexus("./data", ..., sink=s, hooks=h) # + telemetry (tokens/cost) + lifecycle hooks
rag = CiteNexus("./data", ..., vector_store=pg, text_search=es) # + bring your own stores
rag.ask("...", conversation_id="c1") # conversation memory — built in, no param
Or declare it all in one typed config: CiteNexus.from_config(cfg) builds only
what the config enables. ask() without a generator raises a clear error
pointing at retrieve() — search-only deployments are first-class, not a crash.
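The ladder above leans on Reciprocal Rank Fusion (RRF) to combine whichever signals are available. As a rough sketch of the general technique (not CiteNexus's internal code), each signal contributes `1 / (k + rank)` per result and the sums are sorted. The function name `rrf_fuse` and the constant `k=60` are assumptions; 60 is the value commonly used for RRF, not a confirmed library default.

```python
from collections import defaultdict


def rrf_fuse(rankings: dict[str, list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse per-signal ranked ID lists into one list, best first."""
    scores: dict[str, float] = defaultdict(float)
    for ranked_ids in rankings.values():
        for rank, doc_id in enumerate(ranked_ids, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by score descending, then by ID so ties are deterministic.
    return sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))


fused = rrf_fuse({
    "text": ["c1", "c2", "c3"],
    "structure": ["c2", "c1"],
    "graph": ["c2", "c4"],
    "wiki": ["c3", "c2"],
})
print([doc_id for doc_id, _ in fused])
```

Because a missing signal simply contributes nothing, adding a fifth list (for example a vector ranking) changes the scores without changing the function, which is what makes each rung additive.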
Capability status (honest):
| Capability | Status |
|---|---|
| text (BM25) · structure · graph · wiki · vector · RRF fusion | ✅ shipped, zero-model tier included |
| ask/stream/evaluate with per-claim faithfulness gate | ✅ shipped (generator required) |
| LLM wiki distillation (concept pages, [[links]], S3 Markdown tree, lint) | ✅ shipped (wiki_distiller=) |
| Contextual chunking · dual-query RRF · vision-into-evidence · hooks · telemetry · web crawl · Postgres backend | ✅ shipped |
| LLM graph extraction (entity/relation model behind the graph signal) | ⏳ not yet — graph is deterministic co-mention; the GraphExtractorPlugin seam exists, no LLM impl |
| Leiden community clustering | ⏳ not yet (community signal rides the graph retriever) |
| True BGE-M3 sparse lexical | ⏳ BM25-lite stands in (needs a sparse-capable endpoint) |
| Image bytes from real PDFs for vision | ⏳ extractors don't persist rasters yet (vision path proven with injected bytes) |
| LLM-as-judge · MCP server | ⏳ later (config sections reserved) |
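For readers unfamiliar with the "deterministic co-mention" graph mentioned in the table, the idea can be sketched as follows: two entities get an edge whenever they appear in the same chunk, weighted by how often that happens. This is an illustration of the general technique only; the function names and the assumption that entities arrive as a pre-extracted set per chunk are hypothetical, and the table does not describe how CiteNexus detects entities.

```python
from collections import Counter
from itertools import combinations


def co_mention_edges(chunks: dict[str, set[str]]) -> Counter:
    """Count how often each pair of entities shares a chunk."""
    edges: Counter = Counter()
    for entities in chunks.values():
        # Sorting gives each undirected pair one canonical (a, b) key.
        for a, b in combinations(sorted(entities), 2):
            edges[(a, b)] += 1
    return edges


def neighbours(edges: Counter, entity: str) -> list[tuple[str, int]]:
    """Entities co-mentioned with `entity`, strongest link first."""
    linked = []
    for (a, b), weight in edges.items():
        if a == entity:
            linked.append((b, weight))
        elif b == entity:
            linked.append((a, weight))
    return sorted(linked, key=lambda pair: (-pair[1], pair[0]))


edges = co_mention_edges({
    "chunk-1": {"Acme", "NDA"},
    "chunk-2": {"Acme", "NDA", "HR"},
})
print(neighbours(edges, "Acme"))
```

No model is involved at any step, which is consistent with the graph signal being available in the zero-model tier.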
Or wire real OpenAI-compatible endpoints from typed config — one call builds the embedding / answering-LLM / reranker plugins (answers stay temperature-0):
import os
from citenexus import CiteNexus, GeminiHttpEndpoint, OpenAIHttpEndpoint
from citenexus.config.schema import EmbeddingConfig, LLMConfig, StorageConfig, CiteNexusConfig
# YOUR app reads its environment — the library never touches env vars.
jina = OpenAIHttpEndpoint(base_url="https://api.jina.ai/v1",
                          api_key=os.environ["JINA_API_KEY"])
gemini = GeminiHttpEndpoint(api_key=os.environ["GEMINI_API_KEY"])  # SecretStr: repr/log-safe
config = CiteNexusConfig(
    storage=StorageConfig(bucket="./data"),  # or "s3://bucket"
    embedding=EmbeddingConfig(endpoint=jina, model="jina-embeddings-v3"),
    llm=LLMConfig(endpoint=gemini, model="gemini-2.5-flash"),
    # the SAME endpoint objects can serve context_model / reformulation /
    # wiki_distill / graph_distill: declare a connection once, reuse it.
)
rag = CiteNexus.from_config(config)
Endpoints carry everything connection-shaped: key, custom headers, timeout,
pre/post hooks, auth style (AnthropicHttpEndpoint → Messages API automatically;
HttpEndpoint(auth_header="api-key", auth_scheme=None) for Azure-style).
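To show what an auth style amounts to on the wire, the sketch below turns an `auth_header` / `auth_scheme` pair into request headers. The two parameter names come from the text above; the helper `build_auth_headers`, its defaults, and the `extra` argument are assumptions for illustration and are not part of the library.

```python
from typing import Optional


def build_auth_headers(api_key: str,
                       auth_header: str = "Authorization",
                       auth_scheme: Optional[str] = "Bearer",
                       extra: Optional[dict[str, str]] = None) -> dict[str, str]:
    """Build HTTP headers for one endpoint (hypothetical helper)."""
    # With a scheme the value is "<scheme> <key>"; without one it is the bare key.
    value = f"{auth_scheme} {api_key}" if auth_scheme else api_key
    headers = dict(extra or {})   # custom headers first, auth header wins on clash
    headers[auth_header] = value
    return headers


# OpenAI-style bearer token (assumed default here).
print(build_auth_headers("sk-example"))
# Azure-style: the key goes in an "api-key" header with no scheme prefix.
print(build_auth_headers("sk-example", auth_header="api-key", auth_scheme=None))
```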
Status
Early development, built layer-by-layer (foundation-first) and spec-driven via
OpenSpec. L0-L6 core retrieval/answering is implemented: the public client
exposes ingest(), retrieve(), ask(), stream(), memory recall, and
evaluate(csv), with graph and wiki navigation resolving back to citable EUs.
MCP and external auth enforcement are still later work. See CLAUDE.md
for the build plan and conventions, and docs/SPEC-v6.md for
the full specification.
Develop
task setup # uv sync
task check # lint + typecheck + unit tests (the CI gate)
task test # hermetic unit suite (fakes only)
task local:example # end-to-end demo: ingest → ask → evaluate (hosted stack, no infra)
Unit tests are hermetic (fakes only) and need nothing running.
The example (example/)
task local:example runs ingest → ask → evaluate over a tiny multilingual
corpus using a cheap, hosted, no-infra stack:
- Storage: LocalFs (a folder). Point CITENEXUS_S3_ENDPOINT_URL at MinIO or Cloudflare R2 to exercise the real S3 path.
- Embedding + reranker: Jina (/v1/embeddings + /rerank, one key).
- Answering LLM: Gemini's OpenAI-compatible endpoint (temperature 0).
Secrets live in a vsync vault
(infra/vault/dev/.env.dev, encrypted on S3), referenced in code by env-var
name only. task local:example loads it via dotenv. Copy
example/.env.example if you'd rather use a plain file.
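Since secrets are referenced by env-var name only, application code usually wants to fail early with a readable message when one is missing. A minimal sketch, using only the standard library; `require_env` is a hypothetical helper, not something the example is known to ship.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or fail with a clear hint."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(
            f"{name} is not set; load your .env file or export it before running"
        )
    return value


# Usage (variable names taken from the config example earlier in this README):
# jina_key = require_env("JINA_API_KEY")
# gemini_key = require_env("GEMINI_API_KEY")
```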
Heavier all-local paths stay opt-in: task local:minio:up (real S3 backend),
task local:models:up (infinity — bge-m3 embed + bge-reranker on one port), and
task local:ollama:up (a local answering LLM). See compose.yaml.
License
Apache-2.0