A modular, production-ready knowledge engine platform with clean architecture and multi-paradigm support (RAG, CLaRa).
fitz-ai
Intelligent, honest RAG in 5 minutes. No infrastructure. No boilerplate.
pip install fitz-ai
fitz quickstart ./docs "What is our refund policy?"
That's it. Your documents are now searchable with AI.
Python SDK → Full SDK Reference
import fitz_ai
fitz_ai.ingest("./docs")
answer = fitz_ai.query("What is our refund policy?")
REST API → Full API Reference
pip install fitz-ai[api]
fitz serve # http://localhost:8000/docs for interactive API
About
Solo project by Yan Fitzner (LinkedIn, GitHub).
- ~75k lines of Python
- 1200+ tests, 100% coverage
- Zero LangChain/LlamaIndex dependencies: built from scratch
What is RAG?
RAG (retrieval-augmented generation) is how ChatGPT's "file search," Notion AI, and enterprise knowledge tools actually work under the hood. Instead of sending all your documents to an AI, RAG:
- Indexes your documents once: splits them into chunks, converts them to vectors, and stores them in a database
- Retrieves only what's relevant: when you ask a question, finds the 5-10 most relevant chunks
- Sends just those chunks to the LLM: the AI answers based on focused, relevant context
Traditional approach:
[All 10,000 documents] → LLM → Answer
❌ Impossible (too large)
❌ Expensive (if possible)
❌ Unfocused
RAG approach:
Question → [Search index] → [5 relevant chunks] → LLM → Answer
✅ Works at any scale
✅ Costs pennies per query
✅ Focused context = better answers
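The three steps above can be sketched in a few lines of plain Python. This is a toy illustration, not Fitz's implementation: the bag-of-words `embed` function stands in for a real embedding model, the in-memory list stands in for a vector database, and every function name here is hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it on non-alphanumeric characters."""
    return re.findall(r"[a-z0-9]+", text.lower())


def embed(text: str) -> Counter:
    """Toy 'embedding': a term-count vector (assumption: stands in for a real model)."""
    return Counter(tokenize(text))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(chunks: list[str]) -> list[tuple[str, Counter]]:
    """Step 1: index once by storing each chunk next to its vector."""
    return [(chunk, embed(chunk)) for chunk in chunks]


def retrieve(index: list[tuple[str, Counter]], question: str, k: int = 2) -> list[str]:
    """Step 2: return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


def build_prompt(question: str, context: list[str]) -> str:
    """Step 3: build an LLM prompt that contains only the retrieved chunks."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {question}"


chunks = [
    "Refund policy: customers may request a refund within 30 days of purchase.",
    "Hiring guidelines: every candidate completes two technical interviews.",
    "Meeting notes: the roadmap review moved to Thursday.",
]
index = build_index(chunks)
top = retrieve(index, "What is our refund policy?", k=1)
print(build_prompt("What is our refund policy?", top))
```

Only the refund chunk reaches the prompt; the hiring guidelines and meeting notes are never sent to the model.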
Why Can't I Just Send My Documents to ChatGPT Directly?
You can, but you'll hit walls fast.
Context window limits
GPT-4 accepts ~128k tokens. That's roughly 300 pages. Your company wiki, codebase, or document archive is likely 10x-100x larger. You physically cannot paste it all.
Cost explosion
Even if you could fit everything, you'd pay for every token on every query. Sending 100k tokens costs ~$1-3 per question. Ask 50 questions a day? That's $50-150 daily, for one user.
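The arithmetic behind those figures can be checked directly. The prices below ($10 and $30 per million input tokens) are assumptions chosen to reproduce the ~$1-3 range quoted above; real provider pricing varies and changes often. The 500-token chunk size in the RAG comparison is likewise an assumption.

```python
def cost_per_query(tokens: int, usd_per_million_tokens: float) -> float:
    """Input-token cost of a single query, in US dollars."""
    return tokens * usd_per_million_tokens / 1_000_000


def daily_cost(tokens: int, usd_per_million_tokens: float, queries_per_day: int) -> float:
    """Cost of sending the same number of tokens on every query for one day."""
    return cost_per_query(tokens, usd_per_million_tokens) * queries_per_day


# Assumed price range, in USD per million input tokens.
LOW_PRICE, HIGH_PRICE = 10, 30

# Pasting everything: 100k tokens on every question, 50 questions a day.
paste_low = daily_cost(100_000, LOW_PRICE, 50)
paste_high = daily_cost(100_000, HIGH_PRICE, 50)

# RAG: 5 retrieved chunks of roughly 500 tokens each (assumed chunk size).
rag_high = daily_cost(5 * 500, HIGH_PRICE, 50)

print(f"paste everything: ${paste_low:.2f} to ${paste_high:.2f} per day")
print(f"RAG (5 chunks):   ${rag_high:.2f} per day at the higher price")
```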
No selective retrieval
When you paste documents, the model reads everything equally. It can't focus on what's relevant. Ask about refund policies and it's also processing your hiring guidelines, engineering specs, and meeting notes, wasting context and degrading answers.
No persistence
Every conversation starts fresh. You re-upload, re-paste, re-explain. There's no knowledge base that accumulates and improves.
Why Fitz?
Super fast setup
Point at a folder. Ask a question. Get an answer with sources. Everything else is handled by Fitz.
Honest answers
Most RAG tools confidently answer even when the answer isn't in your documents. Ask "What was our Q4 revenue?" when your docs only cover Q1-Q3, and typical RAG hallucinates a number. Fitz says: "I cannot find Q4 revenue figures in the provided documents."
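One simple way to get this behavior is to refuse to call the LLM when retrieval finds nothing sufficiently relevant. The sketch below shows that idea only; it is not how Fitz implements its constraints, and `Retrieved`, `answer_or_abstain`, the `generate` callback, and the 0.5 threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Retrieved:
    text: str
    score: float  # assumed similarity score in [0, 1]


ABSTAIN_MESSAGE = "I cannot find that information in the provided documents."


def answer_or_abstain(
    question: str,
    retrieved: List[Retrieved],
    generate: Callable[[str, List[str]], str],
    min_score: float = 0.5,  # hypothetical threshold; tune per embedding model
) -> str:
    """Call the generator only when at least one chunk clears the threshold."""
    evidence = [r for r in retrieved if r.score >= min_score]
    if not evidence:
        return ABSTAIN_MESSAGE  # no supporting evidence: say so instead of guessing
    return generate(question, [r.text for r in evidence])


def fake_llm(question: str, context: List[str]) -> str:
    """Stand-in for a real LLM call so the example runs offline."""
    return f"Based on {len(context)} passage(s): {context[0]}"


weak = [Retrieved("Q3 revenue was 4.1M.", 0.31)]
strong = [Retrieved("Q3 revenue was 4.1M.", 0.88)]
print(answer_or_abstain("What was our Q4 revenue?", weak, fake_llm))
print(answer_or_abstain("What was our Q3 revenue?", strong, fake_llm))
```

A score threshold alone is a blunt instrument: a chunk can score highly and still not contain the requested fact, which is why a production system would also need evidence checks after retrieval.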
Swap engines, keep everything else
RAG is evolving fast: HyDE, ColBERT, agentic RAG, whatever's next. Fitz lets you switch engines in one line. Your ingested data stays. Your queries stay. No migration, no re-ingestion, no new API to learn. Frameworks lock you in; Fitz lets you move.
Queries that actually work
Standard RAG fails silently on real queries. Fitz has built-in intelligence: hierarchical summaries for "What are the trends?", exact keyword matching for "Find TC-1001", multi-query decomposition for complex questions, AST-aware chunking for code, and SQL execution for tabular data. No configuration needed; it just works.
Other Features at a Glance
- [x] Local execution possible. FAISS and Ollama support, no API keys required to start.
- [x] Plugin-based architecture. Swap LLMs, vector databases, rerankers, and retrieval pipelines via YAML config.
- [x] Extensible engine system. FitzRAG built-in, with a clean registry for adding custom engines.
- [x] Incremental ingestion. Only reprocesses changed files, even with new chunking settings.
- [x] Full provenance. Every answer traces back to the exact chunk and document.
- [x] Data privacy: No telemetry, no cloud, no external calls except to the LLM provider you configure.
[!TIP] Any questions left? Try fitz on itself:
fitz quickstart ./fitz_ai "How does the chunking pipeline work?"
The codebase speaks for itself.
Retrieval Intelligence
Most RAG implementations are naive vector search, and they fail silently on real-world queries. Fitz has built-in intelligence that handles edge cases automatically:
| Feature | Query | Naive RAG Problem | FitzRAG Solution |
|---|---|---|---|
| epistemic-honesty | "What was our Q4 revenue?" | ❌ Hallucinated number: Info doesn't exist, but LLM won't admit it | ✅ "I don't know" |
| keyword-vocabulary | "Find TC_1000" | ❌ Wrong test case: Embeddings see TC_1000 ≈ TC_2000 (semantically similar) | ✅ Exact keyword matching |
| hybrid-search | "X100 battery specs" | ❌ Returns Y200 docs: Semantic search misses exact model numbers | ✅ Hybrid search (dense + sparse) |
| multi-hop | "Who wrote the paper cited by the 2023 review?" | ❌ Returns the review only: Single-step search can't traverse references | ✅ Iterative retrieval |
| hierarchical-rag | "What are the design principles?" | ❌ Random fragments: Answer is spread across docs; no single chunk contains it | ✅ Hierarchical summaries |
| tabular-data-routing | "What's the timeout for CAN?" (table) | ❌ Fragmented rows: Tables chunked arbitrarily, structure lost | ✅ SQL on structured data |
| multi-query | [User pastes 500-char test report] "What failed and why?" | ❌ Vaguely related chunks: Long input → averaged embedding → matches nothing specifically | ✅ Multi-query decomposition |
| comparison-queries | "Compare React vs Vue performance" | ❌ Incomplete comparison: Only retrieves one entity, missing the other | ✅ Multi-entity retrieval |
| temporal-queries | "What changed between Q1 and Q2?" | ❌ Random chunks: No awareness of time periods in query | ✅ Temporal query handling |
| aggregation-queries | "List all the test cases that failed" | ❌ Partial list: No mechanism for comprehensive retrieval | ✅ Aggregation query handling |
| freshness-authority | "What does the official spec say?" | ❌ Returns notes: Can't distinguish authoritative vs informal sources | ✅ Freshness/authority boosting |
| query-expansion | "How do I fetch the db config?" | ❌ No matches: User says "fetch", docs say "retrieve"; "db" vs "database" | ✅ Query expansion |
| query-rewriting | "Tell me more about it" (after discussing TechCorp) | ❌ Lost context: Pronouns like "it" reference nothing, retrieval fails | ✅ Conversational context resolution |
| hyde | "What's TechCorp's approach to sustainability?" | ❌ Poor recall: Abstract queries don't embed close to concrete documents | ✅ Hypothetical document generation |
| code-aware-chunking | "How does the auth module work?" (code) | ❌ Broken code fragments: Naive chunking splits functions mid-body | ✅ Complete functions |
| contextual-embeddings | "When does it expire?" | ❌ Ambiguous chunk: "It expires in 24h" embedded without context; "it" = ? | ✅ Summary-prefixed embeddings |
[!TIP] These features are always on; no configuration needed. Fitz automatically detects when to use each capability.
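To make the keyword-vocabulary row concrete, here is a minimal sketch of exact identifier matching layered on top of whatever semantic search returns. It illustrates the general idea only; Fitz's keyword vocabulary is not described in enough detail here to reproduce, and the regular expression and function names are assumptions.

```python
import re
from typing import List, Set

# Assumed identifier shape: uppercase letters, optional "-" or "_", then digits
# (matches TC_1000, TC-1001, X100). A real vocabulary would be learned from the corpus.
ID_PATTERN = re.compile(r"\b[A-Z]+[-_]?\d+\b")


def extract_identifiers(text: str) -> Set[str]:
    """Return every identifier-looking token in the text."""
    return set(ID_PATTERN.findall(text))


def filter_by_identifiers(query: str, chunks: List[str]) -> List[str]:
    """Keep only chunks that contain an identifier named in the query.

    If the query names no identifier, all chunks pass through unchanged.
    """
    wanted = extract_identifiers(query)
    if not wanted:
        return list(chunks)
    return [c for c in chunks if wanted & extract_identifiers(c)]


candidates = [
    "TC_1000: login fails with an expired token",
    "TC_2000: logout clears the session",
]
print(filter_by_identifiers("Find TC_1000", candidates))
```

Embeddings place `TC_1000` and `TC_2000` close together, so a string-level check like this is what separates them.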
Fitz vs LangChain vs LlamaIndex
Fitz opts for a deliberately narrower approach.
LangChain and LlamaIndex are powerful LLM application frameworks designed to help developers build complex, end-to-end AI systems. Fitz provides a minimal, replaceable RAG engine with strong epistemic guarantees, without locking users into a framework, ecosystem, or long-term architectural commitment.
Fitz is not a competitor in scope.
It is an infrastructure primitive.
Core philosophical differences
| Dimension | Fitz | LangChain | LlamaIndex |
|---|---|---|---|
| Primary role | RAG engine | LLM application framework | LLM data framework |
| User commitment | No framework lock-in | High | High |
| Engine coupling | Swappable in one line | Deep | Deep |
| Design goal | Correctness & honesty | Flexibility | Data integration |
| Long-term risk | Low | Migration-heavy | Migration-heavy |
Epistemic behavior (truth over fluency)
| Aspect | Fitz | LangChain / LlamaIndex |
|---|---|---|
| "I don't know" | First-class behavior | Not guaranteed |
| Hallucination handling | Designed-in | Usually prompt-level |
| Confidence signaling | Explicit | Implicit |
Fitz treats uncertainty as a feature, not a failure.
If the system cannot support an answer with retrieved evidence, it says so.
Transparency & provenance
| Capability | Fitz | LangChain / LlamaIndex |
|---|---|---|
| Source attribution | Mandatory | Optional |
| Retrieval trace | Explicit & structured | Often opaque |
| Debuggability | Built-in | Tool-dependent |
Every answer in Fitz is fully auditable down to the retrieval step.
Scope & complexity
| Aspect | Fitz | LangChain / LlamaIndex |
|---|---|---|
| Chains / agents | ❌ | ✅ |
| Prompt graphs | ❌ | ✅ |
| UI abstractions | ❌ | Often |
| Cognitive overhead | Very low | High |
Fitz intentionally does less, so it can be trusted more.
Use Fitz if you want:
- A replaceable RAG engine, not a framework marriage
- Strong epistemic guarantees ("I don't know" is valid output)
- Full provenance for every answer
- A transparent, extensible plugin architecture
- A future-proof ingestion pipeline that survives engine changes
Features
Swappable RAG Engines
Your data stays. Your queries stay. Only the engine changes.
┌─────────────────────────────────────┐
│             Your Query              │
│    "What are the payment terms?"    │
└──────────────────┬──────────────────┘
                   │
                   ▼
┌─────────────────────────────────────┐
│            engine="..."             │
│  ┌─────────┐  ┌─────────┐           │
│  │  fitz   │  │ custom  │  ...      │
│  │  _rag   │  │ engine  │           │
│  └────┬────┘  └────┬────┘           │
│       └────────────┘                │
└──────────────────┬──────────────────┘
                   │
                   ▼
┌─────────────────────────────────────┐
│       Your Ingested Knowledge       │
│     (unchanged across engines)      │
└─────────────────────────────────────┘
answer = run("What are the payment terms?", engine="fitz_rag")
answer = run("What are the payment terms?", engine="custom")  # your engine
No migration. No re-ingestion. No new API to learn.
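The pattern behind an `engine="..."` switch is usually a name-to-implementation registry. The sketch below shows that pattern in isolation; it is not Fitz's registry, and `register_engine`, `run_with_engine`, the two toy engines, and the `KNOWLEDGE` list are hypothetical names invented for the example.

```python
from typing import Callable, Dict, List

# An engine takes a question plus the shared knowledge and returns an answer.
Engine = Callable[[str, List[str]], str]

ENGINES: Dict[str, Engine] = {}

# Stand-in for ingested data; it is shared by every engine and never re-ingested.
KNOWLEDGE: List[str] = ["Payment terms: invoices are due within 30 days."]


def register_engine(name: str) -> Callable[[Engine], Engine]:
    """Decorator that adds an engine to the registry under the given name."""
    def decorator(fn: Engine) -> Engine:
        ENGINES[name] = fn
        return fn
    return decorator


@register_engine("first_match")
def first_match(question: str, knowledge: List[str]) -> str:
    """Toy engine: return the first chunk sharing a word with the question."""
    words = set(question.lower().split())
    for chunk in knowledge:
        if words & set(chunk.lower().split()):
            return chunk
    return "No supporting passage found."


@register_engine("shout")
def shout(question: str, knowledge: List[str]) -> str:
    """Second toy engine, to show that only the engine name changes."""
    return first_match(question, knowledge).upper()


def run_with_engine(question: str, engine: str = "first_match") -> str:
    """Dispatch the same question and the same knowledge to the chosen engine."""
    if engine not in ENGINES:
        raise ValueError(f"unknown engine {engine!r}; available: {sorted(ENGINES)}")
    return ENGINES[engine](question, KNOWLEDGE)


print(run_with_engine("What are the payment terms?"))
print(run_with_engine("What are the payment terms?", engine="shout"))
```

Because both engines read from the same `KNOWLEDGE` object, switching engines changes only the dispatch key, which is the property the diagram describes.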
Full Provenance
Every answer traces back to its source:
Answer: The refund policy allows returns within 30 days...
Sources:
[1] policies/refund.md [chunk 3] (score: 0.92)
[2] faq/payments.md [chunk 1] (score: 0.87)
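A provenance record of this shape is easy to model as plain data. The following sketch renders the same layout from a list of sources; the `Source` dataclass and `format_answer` function are hypothetical and are not the objects the Fitz SDK returns.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Source:
    path: str     # document the chunk came from
    chunk: int    # chunk index within that document
    score: float  # retrieval score


def format_answer(text: str, sources: List[Source]) -> str:
    """Render an answer followed by its sources, highest score first."""
    lines = [f"Answer: {text}", "Sources:"]
    ranked = sorted(sources, key=lambda src: src.score, reverse=True)
    for number, src in enumerate(ranked, start=1):
        lines.append(f"[{number}] {src.path} [chunk {src.chunk}] (score: {src.score:.2f})")
    return "\n".join(lines)


report = format_answer(
    "The refund policy allows returns within 30 days...",
    [Source("faq/payments.md", 1, 0.87), Source("policies/refund.md", 3, 0.92)],
)
print(report)
```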
Incremental Ingestion → Ingestion Guide
Fitz tracks file hashes and only re-ingests what changed:
$ fitz ingest ./src
Scanning... 847 files
→ 12 new files
→ 3 modified files
→ 832 unchanged (skipped)
Ingesting 15 files...
Re-running ingestion on a large codebase takes seconds, not minutes. Changed your chunking config? Fitz detects that too and re-processes affected files.
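Hash-based change detection of this kind can be sketched with the standard library. The code below is an illustration under stated assumptions, not Fitz's ingestion code: the JSON state file, the `config_fingerprint` string, and the function names are hypothetical, and a config change here simply invalidates every file rather than only the affected ones.

```python
import hashlib
import json
from pathlib import Path
from typing import Dict, List


def file_hash(path: Path) -> str:
    """SHA-256 of the file contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def plan_ingestion(root: Path, state_file: Path, config_fingerprint: str) -> Dict[str, List[str]]:
    """Classify files under root as new, modified, or unchanged since the last run.

    state_file must live outside root, otherwise it would be scanned as a document.
    """
    previous = json.loads(state_file.read_text()) if state_file.exists() else {}
    # If the chunking config changed, forget old hashes so everything is re-processed.
    same_config = previous.get("config") == config_fingerprint
    old_files = previous.get("files", {}) if same_config else {}

    current = {
        p.relative_to(root).as_posix(): file_hash(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }
    plan: Dict[str, List[str]] = {"new": [], "modified": [], "unchanged": []}
    for name, digest in current.items():
        if name not in old_files:
            plan["new"].append(name)
        elif old_files[name] != digest:
            plan["modified"].append(name)
        else:
            plan["unchanged"].append(name)

    # Persist the new snapshot for the next run.
    state_file.write_text(json.dumps({"config": config_fingerprint, "files": current}))
    return plan
```

Only the `new` and `modified` lists need to be chunked and embedded again; `unchanged` files are skipped, which is where the time savings come from.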
Plugin Generator → Plugin Development Guide
Generate plugins with AI
Fitz can generate fully working plugins from natural language descriptions. Describe what you want, and fitz creates, validates, and saves the plugin automatically.
fitz plugin
? Plugin type: chunker
? Description: sentence-based chunker that splits on periods
Generating...
✓ Syntax valid
✓ Schema valid
✓ Plugin loads correctly
✓ Functional test passed
Created: ~/.fitz/plugins/chunking/sentence_chunker.py
The generated plugin is immediately usable; no manual editing required.
Supported plugin types
| Type | Format | Description |
|---|---|---|
| llm-chat | YAML | Connect to a chat LLM provider |
| llm-embedding | YAML | Connect to an embedding provider |
| llm-rerank | YAML | Connect to a reranking provider |
| vector-db | YAML | Connect to a vector database |
| retrieval | YAML | Define a retrieval strategy |
| chunker | Python | Custom document chunking logic |
| reader | Python | Custom file format reader |
| constraint | Python | Epistemic safety guardrail |
How it works
- Prompt building: Fitz loads existing plugin examples and schema definitions
- Generation: Your configured LLM generates the plugin code
- Multi-level validation: Syntax → Schema → Integration → Functional tests
- Auto-retry: If validation fails, fitz feeds the error back and retries (up to 3 attempts)
- Save: Working plugins are saved to ~/.fitz/plugins/
Generated plugins are auto-discovered by fitz on the next run; no registration needed.
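The generate, validate, and retry steps above follow a common loop: feed the previous error back to the generator until validation passes or attempts run out. The sketch below shows only the syntax level of validation using Python's `ast` module; the schema, integration, and functional levels are omitted, and `generate_with_retry`, `validate_syntax`, and the `generate` callback are hypothetical names rather than Fitz internals.

```python
import ast
from typing import Callable, Optional


def validate_syntax(code: str) -> Optional[str]:
    """Return an error message if the code does not parse, otherwise None."""
    try:
        ast.parse(code)
    except SyntaxError as exc:
        return f"SyntaxError: {exc.msg} (line {exc.lineno})"
    return None


def generate_with_retry(
    generate: Callable[[str, Optional[str]], str],
    description: str,
    max_attempts: int = 3,
) -> str:
    """Call generate(description, last_error) until the output validates."""
    error: Optional[str] = None
    for _ in range(max_attempts):
        code = generate(description, error)  # the previous error is fed back in
        error = validate_syntax(code)
        if error is None:
            return code
    raise RuntimeError(f"no valid plugin after {max_attempts} attempts: {error}")


def flaky_generator(description: str, last_error: Optional[str]) -> str:
    """Stand-in for an LLM: broken on the first call, fixed once it sees an error."""
    if last_error is None:
        return "def chunk(text:\n    return text.split('.')"
    return "def chunk(text):\n    return text.split('.')"


plugin_code = generate_with_retry(flaky_generator, "sentence-based chunker")
print(plugin_code)
```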
Example: Custom chunker
fitz plugin
? Plugin type: chunker
? Description: splits text by paragraphs, keeping code blocks intact
# Creates ~/.fitz/plugins/chunking/paragraph_chunker.py
# Generated plugin is immediately usable
fitz ingest ./docs --chunker paragraph_chunker
Quick Start
CLI
pip install fitz-ai
fitz quickstart ./docs "Your question here"
Fitz auto-detects your LLM provider:
- Ollama running? → Uses it automatically (fully local)
- COHERE_API_KEY or OPENAI_API_KEY set? → Uses it automatically
- First time? → Guides you through free Cohere signup (2 minutes)
After first run, it's completely zero-friction.
Python SDK
import fitz_ai
fitz_ai.ingest("./docs")
answer = fitz_ai.query("Your question here")
print(answer.text)
for source in answer.provenance:
    print(f" - {source.source_id}: {source.excerpt[:50]}...")
The SDK provides:
- Module-level functions matching CLI (ingest, query)
- Auto-config creation (no setup required)
- Full provenance tracking
- Same honest RAG as the CLI
For advanced use (multiple collections), use the fitz class directly:
from fitz_ai import fitz
physics = fitz(collection="physics")
physics.ingest("./physics_papers")
answer = physics.query("Explain entanglement")
Fully Local (Ollama)
pip install fitz-ai[local]
ollama pull llama3.2
ollama pull nomic-embed-text
fitz quickstart ./docs "Your question here"
Fitz auto-detects Ollama when running. No API keys needed, and no data leaves your machine.
Real-World Usage
Fitz is a foundation. It handles document ingestion and grounded retrieval; you build whatever sits on top: chatbots, dashboards, alerts, or automation.
Chatbot Backend
Connect fitz to Slack, Discord, Teams, or your own UI. One function call returns an answer with sources: no hallucinations, full provenance. You handle the conversation flow; fitz handles the knowledge.
Example: A SaaS company plugs fitz into their support bot. Tier-1 questions like "How do I reset my password?" get instant answers. Their support team focuses on edge cases while fitz deflects 60% of incoming tickets.
Internal Knowledge Base
Point fitz at your company's wiki, policies, and runbooks. Employees ask natural language questions instead of hunting through folders or pinging colleagues on Slack.
Example: A 200-person startup ingests their Notion workspace and compliance docs. New hires find answers to "How do I request PTO?" on day one, with no more waiting for someone in HR to respond.
Continuous Intelligence & Alerting (Watchdog)
Pair fitz with cron, Airflow, or Lambda. Ingest data on a schedule, run queries automatically, trigger alerts when conditions match. Fitz provides the retrieval primitive; you wire the automation.
Example: A security team ingests SIEM logs nightly. Every morning, a scheduled job asks "Were there failed logins from unusual locations?" If fitz finds evidence, an alert fires to the on-call channel before anyone checks email.
Web Knowledge Base
Scrape the web with Scrapy, BeautifulSoup, or Playwright. Save to disk, ingest with fitz. The web becomes a queryable knowledge base.
Example: A football analytics hobbyist scrapes Premier League match reports. After ingesting, they ask "How did Arsenal perform against top 6 teams?" or "What tactics did Liverpool use in away games?" These are insights that would take hours to compile manually.
Codebase Search
Fitz includes built-in AST-aware chunking for codebases. Functions, classes, and modules become individual searchable units with docstrings and imports preserved. Ask questions in natural language; get answers pointing to specific code.
Example: A team inherits a legacy Django monolith: 200k lines, sparse docs. They ingest the codebase and ask "Where is user authentication handled?" or "What API endpoints modify the billing table?" New developers onboard in days instead of weeks.
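For Python sources, AST-aware chunking can be approximated with the standard `ast` module: parse the file and emit one chunk per top-level function or class, so that no definition is split mid-body. This is a minimal sketch of the technique, not Fitz's chunker; `chunk_python_source` and the chunk dictionary layout are hypothetical, and decorators and import context are not captured here.

```python
import ast
from typing import Dict, List


def chunk_python_source(source: str) -> List[Dict[str, object]]:
    """Return one chunk per top-level function or class definition."""
    tree = ast.parse(source)
    chunks: List[Dict[str, object]] = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            chunks.append({
                "name": node.name,
                "kind": type(node).__name__,
                "start_line": node.lineno,
                "end_line": node.end_lineno,
                "docstring": ast.get_docstring(node),
                # Exact source text of the whole definition, body included.
                "text": ast.get_source_segment(source, node),
            })
    return chunks


SAMPLE = '''\
import hashlib

def hash_password(password: str) -> str:
    """Return a SHA-256 hex digest of the password."""
    return hashlib.sha256(password.encode()).hexdigest()

class Session:
    def is_valid(self) -> bool:
        return True
'''

for chunk in chunk_python_source(SAMPLE):
    print(chunk["kind"], chunk["name"], chunk["start_line"], chunk["end_line"])
```

A fixed-size character splitter could cut `hash_password` between its signature and its return statement; splitting on syntax-tree nodes avoids that by construction.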
Architecture → Full Architecture Guide
┌───────────────────────────────────────────────────────────────┐
│                            fitz-ai                            │
├───────────────────────────────────────────────────────────────┤
│  User Interfaces                                              │
│  CLI: quickstart | init | ingest | query | chat | serve       │
│  SDK: fitz_ai.fitz() → ingest() → ask()                       │
│  API: /query | /chat | /ingest | /collections | /health       │
├───────────────────────────────────────────────────────────────┤
│  Engines                                                      │
│  ┌───────────┐  ┌────────────┐                                │
│  │  FitzRAG  │  │ Custom...  │  (extensible registry)         │
│  └───────────┘  └────────────┘                                │
├───────────────────────────────────────────────────────────────┤
│  Plugin System (all YAML-defined)                             │
│  ┌────────┐  ┌───────────┐  ┌────────┐  ┌──────────┐          │
│  │  Chat  │  │ Embedding │  │ Rerank │  │ VectorDB │          │
│  └────────┘  └───────────┘  └────────┘  └──────────┘          │
│  openai, cohere, anthropic, ollama, azure...                  │
├───────────────────────────────────────────────────────────────┤
│  Retrieval Pipelines (plugin choice controls features)        │
│  dense (no rerank) | dense_rerank (with rerank)               │
├───────────────────────────────────────────────────────────────┤
│  Enrichment (baked in via ChunkEnricher)                      │
│  summaries | keywords | entities | hierarchical summaries     │
├───────────────────────────────────────────────────────────────┤
│  Constraints (epistemic safety)                               │
│  ConflictAware | InsufficientEvidence | CausalAttribution     │
└───────────────────────────────────────────────────────────────┘
CLI Reference → Full CLI Guide
fitz quickstart [PATH] [QUESTION] # Zero-config RAG (start here)
fitz init # Interactive setup wizard
fitz ingest # Interactive ingestion
fitz query # Single question with sources
fitz chat # Multi-turn conversation with your knowledge base
fitz collections # List and delete knowledge collections
fitz keywords # Manage keyword vocabulary for exact matching
fitz plugin # Generate plugins with AI
fitz serve # Start REST API server
fitz config # View/edit configuration
fitz doctor # System diagnostics
Python SDK Reference → Full SDK Guide
Simple usage (module-level, matches CLI):
import fitz_ai
fitz_ai.ingest("./docs")
answer = fitz_ai.query("What is the refund policy?")
print(answer.text)
Advanced usage (multiple collections):
from fitz_ai import fitz
# Create separate instances for different collections
physics = fitz(collection="physics")
physics.ingest("./physics_papers")
legal = fitz(collection="legal")
legal.ingest("./contracts")
# Query each collection
physics_answer = physics.query("Explain entanglement")
legal_answer = legal.query("What are the payment terms?")
Working with answers:
answer = fitz_ai.query("What is the refund policy?")
print(answer.text)
print(answer.mode) # CONFIDENT, QUALIFIED, DISPUTED, or ABSTAIN
for source in answer.provenance:
    print(f"Source: {source.source_id}")
    print(f"Excerpt: {source.excerpt}")
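Downstream code will usually branch on `answer.mode`. The sketch below shows one way to do that against a stand-in object, so it runs without fitz installed. The four mode names come from the comment above; whether `mode` is a string or an enum, and what QUALIFIED and DISPUTED mean precisely, are assumptions here, and `render` is a hypothetical helper.

```python
from types import SimpleNamespace


def render(answer) -> str:
    """Turn an answer object into display text, depending on its mode."""
    # Accept either an Enum member (use .name) or a plain string.
    mode = getattr(answer.mode, "name", answer.mode)
    if mode == "ABSTAIN":
        return "No supported answer was found in the ingested documents."
    # Assumed meanings: QUALIFIED = answered with caveats, DISPUTED = sources conflict.
    prefix = {"QUALIFIED": "[qualified] ", "DISPUTED": "[sources disagree] "}.get(mode, "")
    cites = ", ".join(src.source_id for src in answer.provenance)
    return f"{prefix}{answer.text} (sources: {cites})"


# Stand-in for the object returned by fitz_ai.query(...).
demo = SimpleNamespace(
    text="Refunds are accepted within 30 days.",
    mode="CONFIDENT",
    provenance=[SimpleNamespace(source_id="policies/refund.md", excerpt="...")],
)
print(render(demo))
```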
REST API Reference → Full API Guide
Start the server:
pip install fitz-ai[api]
fitz serve # localhost:8000
fitz serve -p 3000 # custom port
fitz serve --host 0.0.0.0 # all interfaces
Interactive docs: Visit http://localhost:8000/docs for Swagger UI.
Endpoints:
| Method | Endpoint | Description |
|---|---|---|
| POST | /query | Query knowledge base |
| POST | /chat | Multi-turn chat (stateless) |
| POST | /ingest | Ingest documents from path |
| GET | /collections | List all collections |
| GET | /collections/{name} | Get collection stats |
| DELETE | /collections/{name} | Delete a collection |
| GET | /health | Health check |
Example requests:
# Query
curl -X POST http://localhost:8000/query \
-H "Content-Type: application/json" \
-d '{"question": "What is the refund policy?", "collection": "default"}'
# Ingest
curl -X POST http://localhost:8000/ingest \
-H "Content-Type: application/json" \
-d '{"source": "./docs", "collection": "mydata"}'
# Chat (stateless - client manages history)
curl -X POST http://localhost:8000/chat \
-H "Content-Type: application/json" \
-d '{
"message": "What about returns?",
"history": [
{"role": "user", "content": "What is the refund policy?"},
{"role": "assistant", "content": "The refund policy allows..."}
],
"collection": "default"
}'
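The same requests can be issued from Python with only the standard library. The request bodies below mirror the curl examples above; the response schema is not documented in this section, so `send` simply returns whatever JSON the server replies with. The helper names are hypothetical, and the server is assumed to be running at the default address.

```python
import json
import urllib.request
from typing import Any, Dict, List

BASE_URL = "http://localhost:8000"  # default address used by `fitz serve`


def _post_json(path: str, payload: Dict[str, Any], base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request."""
    return urllib.request.Request(
        f"{base_url}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_query_request(question: str, collection: str = "default") -> urllib.request.Request:
    """Request for POST /query, matching the curl example."""
    return _post_json("/query", {"question": question, "collection": collection})


def build_chat_request(
    message: str, history: List[Dict[str, str]], collection: str = "default"
) -> urllib.request.Request:
    """Request for POST /chat; the endpoint is stateless, so the client sends the history."""
    return _post_json("/chat", {"message": message, "history": history, "collection": collection})


def send(request: urllib.request.Request, timeout: float = 30.0) -> Any:
    """Send the request and decode the JSON response (requires a running server)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


query_request = build_query_request("What is the refund policy?")
print(query_request.get_method(), query_request.full_url)
```

Because `/chat` is stateless, the caller is responsible for appending each user and assistant turn to `history` before the next call.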
License
MIT
Links
Documentation: