A modular, production-ready knowledge engine platform with clean architecture and multi-paradigm support (RAG, CLaRa).

fitz-ai

Intelligent, honest RAG in 5 minutes. No infrastructure. No boilerplate.

Quick Start • Installation • Documentation • GitHub

pip install fitz-ai

fitz quickstart ./docs "What is our refund policy?"

That's it. Your documents are now searchable with AI.

Python SDK → Full SDK Reference

import fitz_ai

fitz_ai.ingest("./docs")
answer = fitz_ai.query("What is our refund policy?")

REST API → Full API Reference

pip install fitz-ai[api]

fitz serve  # http://localhost:8000/docs for interactive API

About 🧑‍🌾

Solo project by Yan Fitzner (LinkedIn, GitHub).

~50k lines of Python
1500+ tests, 99% coverage
Zero LangChain/LlamaIndex dependencies — built from scratch

📦 What is RAG?

RAG (retrieval-augmented generation) is how ChatGPT's "file search," Notion AI, and enterprise knowledge tools work under the hood. Instead of sending all your documents to an AI, RAG:

1. Indexes your documents once: splits them into chunks, converts the chunks to vectors, and stores them in a database.
2. Retrieves only what's relevant: when you ask a question, it finds the 5-10 most relevant chunks.
3. Sends just those chunks to the LLM: the AI answers based on focused, relevant context.

Traditional approach:

  [All 10,000 documents] → LLM → Answer
  ❌ Impossible (too large)
  ❌ Expensive (if possible)
  ❌ Unfocused

RAG approach:

  Question → [Search index] → [5 relevant chunks] → LLM → Answer
  ✅ Works at any scale
  ✅ Costs pennies per query
  ✅ Focused context = better answers
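
The index, retrieve, and prompt steps can be sketched in plain Python. This is a toy illustration, not Fitz's implementation: it assumes word-count vectors in place of learned embeddings and an in-memory list in place of a database, and the helper names (`embed`, `build_index`, `retrieve`, `build_prompt`) are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_index(chunks: list[str]) -> list[tuple[str, Counter]]:
    # Step 1: index once (each chunk is stored next to its vector).
    return [(chunk, embed(chunk)) for chunk in chunks]


def retrieve(index: list[tuple[str, Counter]], question: str, k: int = 2) -> list[str]:
    # Step 2: rank chunks by similarity to the question and keep the top k.
    query_vector = embed(question)
    ranked = sorted(index, key=lambda item: cosine(query_vector, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


def build_prompt(question: str, context: list[str]) -> str:
    # Step 3: only the retrieved chunks are sent to the LLM.
    joined = "\n".join(f"- {chunk}" for chunk in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {question}"


chunks = [
    "Our refund policy allows returns within 30 days.",
    "New hires receive a laptop on their first day.",
    "API rate limits reset every minute.",
]
index = build_index(chunks)
top_chunks = retrieve(index, "What is the refund policy?", k=1)
prompt = build_prompt("What is the refund policy?", top_chunks)
print(prompt)
```

A real system would swap `embed` for an embedding model and the list for a vector store; the control flow stays the same.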

📦 Why Can't I Just Send My Documents to ChatGPT Directly?

You can—but you'll hit walls fast.

Context window limits 🚨

GPT-4 Turbo accepts ~128k tokens, roughly 300 pages. Your company wiki, codebase, or document archive is likely 10x-100x larger, so you cannot paste it all.

Cost explosion 💥

Even if you could fit everything, you'd pay for every token on every query. Sending 100k tokens costs ~$1-3 per question. Ask 50 questions a day? That's $50-150 daily—for one user.
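
The arithmetic behind those figures, as a small sketch. The per-million-token prices are simply the range implied by "$1-3 per 100k tokens", not a quoted price list, and the 5,000-token RAG query size is a hypothetical value chosen for comparison.

```python
def daily_cost(tokens_per_query: int, usd_per_million_tokens: float, queries_per_day: int) -> float:
    """Input-token cost per day for one user, in USD."""
    return tokens_per_query / 1_000_000 * usd_per_million_tokens * queries_per_day


# "$1-3 per 100k tokens" corresponds to $10-30 per million tokens.
LOW_PRICE, HIGH_PRICE = 10.0, 30.0

# Pasting ~100k tokens into every one of 50 daily questions.
paste_low = daily_cost(100_000, LOW_PRICE, 50)
paste_high = daily_cost(100_000, HIGH_PRICE, 50)

# Hypothetical RAG query: 5,000 tokens of retrieved context (assumed size).
rag_low = daily_cost(5_000, LOW_PRICE, 50)
rag_high = daily_cost(5_000, HIGH_PRICE, 50)

print(f"Paste everything: ${paste_low:.2f}-${paste_high:.2f} per day")
print(f"RAG (assumed 5k tokens): ${rag_low:.2f}-${rag_high:.2f} per day")
```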

No selective retrieval ❌

When you paste documents, the model reads everything equally. It can't focus on what's relevant. Ask about refund policies and it's also processing your hiring guidelines, engineering specs, and meeting notes—wasting context and degrading answers.

No persistence 💢

Every conversation starts fresh. You re-upload, re-paste, re-explain. There's no knowledge base that accumulates and improves.

Why Fitz?

Super fast setup 🐆

Point at a folder. Ask a question. Get an answer with sources. Even for tables! Everything else is handled by Fitz.

Honest answers ✅ → Governance Benchmark

Most RAG tools confidently answer even when the answer isn't in your documents. Ask "What was our Q4 revenue?" when your docs only cover Q1-Q3, and typical RAG hallucinates a number. Fitz says: "I cannot find Q4 revenue figures in the provided documents."

→ Fitz detects disputes at 89.7% recall on fitz-gov, a 1,100+ case benchmark for epistemic honesty.

Queries that actually work 📊

Standard RAG fails silently on real queries. Fitz has built-in intelligence: hierarchical summaries for "What are the trends?", exact keyword matching for "Find TC_1000", multi-query decomposition for complex questions, AST-aware chunking for code, and SQL execution for tabular data. No configuration needed: it just works.

Tabular data that is actually searchable 📈 → Unified Storage

CSV and table data is a nightmare in most RAG systems—chunked arbitrarily, structure lost, queries fail. Fitz stores tables natively in PostgreSQL alongside your vectors—same database, no sync issues. Auto-detects schema and runs real SQL. Ask "What's the average price by region?" and get an actual computed answer, not fragmented rows.
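
To make "runs real SQL" concrete, here is a minimal sketch of answering "What's the average price by region?" with an aggregate query instead of chunked rows. It uses Python's built-in SQLite as a stand-in for PostgreSQL so that it runs without a server; the table name, columns, and data are made up, and Fitz's own schema detection and query generation are not shown in the source.

```python
import csv
import io
import sqlite3

# Hypothetical CSV content; in practice this would be an ingested file.
CSV_TEXT = """region,product,price
EU,widget,10.0
EU,gadget,20.0
US,widget,12.0
US,gadget,20.0
"""

rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))

# Store the table with its structure intact instead of splitting it into text chunks.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, product TEXT, price REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [(r["region"], r["product"], float(r["price"])) for r in rows],
)

# The question becomes an aggregate query over all rows.
averages = conn.execute(
    "SELECT region, AVG(price) FROM sales GROUP BY region ORDER BY region"
).fetchall()
print(averages)
```

The computed result covers every row in the table, which a top-k chunk retrieval cannot guarantee.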

Other Features at a Glance 🃏

[x] Fully local execution possible. Embedded PostgreSQL + Ollama, no API keys required to start.

[x] Plugin-based architecture. Swap LLMs, rerankers, and retrieval pipelines via YAML config.

[x] Extensible engine system. FitzRAG built-in, with a clean registry for adding custom engines.

[x] Incremental ingestion. Only reprocesses changed files, even with new chunking settings.

[x] Full provenance. Every answer traces back to the exact chunk and document.

[x] Data privacy. No telemetry, no cloud, no external calls except to the LLM provider you configure.

[x] Enterprise gateway support. OAuth2 M2M, custom CA certs, mTLS, and corporate proxy/gateway integration.

[!TIP] Any questions left? Try fitz on itself:
fitz quickstart ./fitz_ai "How does the chunking pipeline work?"
The codebase speaks for itself.

Retrieval Intelligence

Most RAG implementations are naive vector search—they fail silently on real-world queries. Fitz has built-in intelligence that handles edge cases automatically:

Feature	Query	Naive RAG Problem	FitzRAG Solution
epistemic-honesty	"What was our Q4 revenue?"	❌ Hallucinated number — Info doesn't exist, but LLM won't admit it	✅ "I don't know"
governance-benchmarking	[Benchmark: fitz-gov]	❌ No measurement — Retrieval benchmarks don't test epistemic honesty	✅ 89.7% dispute detection, 81.2% abstain (ML classifier, 1100+ cases)
keyword-vocabulary	"Find TC_1000"	❌ Wrong test case — Embeddings see TC_1000 ≈ TC_2000 (semantically similar)	✅ Exact keyword matching
hybrid-search	"X100 battery specs"	❌ Returns Y200 docs — Semantic search misses exact model numbers	✅ Hybrid search (dense + sparse)
sparse-search	"error code E_AUTH_401"	❌ No exact match — Embeddings miss precise error codes	✅ PostgreSQL full-text search
multi-hop	"Who wrote the paper cited by the 2023 review?"	❌ Returns the review only — Single-step search can't traverse references	✅ Iterative retrieval
hierarchical-rag	"What are the design principles?"	❌ Random fragments — Answer is spread across docs; no single chunk contains it	✅ Hierarchical summaries
tabular-data-routing	"What's the timeout for CAN?" (table)	❌ Fragmented rows — Tables chunked arbitrarily, structure lost	✅ SQL on structured data
multi-query	[User pastes 500-char test report] "What failed and why?"	❌ Vaguely related chunks — Long input → averaged embedding → matches nothing specifically	✅ Multi-query decomposition
comparison-queries	"Compare React vs Vue performance"	❌ Incomplete comparison — Only retrieves one entity, missing the other	✅ Multi-entity retrieval
entity-graph	"What else mentions AuthService?"	❌ Isolated chunks — No awareness of shared entities across docs	✅ Entity-based chunk linking
temporal-queries	"What changed between Q1 and Q2?"	❌ Random chunks — No awareness of time periods in query	✅ Temporal query handling
aggregation-queries	"List all the test cases that failed"	❌ Partial list — No mechanism for comprehensive retrieval	✅ Aggregation query handling
freshness-authority	"What does the official spec say?"	❌ Returns notes — Can't distinguish authoritative vs informal sources	✅ Freshness/authority boosting
query-expansion	"How do I fetch the db config?"	❌ No matches — User says "fetch", docs say "retrieve"; "db" vs "database"	✅ Query expansion
query-rewriting	"Tell me more about it" (after discussing TechCorp)	❌ Lost context — Pronouns like "it" reference nothing, retrieval fails	✅ Conversational context resolution
hyde	"What's TechCorp's approach to sustainability?"	❌ Poor recall — Abstract queries don't embed close to concrete documents	✅ Hypothetical document generation
code-aware-chunking	"How does the auth module work?" (code)	❌ Broken code fragments — Naive chunking splits functions mid-body	✅ Complete functions
contextual-embeddings	"When does it expire?"	❌ Ambiguous chunk — "It expires in 24h" embedded without context; "it" = ?	✅ Summary-prefixed embeddings
reranking	"What's the battery warranty?"	❌ Imprecise ranking — Vector similarity ≠ true relevance; best answer buried	✅ Cross-encoder precision

[!IMPORTANT] These features are always on—no configuration needed. Fitz automatically detects when to use each capability.
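
As an illustration of the hybrid-search row in the table above, the sketch below merges a dense (vector) ranking with a sparse (keyword) ranking using reciprocal rank fusion. RRF is one common fusion method; the source does not say which method Fitz uses, so treat the function and the document IDs as assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists; documents ranked high in any list score well."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


# Hypothetical results for the query "X100 battery specs".
dense = ["y200_specs", "x100_specs", "x100_manual"]  # semantic search confuses the models
sparse = ["x100_specs", "x100_manual"]               # keyword search matches "X100" exactly

fused = reciprocal_rank_fusion([dense, sparse])
print(fused)
```

Documents that appear in both rankings accumulate score from each, so the exact-match X100 documents end up ahead of the semantically similar Y200 document.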

Governance — Know What You Don't Know

Feature docs • Benchmark results • Classifier experiments

Most RAG systems hallucinate confidently. Fitz measures and enforces epistemic honesty using a two-stage ML classifier trained on 1,100+ labeled cases from fitz-gov, a benchmark for epistemic honesty.

  Query + Retrieved Chunks
            │
            ▼
  ┌─────────────────────┐
  │ 5 Constraints       │     Contradiction detection, evidence sufficiency,
  │ (epistemic sensors) │     causal attribution, answer verification, specific info type
  └──────────┬──────────┘
             │ 51 features extracted
             ▼
  ┌─────────────────────┐
  │ Stage 1: RF         │     Can the evidence answer this query?
  │ Answerability       ├───► NO ──► ABSTAIN
  └──────────┬──────────┘
             │ YES
             ▼
  ┌─────────────────────┐     Do the sources conflict?
  │ Stage 2: ET         ├───► YES ──► DISPUTED
  │ Conflict Detection  │
  └──────────┬──────────┘
             │                Consistent evidence found
             └──────────────► NO ──► TRUSTWORTHY

Decision	Meaning	Recall
ABSTAIN	Evidence doesn't answer the question	81.2%
DISPUTED	Sources contradict each other	89.7%
TRUSTWORTHY	Consistent, sufficient evidence	70.6%
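
The decision flow above can be sketched as a two-stage cascade. This is an illustrative sketch only: the real stages are trained classifiers (labeled RF and ET in the diagram) over 51 features, while here each stage is a plain callable returning a probability, and the class name, field names, and threshold values are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

Features = Sequence[float]


@dataclass
class GovernanceCascade:
    """Two-stage decision: answerability first, then conflict detection."""

    is_answerable: Callable[[Features], float]  # Stage 1: P(evidence answers the query)
    has_conflict: Callable[[Features], float]   # Stage 2: P(sources conflict)
    answerable_threshold: float = 0.5           # assumed value
    conflict_threshold: float = 0.5             # assumed value

    def decide(self, features: Features) -> str:
        if self.is_answerable(features) < self.answerable_threshold:
            return "ABSTAIN"
        if self.has_conflict(features) >= self.conflict_threshold:
            return "DISPUTED"
        return "TRUSTWORTHY"


# Stub models: feature 0 plays the answerability score, feature 1 the conflict score.
cascade = GovernanceCascade(is_answerable=lambda f: f[0], has_conflict=lambda f: f[1])

print(cascade.decide([0.2, 0.9]))  # weak evidence, so stage 2 is never consulted
print(cascade.decide([0.9, 0.8]))
print(cascade.decide([0.9, 0.1]))

# A lower conflict threshold flags disputes more readily.
cautious = GovernanceCascade(
    is_answerable=lambda f: f[0], has_conflict=lambda f: f[1], conflict_threshold=0.3
)
print(cautious.decide([0.9, 0.4]))
```

Lowering `conflict_threshold` trades some false alarms for fewer over-confident answers.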

[!NOTE] Governance asks "given three relevant documents that partially contradict each other, should you flag a dispute, hedge the answer, or trust the consensus?" That's a judgment call even humans disagree on. 92% of our test cases are rated "hard."

The system fails safe 🛡️

The safety-first threshold is tuned so that when the classifier is wrong, it over-hedges ("disputed" instead of "trustworthy") — annoying but harmless. Over-confidence ("trustworthy" instead of "disputed") is the rarest error mode: only 3 cases in 1,100+.

These scores are a floor, not a ceiling 👣

All benchmarks were measured using qwen2.5:3b — a 3B parameter local model. The governance constraints run on the fast-tier LLM to keep latency low. Stronger models produce better constraint signals, which feed better features into the classifier. Upgrading your chat provider should improve governance accuracy for free.

Zero extra latency ⏱️

The constraints already run as part of the pipeline. The ML classifier just replaces hand-coded rules with a local sklearn model — inference takes microseconds, no additional API calls.

📦 Plugin Generator → Plugin Development Guide

Generate plugins with AI 🤖

Fitz can generate fully working plugins from natural language descriptions. Describe what you want, and fitz creates, validates, and saves the plugin automatically.

fitz plugin
? Plugin type: chunker
? Description: sentence-based chunker that splits on periods

Generating...
✓ Syntax valid
✓ Schema valid
✓ Plugin loads correctly
✓ Functional test passed

Created: ~/.fitz/plugins/chunking/sentence_chunker.py

The generated plugin is immediately usable, with no manual editing required.

Supported plugin types

Type	Format	Description
`llm-chat`	YAML	Connect to a chat LLM provider
`llm-embedding`	YAML	Connect to an embedding provider
`llm-rerank`	YAML	Connect to a reranking provider
`retrieval`	YAML	Define a retrieval strategy
`chunker`	Python	Custom document chunking logic
`reader`	Python	Custom file format reader
`constraint`	Python	Epistemic safety guardrail

How it works

1. Prompt building: Fitz loads existing plugin examples and schema definitions

2. Generation: Your configured LLM generates the plugin code

3. Multi-level validation: Syntax → Schema → Integration → Functional tests

4. Auto-retry: If validation fails, fitz feeds the error back and retries (up to 3 attempts)

5. Save: Working plugins are saved to ~/.fitz/plugins/

Generated plugins are auto-discovered by fitz on next run—no registration needed.
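
The generate, validate, and retry loop described above might look roughly like the sketch below. It is a simplified illustration under stated assumptions: only the syntax level of validation is shown (via Python's `ast` module), the LLM is replaced by a stub, and the function names are hypothetical rather than Fitz's internal API.

```python
import ast
from typing import Callable, Optional


def validate_syntax(code: str) -> Optional[str]:
    """Return an error message, or None if the code parses."""
    try:
        ast.parse(code)
    except SyntaxError as exc:
        return f"SyntaxError: {exc.msg} (line {exc.lineno})"
    return None


def generate_with_retry(
    generate: Callable[[str, Optional[str]], str],
    description: str,
    max_attempts: int = 3,
) -> str:
    """Call the generator, feeding the last validation error back on each retry."""
    error: Optional[str] = None
    for _ in range(max_attempts):
        code = generate(description, error)
        error = validate_syntax(code)
        if error is None:
            return code
    raise RuntimeError(f"No valid plugin after {max_attempts} attempts: {error}")


def fake_llm(description: str, error: Optional[str]) -> str:
    # Stand-in for the configured LLM: broken on the first try, fixed once it sees the error.
    if error is None:
        return "def chunk(text:\n    return text.split('.')"
    return "def chunk(text):\n    return text.split('.')"


plugin_code = generate_with_retry(fake_llm, "sentence-based chunker")
print(plugin_code)
```

The schema, integration, and functional checks would slot in as further validators that run after the syntax check and report errors the same way.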

Example: Custom chunker

fitz plugin
? Plugin type: chunker
? Description: splits text by paragraphs, keeping code blocks intact

# Creates ~/.fitz/plugins/chunking/paragraph_chunker.py

# Generated plugin is immediately usable
fitz ingest ./docs --chunker paragraph_chunker
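
For a sense of what the logic inside such a chunker could be, here is a minimal sketch of paragraph splitting that keeps fenced code blocks intact. The source does not show the Fitz chunker plugin interface (base class, method names, return type), so this is written as a plain function and is not the code Fitz would generate.

```python
FENCE = "`" * 3  # Markdown code fence, built this way to avoid a literal fence in the example


def chunk_paragraphs(text: str) -> list[str]:
    """Split on blank lines, but never inside a fenced code block."""
    chunks: list[str] = []
    current: list[str] = []
    in_code = False
    for line in text.splitlines():
        if line.strip().startswith(FENCE):
            in_code = not in_code  # entering or leaving a code block
        if line.strip() == "" and not in_code:
            # A blank line outside code ends the current paragraph.
            if current:
                chunks.append("\n".join(current))
                current = []
        else:
            current.append(line)
    if current:
        chunks.append("\n".join(current))
    return chunks


doc = "\n".join([
    "Intro paragraph.",
    "",
    FENCE + "python",
    "x = 1",
    "",
    "y = 2",
    FENCE,
    "",
    "Closing paragraph.",
])
paragraph_chunks = chunk_paragraphs(doc)
print(len(paragraph_chunks))
```

The blank line between `x = 1` and `y = 2` does not split the chunk because it falls inside the fence.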

📦 Quick Start

CLI

pip install fitz-ai

fitz quickstart ./docs "Your question here"

Fitz auto-detects your LLM provider:

Ollama running? → Uses it automatically (fully local)

COHERE_API_KEY or OPENAI_API_KEY set? → Uses it automatically

First time? → Guides you through free Cohere signup (2 minutes)

After first run, it's completely zero-friction.

Python SDK

import fitz_ai

fitz_ai.ingest("./docs")
answer = fitz_ai.query("Your question here")

print(answer.text)
for source in answer.provenance:
    print(f"  - {source.source_id}: {source.excerpt[:50]}...")

The SDK provides:

Module-level functions matching CLI (ingest, query)

Auto-config creation (no setup required)

Full provenance tracking

Same honest RAG as the CLI

For advanced use (multiple collections), use the fitz class directly:

from fitz_ai import fitz

physics = fitz(collection="physics")
physics.ingest("./physics_papers")
answer = physics.query("Explain entanglement")

Fully Local (Ollama)

pip install fitz-ai[local]

ollama pull llama3.2
ollama pull nomic-embed-text

fitz quickstart ./docs "Your question here"

Fitz auto-detects Ollama when it is running. No API keys are needed, and no data leaves your machine.

📦 Real-World Usage

Fitz is a foundation. It handles document ingestion and grounded retrieval—you build whatever sits on top: chatbots, dashboards, alerts, or automation.

Chatbot Backend 🤖

Connect fitz to Slack, Discord, Teams, or your own UI. One function call returns an answer with sources—no hallucinations, full provenance. You handle the conversation flow; fitz handles the knowledge.

Example: A SaaS company plugs fitz into their support bot. Tier-1 questions like "How do I reset my password?" get instant answers. Their support team focuses on edge cases while fitz deflects 60% of incoming tickets.

Internal Knowledge Base 📖

Point fitz at your company's wiki, policies, and runbooks. Employees ask natural language questions instead of hunting through folders or pinging colleagues on Slack.

Example: A 200-person startup ingests their Notion workspace and compliance docs. New hires find answers to "How do I request PTO?" on day one—no more waiting for someone in HR to respond.

Continuous Intelligence & Alerting (Watchdog) 🐶

Pair fitz with cron, Airflow, or Lambda. Ingest data on a schedule, run queries automatically, trigger alerts when conditions match. Fitz provides the retrieval primitive; you wire the automation.

Example: A security team ingests SIEM logs nightly. Every morning, a scheduled job asks "Were there failed logins from unusual locations?" If fitz finds evidence, an alert fires to the on-call channel before anyone checks email.
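
A scheduled check of this kind could be wired up along the lines below. The sketch assumes the answer object exposes `text` and `mode` as in the SDK examples, and that `mode` compares equal to the string `"ABSTAIN"` when no evidence is found; in the real SDK it may be an enum. The `Answer` stand-in, the `check` function, and the stub response are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Answer:
    """Minimal stand-in for the SDK's answer object (only the fields used here)."""

    text: str
    mode: str


def check(query: Callable[[str], Answer], question: str) -> Optional[str]:
    """Return an alert message if the knowledge base has evidence, else None."""
    answer = query(question)
    if answer.mode == "ABSTAIN":
        return None  # no evidence found, nothing to alert on
    return f"[{answer.mode}] {question}\n{answer.text}"


def stub_query(question: str) -> Answer:
    # Stand-in for the real query call; the response text is invented for the demo.
    return Answer(text="Failed logins were recorded from an unrecognized location.", mode="CONFIDENT")


alert = check(stub_query, "Were there failed logins from unusual locations?")
if alert is not None:
    print(alert)  # in production, send this to the on-call channel instead
```

With the real SDK, the query function passed in would be `fitz_ai.query`, and the scheduler (cron, Airflow, or Lambda) would call `check` after each ingestion run.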

Web Knowledge Base 🌎

Scrape the web with Scrapy, BeautifulSoup, or Playwright. Save to disk, ingest with fitz. The web becomes a queryable knowledge base.

Example: A football analytics hobbyist scrapes Premier League match reports. After ingesting, they ask "How did Arsenal perform against top 6 teams?" or "What tactics did Liverpool use in away games?"—insights that would take hours to compile manually.

Codebase Search 🐍

Fitz includes built-in AST-aware chunking for code bases. Functions, classes, and modules become individual searchable units with docstrings and imports preserved. Ask questions in natural language; get answers pointing to specific code.

Example: A team inherits a legacy Django monolith—200k lines, sparse docs. They ingest the codebase and ask "Where is user authentication handled?" or "What API endpoints modify the billing table?" New developers onboard in days instead of weeks.
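
The idea behind AST-aware chunking can be shown with Python's standard `ast` module: each top-level function or class becomes one chunk, with its docstring kept as metadata. This is a minimal sketch of the general technique, not Fitz's chunker; the function name and the chunk dictionary layout are assumptions.

```python
import ast


def chunk_python_source(source: str) -> list[dict]:
    """Split a module into one chunk per top-level function or class."""
    tree = ast.parse(source)
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            chunks.append({
                "name": node.name,
                "kind": type(node).__name__,
                "docstring": ast.get_docstring(node),
                # Exact source text of the whole definition, never cut mid-body.
                "code": ast.get_source_segment(source, node),
            })
    return chunks


SOURCE = '''import hashlib

def hash_password(password):
    """Return a SHA-256 hex digest."""
    return hashlib.sha256(password.encode()).hexdigest()

class AuthService:
    def login(self, user, password):
        return user == "admin"
'''

code_chunks = chunk_python_source(SOURCE)
for chunk in code_chunks:
    print(chunk["kind"], chunk["name"])
```

A fixed-size text splitter could cut `hash_password` in half; splitting on AST nodes guarantees each chunk is a complete definition.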

📦 Architecture → Full Architecture Guide

┌───────────────────────────────────────────────────────────────┐
│                         fitz-ai                               │
├───────────────────────────────────────────────────────────────┤
│  User Interfaces                                              │
│  CLI: quickstart | init | ingest | query | chat | serve       │
│  SDK: fitz_ai.fitz() → ingest() → query()                     │
│  API: /query | /chat | /ingest | /collections | /health       │
├───────────────────────────────────────────────────────────────┤
│  Engines                                                      │
│  ┌───────────┐  ┌────────────┐                                │
│  │  FitzRAG  │  │  Custom... │  (extensible registry)         │
│  └───────────┘  └────────────┘                                │
├───────────────────────────────────────────────────────────────┤
│  LLM Plugins (YAML-defined)                                   │
│  ┌────────┐ ┌───────────┐ ┌────────┐                          │
│  │  Chat  │ │ Embedding │ │ Rerank │                          │
│  └────────┘ └───────────┘ └────────┘                          │
│  openai, cohere, anthropic, ollama, azure...                  │
├───────────────────────────────────────────────────────────────┤
│  Storage (PostgreSQL + pgvector)                              │
│  vectors | metadata | tables | keywords | full-text search    │
├───────────────────────────────────────────────────────────────┤
│  Retrieval Pipelines (plugin choice controls features)        │
│  dense (no rerank) | dense_rerank (with rerank)               │
├───────────────────────────────────────────────────────────────┤
│  Enrichment (baked in via ChunkEnricher)                      │
│  summaries | keywords | entities | hierarchical summaries     │
├───────────────────────────────────────────────────────────────┤
│  Constraints (epistemic safety)                               │
│  ConflictAware | InsufficientEvidence | CausalAttribution     │
└───────────────────────────────────────────────────────────────┘

📦 CLI Reference → Full CLI Guide

fitz quickstart [PATH] [QUESTION]    # Zero-config RAG (start here)
fitz init                            # Interactive setup wizard
fitz ingest                          # Interactive ingestion
fitz query                           # Single question with sources
fitz chat                            # Multi-turn conversation with your knowledge base
fitz collections                     # List and delete knowledge collections
fitz keywords                        # Manage keyword vocabulary for exact matching
fitz plugin                          # Generate plugins with AI
fitz serve                           # Start REST API server
fitz config                          # View/edit configuration
fitz doctor                          # System diagnostics

📦 Python SDK Reference → Full SDK Guide

Simple usage (module-level, matches CLI):

import fitz_ai

fitz_ai.ingest("./docs")
answer = fitz_ai.query("What is the refund policy?")
print(answer.text)

Advanced usage (multiple collections):

from fitz_ai import fitz

# Create separate instances for different collections
physics = fitz(collection="physics")
physics.ingest("./physics_papers")

legal = fitz(collection="legal")
legal.ingest("./contracts")

# Query each collection
physics_answer = physics.query("Explain entanglement")
legal_answer = legal.query("What are the payment terms?")

Working with answers:

answer = fitz_ai.query("What is the refund policy?")

print(answer.text)
print(answer.mode)  # CONFIDENT, QUALIFIED, DISPUTED, or ABSTAIN

for source in answer.provenance:
    print(f"Source: {source.source_id}")
    print(f"Excerpt: {source.excerpt}")

📦 REST API Reference → Full API Guide

Start the server:

pip install fitz-ai[api]

fitz serve                    # localhost:8000
fitz serve -p 3000            # custom port
fitz serve --host 0.0.0.0     # all interfaces

Interactive docs: Visit http://localhost:8000/docs for Swagger UI.

Endpoints:

Method	Endpoint	Description
POST	`/query`	Query knowledge base
POST	`/chat`	Multi-turn chat (stateless)
POST	`/ingest`	Ingest documents from path
GET	`/collections`	List all collections
GET	`/collections/{name}`	Get collection stats
DELETE	`/collections/{name}`	Delete a collection
GET	`/health`	Health check

Example requests:

# Query
curl -X POST http://localhost:8000/query \
  -H "Content-Type: application/json" \
  -d '{"question": "What is the refund policy?", "collection": "default"}'

# Ingest
curl -X POST http://localhost:8000/ingest \
  -H "Content-Type: application/json" \
  -d '{"source": "./docs", "collection": "mydata"}'

# Chat (stateless - client manages history)
curl -X POST http://localhost:8000/chat \
  -H "Content-Type: application/json" \
  -d '{
    "message": "What about returns?",
    "history": [
      {"role": "user", "content": "What is the refund policy?"},
      {"role": "assistant", "content": "The refund policy allows..."}
    ],
    "collection": "default"
  }'
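
The same requests can be made from Python with only the standard library. This sketch reuses the endpoints and JSON fields from the curl examples above; the helper names are hypothetical, and the response is returned as parsed JSON without assuming its fields, since the response schema is not shown here.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # default address of `fitz serve`


def build_request(endpoint: str, payload: dict) -> urllib.request.Request:
    """Build a JSON POST request for the given endpoint."""
    return urllib.request.Request(
        BASE_URL + endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_json(endpoint: str, payload: dict, timeout: float = 30.0) -> dict:
    """Send the request and parse the JSON response (requires a running server)."""
    with urllib.request.urlopen(build_request(endpoint, payload), timeout=timeout) as response:
        return json.load(response)


# Building the request does not contact the server.
query_request = build_request(
    "/query",
    {"question": "What is the refund policy?", "collection": "default"},
)
print(query_request.get_method(), query_request.full_url)

# With the server running:
# result = post_json("/query", {"question": "What is the refund policy?", "collection": "default"})
```

Because `/chat` is stateless, a client built on `post_json` would keep the `history` list itself and resend it with every message.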

License

MIT

Links

Documentation:

Release history

Version	Released
0.11.0	Mar 21, 2026
0.10.4	Mar 19, 2026
0.10.3	Mar 19, 2026
0.10.2	Mar 18, 2026
0.10.1	Feb 28, 2026
0.10.0	Feb 17, 2026
0.8.1 (this version)	Feb 10, 2026
0.8.0	Feb 3, 2026
0.7.0	Jan 26, 2026
0.6.2	Jan 24, 2026
0.6.1	Jan 23, 2026
0.6.0	Jan 21, 2026
0.5.2	Jan 15, 2026
0.5.1	Jan 11, 2026
0.5.0	Jan 7, 2026
0.4.5	Jan 4, 2026
0.4.4	Dec 30, 2025
0.4.3	Dec 29, 2025
0.4.2	Dec 28, 2025
0.4.1	Dec 26, 2025
0.4.0	Dec 26, 2025
0.3.5	Dec 21, 2025
0.3.4	Dec 19, 2025

Download files

Source Distribution

fitz_ai-0.8.1.tar.gz (514.9 kB)

Uploaded Feb 10, 2026 Source

Built Distribution

fitz_ai-0.8.1-py3-none-any.whl (674.6 kB)

Uploaded Feb 10, 2026 Python 3

File details

Details for the file fitz_ai-0.8.1.tar.gz.

File metadata

Download URL: fitz_ai-0.8.1.tar.gz
Upload date: Feb 10, 2026
Size: 514.9 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.10

File hashes

Hashes for fitz_ai-0.8.1.tar.gz
Algorithm	Hash digest
SHA256	`3a960f966be3994e0fca0ed5241688384319c761632c7302d3dd2d8c9a31b28b`
MD5	`83d7bc8c8446cd1a700a74233d9effa8`
BLAKE2b-256	`fd576e942bed0108c9f4bea05330f7672534d4f014b76d31fc10bf218dcb0272`

File details

Details for the file fitz_ai-0.8.1-py3-none-any.whl.

File metadata

Download URL: fitz_ai-0.8.1-py3-none-any.whl
Upload date: Feb 10, 2026
Size: 674.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.10

File hashes

Hashes for fitz_ai-0.8.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`bb456f7c093de14e1f4427542f78bd7050e27d285a9e2ce3cb4bad80f45ee5d2`
MD5	`f39119114c1f4323c55f7ead595cb3cc`
BLAKE2b-256	`37e48b548f430cee92f14d92ac8738df730555802f0b3f00f6a74020fd622261`

fitz-ai 0.8.1
