A modular, production-ready knowledge engine platform with clean architecture and multi-paradigm support (RAG, CLaRa).

fitz-ai

Python 3.10+ · License: MIT


Intelligent, honest RAG in 5 minutes. No infrastructure. No boilerplate.

pip install fitz-ai

fitz quickstart ./docs "What is our refund policy?"

That's it. Your documents are now searchable with AI.


Python SDK → Full SDK Reference
import fitz_ai

fitz_ai.ingest("./docs")
answer = fitz_ai.query("What is our refund policy?")

REST API → Full API Reference
pip install "fitz-ai[api]"

fitz serve  # http://localhost:8000/docs for interactive API

About 🧑‍🌾

Solo project by Yan Fitzner (LinkedIn, GitHub).

  • ~65k lines of Python
  • 750+ tests, 100% coverage
  • Zero LangChain/LlamaIndex dependencies; built from scratch


📦 What is RAG?

RAG (retrieval-augmented generation) is how ChatGPT's "file search," Notion AI, and enterprise knowledge tools actually work under the hood. Instead of sending all your documents to an AI, RAG:

  1. Indexes your documents once: splits them into chunks, converts them to vectors, and stores them in a database
  2. Retrieves only what's relevant: when you ask a question, it finds the 5-10 most relevant chunks
  3. Sends just those chunks to the LLM: the AI answers based on focused, relevant context

Traditional approach:

  [All 10,000 documents] → LLM → Answer
  ❌ Impossible (too large)
  ❌ Expensive (if possible)
  ❌ Unfocused

RAG approach:

  Question → [Search index] → [5 relevant chunks] → LLM → Answer
  ✅ Works at any scale
  ✅ Costs pennies per query
  ✅ Focused context = better answers
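
The index, retrieve, and send steps above can be sketched in a few lines. This is a toy illustration, not Fitz's implementation: it uses word-count vectors and cosine similarity as a stand-in for a real embedding model, and the names `embed`, `build_index`, and `retrieve` are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words count vector (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_index(chunks: list[str]) -> list[tuple[str, Counter]]:
    """Step 1: embed every chunk once and keep the vectors."""
    return [(chunk, embed(chunk)) for chunk in chunks]


def retrieve(index: list[tuple[str, Counter]], question: str, top_k: int = 2) -> list[str]:
    """Step 2: return the top_k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:top_k]]


chunks = [
    "Refunds are accepted within 30 days of purchase.",
    "Engineers must request code review before merging.",
    "New hires receive a laptop on their first day.",
]
index = build_index(chunks)
context = retrieve(index, "Are refunds accepted after 30 days?", top_k=1)

# Step 3: only `context`, not the whole corpus, goes into the LLM prompt.
prompt = "Answer using only this context:\n" + "\n".join(context)
```

Only the single refund chunk ends up in the prompt; the other two documents never reach the model.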

📦 Why Can't I Just Send My Documents to ChatGPT Directly?

You can, but you'll hit walls fast.

Context window limits 🚨

GPT-4 accepts ~128k tokens. That's roughly 300 pages. Your company wiki, codebase, or document archive is likely 10x-100x larger. You physically cannot paste it all.

Cost explosion 💥

Even if you could fit everything, you'd pay for every token on every query. Sending 100k tokens costs ~$1-3 per question. Ask 50 questions a day? That's $50-150 daily, for one user.
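
The arithmetic behind that estimate, as a small worked example. The price range is an assumption taken from the figures quoted above ($1-3 per 100k tokens); actual provider pricing varies by model and changes over time.

```python
# Assumed price range implied by the figures above: $1-3 per 100k input tokens.
PRICE_PER_100K_TOKENS = (1.00, 3.00)  # (low, high) in USD, an assumption
TOKENS_PER_QUESTION = 100_000
QUESTIONS_PER_DAY = 50


def daily_cost(price_per_100k: float) -> float:
    """Cost of resending the full context with every question for one day."""
    per_question = price_per_100k * TOKENS_PER_QUESTION / 100_000
    return per_question * QUESTIONS_PER_DAY


low, high = (daily_cost(p) for p in PRICE_PER_100K_TOKENS)
print(f"${low:.0f}-${high:.0f} per day for one user")  # $50-$150 per day for one user
```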

No selective retrieval ❌

When you paste documents, the model reads everything equally. It can't focus on what's relevant. Ask about refund policies and it's also processing your hiring guidelines, engineering specs, and meeting notes, wasting context and degrading answers.

No persistence 💢

Every conversation starts fresh. You re-upload, re-paste, re-explain. There's no knowledge base that accumulates and improves.


Why Fitz?

Super fast setup

Point at a folder. Ask a question. Get an answer with sources. Everything else is handled by Fitz.

Honest answers ✅

Most RAG tools confidently answer even when the answer isn't in your documents. Ask "What was our Q4 revenue?" when your docs only cover Q1-Q3, and typical RAG hallucinates a number. Fitz says: "I cannot find Q4 revenue figures in the provided documents."

Swap engines, keep everything else ⚙️

RAG is evolving fast: GraphRAG, HyDE, ColBERT, whatever's next. Fitz lets you switch engines in one line. Your ingested data stays. Your queries stay. No migration, no re-ingestion, no new API to learn. Frameworks lock you in; Fitz lets you move.

Queries that actually work 📊

Standard RAG fails silently on real queries. Fitz has built-in intelligence: hierarchical summaries for "What are the trends?", exact keyword matching for "Find TC-1001", multi-query decomposition for complex questions, AST-aware chunking for code, and SQL execution for tabular data. No configuration needed; it just works.

Other Features at a Glance

  1. [x] Local execution possible. FAISS and Ollama support, no API keys required to start.
  2. [x] Plugin-based architecture. Swap LLMs, vector databases, rerankers, and retrieval pipelines via YAML config.
  3. [x] Multiple engines. Supports FitzRAG, GraphRAG and CLaRa out of the box; swap engines in one line.
  4. [x] Incremental ingestion. Only reprocesses changed files, even with new chunking settings.
  5. [x] Full provenance. Every answer traces back to the exact chunk and document.
  6. [x] Data privacy. No telemetry, no cloud, no external calls except to the LLM provider you configure.

Any questions left? Try fitz on itself:

fitz quickstart ./fitz_ai "How does the chunking pipeline work?"

The codebase speaks for itself.


Retrieval Intelligence

Most RAG implementations are naive vector search, and they fail silently on real-world queries. Fitz has built-in intelligence that handles edge cases automatically:

| Query | Why Naive RAG Fails | Result | FitzRAG Solution |
|---|---|---|---|
| "What was our Q4 revenue?" | Info doesn't exist, but LLM won't admit it | ❌ Hallucinated number | ✅ "I don't know" |
| "What are the design principles?" | Answer is spread across docs; no single chunk contains it | ❌ Random fragments | ✅ Hierarchical summaries |
| "Find TC_1000" | Embeddings see TC_1000 ≈ TC_2000 (semantically similar) | ❌ Wrong test case | ✅ Exact keyword matching |
| [User pastes 500-char test report] "What failed and why?" | Long input → averaged embedding → matches nothing specifically | ❌ Vaguely related chunks | ✅ Multi-query decomposition |
| "How does the auth module work?" (code) | Naive chunking splits functions mid-body | ❌ Broken code fragments | ✅ Complete functions |
| "What's the timeout for CAN?" (table) | Tables chunked arbitrarily, structure lost | ❌ Fragmented rows | ✅ SQL on structured data |

These features are always on, with no configuration needed. Fitz automatically detects when to use each capability.

Multi-Query Decomposition

The problem ☔️

Long, complex queries dilute into weak embeddings. Ask "Summarize the test failures, their root causes, and recommended fixes" and vector search returns chunks vaguely related to tests, missing failures, causes, or fixes entirely.

The solution ☀️

Fitz automatically detects long queries (>300 chars) and decomposes them:

Original: "Summarize the test failures, their root causes, and recommended fixes"
    ↓
Decomposed:
 → "test failures"
 → "root causes of failures"
 → "recommended fixes"
    ↓
3 focused searches → deduplicated results → complete answer

Always on. Short queries run as single searches (no overhead). Long queries automatically expand. No configuration needed.
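
A minimal sketch of the flow above, assuming two caller-supplied functions: `search` (returns `(chunk_id, score)` pairs) and `decompose` (returns sub-queries, presumably produced by an LLM). Both interfaces and the function name `multi_query_search` are hypothetical; only the 300-character threshold comes from the text.

```python
from typing import Callable

LONG_QUERY_CHARS = 300  # threshold stated above


def multi_query_search(
    query: str,
    search: Callable[[str], list[tuple[str, float]]],
    decompose: Callable[[str], list[str]],
) -> list[str]:
    """Run one search for short queries, several focused searches for long ones."""
    sub_queries = decompose(query) if len(query) > LONG_QUERY_CHARS else [query]
    best: dict[str, float] = {}
    for sub_query in sub_queries:
        for chunk_id, score in search(sub_query):
            # Deduplicate: keep each chunk once, with its best score.
            if score > best.get(chunk_id, float("-inf")):
                best[chunk_id] = score
    return sorted(best, key=best.get, reverse=True)


# Stand-in search and decomposition functions with made-up results.
def fake_search(q: str) -> list[tuple[str, float]]:
    hits = {
        "test failures": [("c1", 0.9), ("c2", 0.5)],
        "root causes of failures": [("c2", 0.8), ("c3", 0.7)],
        "recommended fixes": [("c4", 0.6)],
    }
    return hits.get(q, [("c9", 0.1)])


def fake_decompose(q: str) -> list[str]:
    return ["test failures", "root causes of failures", "recommended fixes"]


long_query = "Summarize the test failures, their root causes, and recommended fixes. " * 5
merged = multi_query_search(long_query, fake_search, fake_decompose)
single = multi_query_search("test failures", fake_search, fake_decompose)
```

Chunk `c2` is returned by two sub-queries but appears once in `merged`, ranked by its higher score.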

Keyword Vocabulary (Exact Match)

The problem ☔️

Semantic search struggles with identifiers. Ask "What happened with TC-1001?" and embeddings return TC-1002, TC-1003, or unrelated test cases, because they're "semantically similar."

The solution ☀️

Fitz auto-detects identifiers during ingestion and builds a vocabulary:

  • Test cases: TC-1001, testcase_42
  • Tickets: JIRA-4521, BUG-789
  • Versions: v2.0.1, 1.0.0-beta
  • Code: AuthService, handle_login()

At query time, keywords pre-filter chunks before semantic search:

Q: "What happened with TC-1001?"
→ Chunks filtered to only those containing TC-1001
→ Semantic search runs on filtered set
→ Result: Only TC-1001 content, never TC-1002

Variation matching handles format differences automatically:

TC-1001 → tc-1001, TC_1001, tc 1001
JIRA-123 → jira-123, JIRA123, jira 123
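
One way such variation matching and pre-filtering could work, as a sketch. The helper names `variations` and `prefilter` are hypothetical, and the rule used here (split on `-`, `_`, or whitespace, then rejoin with each separator in original, lower, and upper case) is an assumption that happens to reproduce the examples above.

```python
import re


def variations(identifier: str) -> set[str]:
    """Generate format variants of an identifier such as 'TC-1001'."""
    parts = [p for p in re.split(r"[-_\s]+", identifier) if p]
    forms: set[str] = set()
    for sep in ("-", "_", " ", ""):
        joined = sep.join(parts)
        forms.update({joined, joined.lower(), joined.upper()})
    return forms


def prefilter(chunks: list[str], identifier: str) -> list[str]:
    """Keep only chunks that mention the identifier in any variant form."""
    alternatives = sorted(variations(identifier), key=len, reverse=True)
    # The lookarounds stop 'TC-1001' from matching inside 'TC-10010'.
    pattern = re.compile(
        r"(?<![A-Za-z0-9])(?:" + "|".join(map(re.escape, alternatives)) + r")(?![A-Za-z0-9])"
    )
    return [c for c in chunks if pattern.search(c)]


chunks = [
    "TC-1001 failed on login",
    "tc_1001 was retried",
    "TC-1002 passed",
    "TC-10010 was skipped",
]
filtered = prefilter(chunks, "TC-1001")
```

Semantic search would then run only on `filtered`, which holds the first two chunks.
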

Hierarchical RAG

The problem ☔️

Standard RAG can't answer analytical queries. Ask "What are the trends?" and it returns random chunks instead of aggregated insights.

The solution ☀️

Fitz generates multi-level summaries during ingestion:

  • Level 0: Original chunks
  • Level 1: Group summaries (per source file)
  • Level 2: Corpus summary (all documents)

Q: "What are the overall trends?"
→ Returns L2 corpus summary + L1 group summaries

Q: "What did users say about the async tutorial?"
→ Returns L0 individual chunks from that file

Query routing is automatic: summaries match analytical queries via embedding similarity.

Code-Aware Chunking

The problem ☔️

Naive chunking splits code mid-function, breaking syntax and losing context. A 50-line class becomes 3 fragments that don't make sense alone.

The solution ☀️

Fitz uses AST-aware chunking for code:

| Language | Strategy |
|---|---|
| Python | Classes, functions, methods as units. Large classes split by method. Imports preserved. |
| Markdown | Header-aware splits. Code blocks kept intact. YAML frontmatter extracted. |
| PDF | Section detection (1.1, 2.3.1, roman numerals). Keywords like "Abstract", "Conclusion". |

# Naive chunking:
def authenticate(user):     # ← chunk 1 ends here
    if not user.token:      # ← chunk 2 starts here (broken)
        raise AuthError()

# Fitz chunking:
def authenticate(user):     # ← entire function = 1 chunk
    if not user.token:
        raise AuthError()
    return validate(user.token)

Docstrings, decorators, and type hints stay attached to their functions.
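
The idea can be illustrated with Python's standard `ast` module. This is a minimal sketch of AST-aware splitting, not Fitz's chunker: the function name `chunk_python` is hypothetical, and it handles only top-level functions and classes (it does not split large classes by method or preserve imports).

```python
import ast


def chunk_python(source: str) -> list[str]:
    """Split a module into one chunk per top-level function or class."""
    lines = source.splitlines()
    chunks = []
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # Start at the first decorator so decorators stay attached.
            start = min([d.lineno for d in node.decorator_list] + [node.lineno])
            chunks.append("\n".join(lines[start - 1 : node.end_lineno]))
    return chunks


source = '''\
import functools

@functools.lru_cache
def authenticate(user):
    """Validate the user token."""
    if not user.token:
        raise AuthError()
    return validate(user.token)

class Session:
    def close(self):
        pass
'''
chunks = chunk_python(source)
```

Here `chunks[0]` is the whole `authenticate` function, including its decorator and docstring, and `chunks[1]` is the `Session` class.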

Tabular Data Routing

The problem ☔️

Tables in documents get chunked arbitrarily: rows split across chunks, headers separated from data. Semantic search fails on entity-specific queries like "How much does Alice earn?" because the embedding doesn't capture row-level data.

The solution ☀️

Fitz stores tables in SQLite and registers them for guaranteed retrieval:

Q: "How much does Alice earn?"
→ Table chunk retrieved via registry (guaranteed, not semantic similarity)
→ LLM generates SQL: SELECT salary FROM employees WHERE name = 'Alice'
→ SQL executed on stored table data
→ Result: "Alice earns $85,000"

How it works:

  • CSV files and embedded tables stored in SQLite TableStore
  • Schema chunks (columns + sample rows) indexed for search
  • Table registry ensures tables are always retrieved, even when semantic similarity is low
  • LLM generates SQL, executed on full table data

Always on. Tables are automatically detected and routed. No configuration needed.
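
The store-then-query step can be reproduced with the standard library alone. In this sketch the CSV data is made up, the table layout is an assumption, and the SQL string is hard-coded where the flow above has the LLM generate it from the schema chunk.

```python
import csv
import io
import sqlite3

csv_text = "name,salary\nAlice,85000\nBob,72000\n"  # example data, made up

# Load the CSV rows into an in-memory SQLite table.
rows = list(csv.DictReader(io.StringIO(csv_text)))
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees (name, salary) VALUES (?, ?)",
    [(r["name"], int(r["salary"])) for r in rows],
)

# In the flow above, the LLM would write this query from the schema chunk.
generated_sql = "SELECT salary FROM employees WHERE name = 'Alice'"
(salary,) = conn.execute(generated_sql).fetchone()
print(f"Alice earns ${salary:,}")  # Alice earns $85,000
```

Because the query runs against the full table rather than an embedded fragment, the answer does not depend on which rows happened to share a chunk. Any system that executes model-written SQL would also want to restrict it to read-only statements.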

Epistemic Honesty

The problem ☔️

Most RAG systems confidently answer even when the answer isn't in the documents. Ask "What was our Q4 revenue?" when docs only cover Q1-Q3, and they hallucinate a number.

The solution ☀️

Fitz has built-in epistemic guardrails that detect uncertainty:

Q: "What was our Q4 revenue?"
A: "I cannot find Q4 revenue figures in the provided documents.
   The available financial data covers Q1-Q3 only."

  Mode: ABSTAIN

Three constraint plugins run automatically:

| Constraint | What it catches |
|---|---|
| ConflictAware | Sources disagree → surfaces the conflict |
| InsufficientEvidence | No supporting evidence → refuses to guess |
| CausalAttribution | Correlation ≠ causation → blocks hallucinated "why" |

Every answer includes a mode indicating confidence:

  • CONFIDENT: Strong evidence supports the answer
  • QUALIFIED: Answer given with noted limitations
  • DISPUTED: Sources conflict, both views presented
  • ABSTAIN: Insufficient evidence, refuses to answer

Roadmap

| Feature | Status | Description |
|---|---|---|
| Hierarchical RAG | ✅ Done | Multi-level summaries for analytical queries |
| Keyword Vocabulary | ✅ Done | Exact matching for identifiers |
| Multi-Query Decomposition | ✅ Done | Automatic expansion for complex queries |
| Code-Aware Chunking | ✅ Done | AST-aware splitting for Python, Markdown, PDF |
| Epistemic Honesty | ✅ Done | "I don't know" when evidence is insufficient |
| Comparison Queries | ✅ Done | Multi-entity retrieval ("A vs B") |
| Tabular Data Routing | ✅ Done | SQL on structured table data |
| Multi-Table Joins | ✅ Done | JOIN queries across multiple tables |
| Multi-Hop Reasoning | 📋 Planned | Chain retrieval across related entities |

📦 Fitz vs LangChain vs LlamaIndex

Fitz opts for a deliberately narrower approach.

LangChain and LlamaIndex are powerful LLM application frameworks designed to help developers build complex, end-to-end AI systems. Fitz provides a minimal, replaceable RAG engine with strong epistemic guarantees, without locking users into a framework, ecosystem, or long-term architectural commitment.

Fitz is not a competitor in scope.
It is an infrastructure primitive.


Core philosophical differences ⚖️

| Dimension | Fitz | LangChain | LlamaIndex |
|---|---|---|---|
| Primary role | RAG engine | LLM application framework | LLM data framework |
| User commitment | No framework lock-in | High | High |
| Engine coupling | Swappable in one line | Deep | Deep |
| Design goal | Correctness & honesty | Flexibility | Data integration |
| Long-term risk | Low | Migration-heavy | Migration-heavy |

Epistemic behavior (truth over fluency) 🎯

| Aspect | Fitz | LangChain / LlamaIndex |
|---|---|---|
| "I don't know" | First-class behavior | Not guaranteed |
| Hallucination handling | Designed-in | Usually prompt-level |
| Confidence signaling | Explicit | Implicit |

Fitz treats uncertainty as a feature, not a failure.
If the system cannot support an answer with retrieved evidence, it says so.


Transparency & provenance 🔎

| Capability | Fitz | LangChain / LlamaIndex |
|---|---|---|
| Source attribution | Mandatory | Optional |
| Retrieval trace | Explicit & structured | Often opaque |
| Debuggability | Built-in | Tool-dependent |

Every answer in Fitz is fully auditable down to the retrieval step.


Scope & complexity

| Aspect | Fitz | LangChain / LlamaIndex |
|---|---|---|
| Chains / agents | ❎ | ✔ |
| Prompt graphs | ❎ | ✔ |
| UI abstractions | ❎ | Often |
| Cognitive overhead | Very low | High |

Fitz intentionally does less, so it can be trusted more.


Use Fitz if you want:

  • A replaceable RAG engine, not a framework marriage
  • Strong epistemic guarantees ("I don't know" is valid output)
  • Full provenance for every answer
  • A transparent, extensible plugin architecture
  • A future-proof ingestion pipeline that survives engine changes

📦 Features

Swappable RAG Engines 🔄

Your data stays. Your queries stay. Only the engine changes.

       ┌─────────────────────────────────────┐
       │           Your Query                │
       │   "What are the payment terms?"     │
       └──────────────────┬──────────────────┘
                          │
                          ▼
       ┌─────────────────────────────────────┐
       │       engine="..."                  │
       │  ┌─────────┐ ┌───────┐ ┌─────────┐  │
       │  │ fitz    │ │ clara │ │ graph   │  │
       │  │  _rag   │ │       │ │  _rag   │  │
       │  └────┬────┘ └───┬───┘ └────┬────┘  │
       │       └──────────┼──────────┘       │
       └──────────────────┼──────────────────┘
                          │
                          ▼
       ┌─────────────────────────────────────┐
       │       Your Ingested Knowledge       │
       │      (unchanged across engines)     │
       └─────────────────────────────────────┘

from fitz_ai import run

answer = run("What are the payment terms?", engine="fitz_rag")
answer = run("What are the payment terms?", engine="clara")
answer = run("What are the payment terms?", engine="graph_rag")  # future

No migration. No re-ingestion. No new API to learn.


Full Provenance 🗂️

Every answer traces back to its source:

Answer: The refund policy allows returns within 30 days...

Sources:
 [1] policies/refund.md [chunk 3] (score: 0.92)
 [2] faq/payments.md [chunk 1] (score: 0.87)

Incremental Ingestion ⚡ → Ingestion Guide

Fitz tracks file hashes and only re-ingests what changed:

$ fitz ingest ./src

Scanning... 847 files
 → 12 new files
 → 3 modified files
 → 832 unchanged (skipped)

Ingesting 15 files...

Re-running ingestion on a large codebase takes seconds, not minutes. Changed your chunking config? Fitz detects that too and re-processes affected files.
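
A sketch of how hash-based change detection like this could be built. It assumes a manifest that maps each relative path to the digest recorded on the previous run, and it folds a chunking-config fingerprint into every digest so that a config change marks files as modified. The function names and manifest format are hypothetical, not Fitz's internals.

```python
import hashlib
import tempfile
from pathlib import Path


def file_digest(path: Path, config_fingerprint: str) -> str:
    """Hash file contents together with the chunking-config fingerprint."""
    h = hashlib.sha256()
    h.update(config_fingerprint.encode("utf-8"))
    h.update(path.read_bytes())
    return h.hexdigest()


def plan_ingest(root: Path, manifest: dict[str, str], config_fingerprint: str):
    """Return (new, modified, unchanged) paths relative to `root`.

    `manifest` maps relative path -> digest recorded by the previous run.
    """
    new, modified, unchanged = [], [], []
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        key = path.relative_to(root).as_posix()
        digest = file_digest(path, config_fingerprint)
        if key not in manifest:
            new.append(key)
        elif manifest[key] != digest:
            modified.append(key)
        else:
            unchanged.append(key)
    return new, modified, unchanged


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "a.md").write_text("hello")
    (root / "b.md").write_text("world")
    # Pretend a.md was ingested earlier with chunk_size=512.
    manifest = {"a.md": file_digest(root / "a.md", "chunk_size=512")}
    same_config = plan_ingest(root, manifest, "chunk_size=512")
    new_config = plan_ingest(root, manifest, "chunk_size=256")
```

With the same config only `b.md` needs work; after the config changes, `a.md` is reported as modified even though its contents are identical.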


📦 Plugin Generator → Plugin Development Guide

Generate plugins with AI 🤖

Fitz can generate fully working plugins from natural language descriptions. Describe what you want, and fitz creates, validates, and saves the plugin automatically.

fitz plugin
? Plugin type: chunker
? Description: sentence-based chunker that splits on periods

Generating...
✓ Syntax valid
✓ Schema valid
✓ Plugin loads correctly
✓ Functional test passed

Created: ~/.fitz/plugins/chunking/sentence_chunker.py

The generated plugin is immediately usable, with no manual editing required.


Supported plugin types

| Type | Format | Description |
|---|---|---|
| llm-chat | YAML | Connect to a chat LLM provider |
| llm-embedding | YAML | Connect to an embedding provider |
| llm-rerank | YAML | Connect to a reranking provider |
| vector-db | YAML | Connect to a vector database |
| retrieval | YAML | Define a retrieval strategy |
| chunker | Python | Custom document chunking logic |
| reader | Python | Custom file format reader |
| constraint | Python | Epistemic safety guardrail |

How it works

  1. Prompt building: Fitz loads existing plugin examples and schema definitions
  2. Generation: Your configured LLM generates the plugin code
  3. Multi-level validation: Syntax → Schema → Integration → Functional tests
  4. Auto-retry: If validation fails, fitz feeds the error back and retries (up to 3 attempts)
  5. Save: Working plugins are saved to ~/.fitz/plugins/

Generated plugins are auto-discovered by fitz on next run, with no registration needed.
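
The generate, validate, and retry loop from steps 2 to 4 can be sketched as follows. Here `generate(description, last_error)` is a hypothetical wrapper around the configured LLM, and only the syntax stage is shown; the schema, integration, and functional stages described above are left out.

```python
import ast
from typing import Callable, Optional

MAX_ATTEMPTS = 3  # matches the retry limit described above


def generate_with_retry(
    description: str,
    generate: Callable[[str, Optional[str]], str],
    max_attempts: int = MAX_ATTEMPTS,
) -> str:
    """Ask `generate` for plugin code until it passes validation."""
    last_error: Optional[str] = None
    for _ in range(max_attempts):
        code = generate(description, last_error)
        try:
            ast.parse(code)  # syntax validation only
        except SyntaxError as exc:
            last_error = f"SyntaxError: {exc}"  # fed back into the next attempt
            continue
        return code
    raise RuntimeError(f"no valid plugin after {max_attempts} attempts: {last_error}")


# Stand-in generator: fails once, then succeeds after seeing the error.
seen_errors: list[Optional[str]] = []


def fake_generate(description: str, last_error: Optional[str]) -> str:
    seen_errors.append(last_error)
    if last_error is None:
        return "def chunk(text:"  # deliberately broken first attempt
    return "def chunk(text):\n    return text.split('. ')\n"


code = generate_with_retry("sentence-based chunker", fake_generate)
```

The second call receives the first attempt's error message, which is what lets a real model correct itself.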


Example: Custom chunker

fitz plugin
? Plugin type: chunker
? Description: splits text by paragraphs, keeping code blocks intact

# Creates ~/.fitz/plugins/chunking/paragraph_chunker.py
# Generated plugin is immediately usable
fitz ingest ./docs --chunker paragraph_chunker

📦 Quick Start

CLI

pip install fitz-ai

fitz quickstart ./docs "Your question here"

Fitz auto-detects your LLM provider:

  1. Ollama running? → Uses it automatically (fully local)
  2. COHERE_API_KEY or OPENAI_API_KEY set? → Uses it automatically
  3. First time? → Guides you through free Cohere signup (2 minutes)

After first run, it's completely zero-friction.
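
The detection order above could be expressed roughly like this. The sketch is not Fitz's code: the function names are hypothetical, the Ollama check simply probes its default local port (11434), and preferring Cohere over OpenAI when both keys are set is an assumption, since the list does not say which wins.

```python
import os
import socket
from typing import Mapping, Optional


def ollama_running(host: str = "127.0.0.1", port: int = 11434, timeout: float = 0.2) -> bool:
    """Return True if something accepts connections on Ollama's default port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def detect_provider(
    env: Mapping[str, str] = os.environ,
    ollama_up: Optional[bool] = None,
) -> Optional[str]:
    """Mirror the detection order listed above; None means 'run first-time setup'."""
    if ollama_up is None:
        ollama_up = ollama_running()
    if ollama_up:
        return "ollama"
    if env.get("COHERE_API_KEY"):
        return "cohere"  # assumption: Cohere is checked before OpenAI
    if env.get("OPENAI_API_KEY"):
        return "openai"
    return None
```

Passing `env` and `ollama_up` explicitly keeps the function testable without touching the real environment or network.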


Python SDK

import fitz_ai

fitz_ai.ingest("./docs")
answer = fitz_ai.query("Your question here")

print(answer.text)
for source in answer.provenance:
    print(f"  - {source.source_id}: {source.excerpt[:50]}...")

The SDK provides:

  • Module-level functions matching CLI (ingest, query)
  • Auto-config creation (no setup required)
  • Full provenance tracking
  • Same honest RAG as the CLI

For advanced use (multiple collections), use the fitz class directly:

from fitz_ai import fitz

physics = fitz(collection="physics")
physics.ingest("./physics_papers")
answer = physics.query("Explain entanglement")

Fully Local (Ollama)

pip install "fitz-ai[local]"

ollama pull llama3.2
ollama pull nomic-embed-text

fitz quickstart ./docs "Your question here"

Fitz auto-detects Ollama when running. No API keys needed, and no data leaves your machine.


📦 Real-World Usage

Fitz is a foundation. It handles document ingestion and grounded retrieval; you build whatever sits on top: chatbots, dashboards, alerts, or automation.


Chatbot Backend 🤖

Connect fitz to Slack, Discord, Teams, or your own UI. One function call returns an answer with sources: no hallucinations, full provenance. You handle the conversation flow; fitz handles the knowledge.

Example: A SaaS company plugs fitz into their support bot. Tier-1 questions like "How do I reset my password?" get instant answers. Their support team focuses on edge cases while fitz deflects 60% of incoming tickets.


Internal Knowledge Base 📖

Point fitz at your company's wiki, policies, and runbooks. Employees ask natural language questions instead of hunting through folders or pinging colleagues on Slack.

Example: A 200-person startup ingests their Notion workspace and compliance docs. New hires find answers to "How do I request PTO?" on day one, with no more waiting for someone in HR to respond.


Continuous Intelligence & Alerting (Watchdog) 🐶

Pair fitz with cron, Airflow, or Lambda. Ingest data on a schedule, run queries automatically, trigger alerts when conditions match. Fitz provides the retrieval primitive; you wire the automation.

Example: A security team ingests SIEM logs nightly. Every morning, a scheduled job asks "Were there failed logins from unusual locations?" If fitz finds evidence, an alert fires to the on-call channel before anyone checks email.


Web Knowledge Base 🌎

Scrape the web with Scrapy, BeautifulSoup, or Playwright. Save to disk, ingest with fitz. The web becomes a queryable knowledge base.

Example: A football analytics hobbyist scrapes Premier League match reports. After ingesting, they ask "How did Arsenal perform against top 6 teams?" or "What tactics did Liverpool use in away games?" These are insights that would take hours to compile manually.


Codebase Search 🐍

Fitz includes built-in AST-aware chunking for codebases. Functions, classes, and modules become individual searchable units with docstrings and imports preserved. Ask questions in natural language; get answers pointing to specific code.

Example: A team inherits a legacy Django monolith: 200k lines, sparse docs. They ingest the codebase and ask "Where is user authentication handled?" or "What API endpoints modify the billing table?" New developers onboard in days instead of weeks.


📦 Architecture → Full Architecture Guide

┌───────────────────────────────────────────────────────────────┐
│                         fitz-ai                               │
├───────────────────────────────────────────────────────────────┤
│  User Interfaces                                              │
│  CLI: quickstart | init | ingest | query | chat | serve       │
│  SDK: fitz_ai.fitz() → ingest() → ask()                       │
│  API: /query | /chat | /ingest | /collections | /health       │
├───────────────────────────────────────────────────────────────┤
│  Engines                                                      │
│  ┌───────────┐  ┌───────────┐  ┌────────────┐                 │
│  │  FitzRAG  │  │   CLaRa   │  │  GraphRAG  │  (pluggable)    │
│  └───────────┘  └───────────┘  └────────────┘                 │
├───────────────────────────────────────────────────────────────┤
│  Plugin System (all YAML-defined)                             │
│  ┌────────┐ ┌───────────┐ ┌────────┐ ┌──────────┐             │
│  │  Chat  │ │ Embedding │ │ Rerank │ │ VectorDB │             │
│  └────────┘ └───────────┘ └────────┘ └──────────┘             │
│  openai, cohere, anthropic, ollama, azure...                  │
├───────────────────────────────────────────────────────────────┤
│  Retrieval Pipelines (plugin choice controls features)        │
│  dense (no rerank) | dense_rerank (with rerank)               │
├───────────────────────────────────────────────────────────────┤
│  Enrichment (baked in via ChunkEnricher)                      │
│  summaries | keywords | entities | hierarchical summaries     │
├───────────────────────────────────────────────────────────────┤
│  Constraints (epistemic safety)                               │
│  ConflictAware | InsufficientEvidence | CausalAttribution     │
└───────────────────────────────────────────────────────────────┘

📦 CLI Reference → Full CLI Guide

fitz quickstart [PATH] [QUESTION]    # Zero-config RAG (start here)
fitz init                            # Interactive setup wizard
fitz ingest                          # Interactive ingestion
fitz query                           # Single question with sources
fitz chat                            # Multi-turn conversation with your knowledge base
fitz collections                     # List and delete knowledge collections
fitz keywords                        # Manage keyword vocabulary for exact matching
fitz plugin                          # Generate plugins with AI
fitz serve                           # Start REST API server
fitz config                          # View/edit configuration
fitz doctor                          # System diagnostics

📦 Python SDK Reference → Full SDK Guide

Simple usage (module-level, matches CLI):

import fitz_ai

fitz_ai.ingest("./docs")
answer = fitz_ai.query("What is the refund policy?")
print(answer.text)

Advanced usage (multiple collections):

from fitz_ai import fitz

# Create separate instances for different collections
physics = fitz(collection="physics")
physics.ingest("./physics_papers")

legal = fitz(collection="legal")
legal.ingest("./contracts")

# Query each collection
physics_answer = physics.query("Explain entanglement")
legal_answer = legal.query("What are the payment terms?")

Working with answers:

answer = fitz_ai.query("What is the refund policy?")

print(answer.text)
print(answer.mode)  # CONFIDENT, QUALIFIED, DISPUTED, or ABSTAIN

for source in answer.provenance:
    print(f"Source: {source.source_id}")
    print(f"Excerpt: {source.excerpt}")

📦 REST API Reference → Full API Guide

Start the server:

pip install "fitz-ai[api]"

fitz serve                    # localhost:8000
fitz serve -p 3000            # custom port
fitz serve --host 0.0.0.0     # all interfaces

Interactive docs: Visit http://localhost:8000/docs for Swagger UI.


Endpoints:

Method Endpoint Description
POST /query Query knowledge base
POST /chat Multi-turn chat (stateless)
POST /ingest Ingest documents from path
GET /collections List all collections
GET /collections/{name} Get collection stats
DELETE /collections/{name} Delete a collection
GET /health Health check

Example requests:

# Query
curl -X POST http://localhost:8000/query \
  -H "Content-Type: application/json" \
  -d '{"question": "What is the refund policy?", "collection": "default"}'

# Ingest
curl -X POST http://localhost:8000/ingest \
  -H "Content-Type: application/json" \
  -d '{"source": "./docs", "collection": "mydata"}'

# Chat (stateless - client manages history)
curl -X POST http://localhost:8000/chat \
  -H "Content-Type: application/json" \
  -d '{
    "message": "What about returns?",
    "history": [
      {"role": "user", "content": "What is the refund policy?"},
      {"role": "assistant", "content": "The refund policy allows..."}
    ],
    "collection": "default"
  }'
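
The same query request can be sent from Python with only the standard library. This sketch mirrors the curl example for POST /query; the helper names `build_query_request` and `ask` are hypothetical, and since the response schema is not shown here, the reply is returned as a plain decoded dict.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # default address of `fitz serve`


def build_query_request(question: str, collection: str = "default") -> urllib.request.Request:
    """Build the same POST /query request as the curl example."""
    body = json.dumps({"question": question, "collection": collection}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/query",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question: str, collection: str = "default") -> dict:
    """Send the request and decode the JSON response (requires a running server)."""
    with urllib.request.urlopen(build_query_request(question, collection), timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))


request = build_query_request("What is the refund policy?")
```

Calling `ask("What is the refund policy?")` performs the round trip once `fitz serve` is running.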

📦 Beyond RAG

RAG is a method. Knowledge access is a strategy.

Fitz is not a RAG framework. It's a knowledge platform that currently uses RAG as its primary engine.

from fitz_ai import run

# Fitz RAG - fast, reliable vector search
answer = run("What are the payment terms?", engine="fitz_rag")

# CLaRa - compressed RAG, 16x smaller context
answer = run("What are the payment terms?", engine="clara")

# GraphRAG - knowledge graph with entity extraction and community summaries
answer = run("What are the payment terms?", engine="graphrag")

The engine is an implementation detail. Your ingested knowledge, your queries, your workflow: all stay the same. When a better retrieval paradigm emerges, swap one line, not your entire codebase.


📦 Philosophy

Principles:

  • Explicit over clever: No magic. Read the config, know what happens.
  • Answers over architecture: Optimize for time-to-insight, not flexibility.
  • Honest over helpful: Better to say "I don't know" than hallucinate.
  • Files over frameworks: YAML plugins over class hierarchies.

License

MIT

