Interactive terminal UI for visualizing, debugging, and tuning RAG chunking pipelines

🚀 RAG-TUI v0.0.1 Beta

The Terminal App That Makes Chunking Actually Fun

"I used to stare at my RAG pipeline wondering why it sucked. Then I found RAG-TUI and realized my chunks were the size of War and Peace." - A Developer, Probably


🎭 What Even Is This?

RAG-TUI is a beautiful terminal-based debugger for your Retrieval-Augmented Generation (RAG) pipelines. It's like having X-ray vision for your text chunking.

You know that feeling when your LLM hallucinates because your retrieval returned garbage? Yeah, this fixes that.

┌─────────────────────────────────────────────────────────────────┐
│  RAG-TUI v0.0.1 Beta                                            │
├─────────────────────────────────────────────────────────────────┤
│  📝 Input   🎨 Chunks   🔍 Search   📊 Batch   ⚙️ Settings      │
│                                                                 │
│  Your text, but now with ✨ colors ✨ and 📊 metrics 📊        │
│                                                                 │
│  Chunk Size: [◀] [200] [▶]  ████████░░░░░░  tokens              │
│  Overlap:    [◀] [10]  [▶]  ██░░░░░░░░░░░░  %                   │
│                                                                 │
│  ⚡ 5 chunks                                                    │
└─────────────────────────────────────────────────────────────────┘

📸 Screenshots

[Screenshots: Input tab, Chunks tab, Search tab, Chat tab]

🤔 Why Should I Care?

The Problem

You're building a RAG app. You chunk your documents. You embed them. You search. And then...

User: "What's the company's refund policy?"
LLM: "Based on the context, your refrigerator appears to be running."

The Solution

See exactly how your text is being chunked. Tweak parameters in real-time. Test queries. Export settings. Actually understand what's going on.


⚡ Quick Start (30 Seconds, I Promise)

Install

pip install rag-tui

Run

rag-tui

That's It

No really. You're done. Press L to load sample text and start playing.


🎨 Features (The Good Stuff)

1. 🧩 Six Chunking Strategies

Because one size definitely does NOT fit all.

| Strategy | Best For | Vibe |
|---|---|---|
| Token | General text | "I count tokens for breakfast" |
| Sentence | Articles, docs | "Periods are sacred" |
| Paragraph | Structured text | "Double newline gang" |
| Recursive | Code, mixed | "I'll try everything" |
| Fixed | Speed demons | "Just cut every 500 chars lol" |
| Custom | You, apparently | "I know better" (you might!) |
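
To make the table a bit more concrete, here is a minimal sketch of the two simplest strategies, Paragraph and Fixed. These are illustrative stand-ins and not RAG-TUI's actual implementations: the function names `chunk_fixed` and `chunk_paragraphs` are made up for this example, and the `(text, start, end)` tuple shape is borrowed from the Programmatic Usage section.

```python
import re

def chunk_fixed(text, size=500):
    """Fixed strategy: cut every `size` characters, ignoring content."""
    return [(text[i:i + size], i, min(i + size, len(text)))
            for i in range(0, len(text), size)]

def chunk_paragraphs(text):
    """Paragraph strategy: split on blank lines (double newlines)."""
    chunks = []
    start = 0
    # Each separator is a run of two or more newlines.
    for sep in re.finditer(r"\n{2,}", text):
        if text[start:sep.start()].strip():
            chunks.append((text[start:sep.start()], start, sep.start()))
        start = sep.end()
    # Whatever follows the last separator is the final paragraph.
    if text[start:].strip():
        chunks.append((text[start:], start, len(text)))
    return chunks

doc = "First paragraph.\n\nSecond paragraph.\n\nThird."
for chunk, start, end in chunk_paragraphs(doc):
    print(f"[{start}:{end}] {chunk!r}")
```

Both helpers keep the offsets honest, so `doc[start:end]` always gives back the chunk text.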

2. 🔌 Four LLM Providers

Switch between providers like you switch between tabs (too often).

# Ollama (Free! Local! Private!)
ollama serve
rag-tui

# OpenAI (When you need that GPT juice)
export OPENAI_API_KEY="sk-..." 
rag-tui

# Groq (FAST. LIKE, REALLY FAST.)
export GROQ_API_KEY="gsk_..."
rag-tui

# Google Gemini (Free tier FTW)
export GOOGLE_API_KEY="AI..."
rag-tui

3. 📁 Load Any File

PDFs? ✅ Markdown? ✅ Python? ✅ That random .txt file from 2019? ✅

Supported: .txt, .md, .py, .js, .json, .yaml, .pdf, and 10 more!

4. 📊 Batch Testing

Test 50 queries at once. See which ones fail. Cry. Fix. Repeat.

📊 Batch Test Results
━━━━━━━━━━━━━━━━━━━━
Total Queries: 50
Hit Rate (>0.5): 78%
Avg Top Score: 0.72

You're doing better than average!
(The average is made up, but still, congrats!)

5. ⚡ Built-in Presets

Don't know what settings to use? We got you.

| Preset | Size (tokens) | Overlap | For |
|---|---|---|---|
| Q&A Retrieval | 200 | 15% | Chatbots, search |
| Document Summary | 500 | 5% | Long docs |
| Code Analysis | 300 | 20% | Source code |
| Long Context | 800 | 10% | GPT-4-128k users |
| High Precision | 100 | 25% | When you NEED accuracy |
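
The presets express overlap as a percentage, while the LangChain and LlamaIndex splitters take an absolute overlap value. Here is a small sketch of that conversion using the table values. The `PRESETS` dict and the `overlap_tokens` helper are illustrative names, not part of RAG-TUI's API, and the rounding (floor) is an assumption.

```python
# Preset values copied from the table above: (chunk size, overlap percent).
PRESETS = {
    "Q&A Retrieval": (200, 15),
    "Document Summary": (500, 5),
    "Code Analysis": (300, 20),
    "Long Context": (800, 10),
    "High Precision": (100, 25),
}

def overlap_tokens(chunk_size, overlap_percent):
    """Convert a percentage overlap into an absolute count (rounded down)."""
    return chunk_size * overlap_percent // 100

for name, (size, percent) in PRESETS.items():
    print(f"{name}: chunk_size={size}, chunk_overlap={overlap_tokens(size, percent)}")
```

Long Context comes out as 800 and 80, which lines up with the numbers in the export examples.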

6. 📋 Export Settings

Take your carefully tuned settings and use them in production.

# LangChain Export
from langchain.text_splitter import RecursiveCharacterTextSplitter

splitter = RecursiveCharacterTextSplitter(
    chunk_size=800,
    chunk_overlap=80,
)

# LlamaIndex Export  
from llama_index.core.node_parser import SentenceSplitter

parser = SentenceSplitter(
    chunk_size=800,
    chunk_overlap=80,
)

🎮 How To Use It

The Interface

┌─────────────────────────────────────────────────────────────────┐
│  Strategy: [Token ▼]  │  File: [path...]  │  [📂 Load]         │
├─────────────────────────────────────────────────────────────────┤
│  📝 Input │ 🎨 Chunks │ 🔍 Search │ 📊 Batch │ ⚙️ Settings │ 💬 │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│                    (Your content here)                          │
│                                                                 │
├─────────────────────────────────────────────────────────────────┤
│  Chunk Size: [◀] [200] [▶]  │  Overlap: [◀] [10] [▶] %          │
│  ⚡ 5 chunks                                                    │
└─────────────────────────────────────────────────────────────────┘

The Tabs

| Tab | What It Does | When To Use It |
|---|---|---|
| 📝 Input | Paste or load your document | First |
| 🎨 Chunks | See colorful chunk cards | To see the magic |
| 🔍 Search | Query and see what comes back | Testing retrieval |
| 📊 Batch | Test many queries at once | Before production |
| ⚙️ Settings | Export config, custom code | When you're done |
| 💬 Chat | Talk to your chunks | For fun |

Keyboard Shortcuts

| Key | Action | Pro Tip |
|---|---|---|
| Q | Quit | When you're done procrastinating |
| L | Load sample | Start here if confused |
| R | Rechunk | After changing params |
| D | Dark/Light mode | We default to dark (obviously) |
| E | Export config | Save your precious settings |
| F1 | Help | When this README isn't enough |
| Tab | Next tab | Navigate like a pro |

🔧 The Workflow (AKA What To Actually Do)

Step 1: Load Your Document

Either:

  • Type/paste in the 📝 Input tab
  • Enter a file path and click 📂 Load
  • Press L for sample text (recommended for newbies)

Step 2: Pick Your Strategy

Use the dropdown at the top. If unsure:

  • Text/articles? → Sentence
  • Code? → Recursive
  • Don't know? → Token (it's the safe choice)

Step 3: Adjust Parameters

In the 🎨 Chunks tab, use the sliders:

  • Chunk Size: How big each chunk should be (in tokens)
  • Overlap: How much chunks should share (prevents context loss)

The Golden Rule: Smaller chunks = more precise, less context. Bigger chunks = more context, less precise.
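
How chunk size and overlap interact can be sketched as a sliding window. This is a simplified illustration, not RAG-TUI's tokenizer-based implementation: it treats whitespace-separated words as stand-ins for tokens, and `chunk_tokens` is a hypothetical helper.

```python
def chunk_tokens(tokens, chunk_size, overlap_percent):
    """Slide a window of `chunk_size` tokens, re-using the tail of each chunk."""
    overlap = chunk_size * overlap_percent // 100
    step = max(chunk_size - overlap, 1)  # always move forward
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the window already reached the end
    return chunks

words = "the quick brown fox jumps over the lazy dog again and again".split()
for chunk in chunk_tokens(words, chunk_size=5, overlap_percent=20):
    print(" ".join(chunk))
```

With a 20% overlap on 5-token chunks, each chunk starts with the last token of the previous one, which is what keeps a sentence from being cut off without context.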

Step 4: Test Your Queries

Go to 🔍 Search tab:

  1. Type a question
  2. Click Search
  3. See what chunks come back
  4. Cry or celebrate accordingly

Step 5: Batch Test (The Pro Move)

Go to 📊 Batch tab:

  1. Enter multiple queries (one per line)
  2. Click Run Batch Test
  3. See your hit rate
  4. Adjust until it's good enough™

Step 6: Export

Go to ⚙️ Settings tab:

  • Click JSON, LangChain, or LlamaIndex
  • Copy the generated code
  • Paste in your project
  • Deploy
  • Profit???

🧪 Custom Chunking (For The Brave)

Don't like our strategies? Roll your own!

Go to ⚙️ Settings tab, paste a function like:

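def chunk_by_headers(text, chunk_size, overlap):
    """Split on markdown headers; return (text, start, end) tuples."""
    import re
    # Each level 1-3 header that follows a newline starts a new section.
    starts = [0] + [m.end() for m in re.finditer(r'\n(?=#{1,3} )', text)]
    # A section ends just before the newline that precedes the next header.
    ends = [s - 1 for s in starts[1:]] + [len(text)]
    return [(text[a:b], a, b) for a, b in zip(starts, ends) if text[a:b].strip()]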

Click ⚡ Apply Custom Chunker and watch the magic.


🤖 Provider Setup

Ollama (Recommended for Privacy)

# Install Ollama
brew install ollama  # macOS
# or download from ollama.ai

# Pull required models
ollama pull nomic-embed-text  # For embeddings
ollama pull llama3.2:1b       # For chat (small & fast)

# Start the server
ollama serve

OpenAI

export OPENAI_API_KEY="sk-your-key-here"

Uses: text-embedding-3-small + gpt-4o-mini

Groq (Free Tier!)

export GROQ_API_KEY="gsk_your-key-here"

Uses: llama-3.1-8b-instant (NO embeddings - pair with Ollama)

Google Gemini (Also Free Tier!)

export GOOGLE_API_KEY="your-key-here"

Uses: text-embedding-004 + gemini-1.5-flash
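
Before launching, it can help to check which API keys are actually visible to your shell. Here is a small sketch that uses only the environment variable names listed above. The `configured_providers` helper is illustrative and says nothing about how RAG-TUI itself picks a provider.

```python
import os

# Environment variable names taken from the setup steps above.
PROVIDER_KEYS = {
    "OpenAI": "OPENAI_API_KEY",
    "Groq": "GROQ_API_KEY",
    "Google Gemini": "GOOGLE_API_KEY",
}

def configured_providers(env=os.environ):
    """Return the cloud providers whose API key is set and non-empty."""
    return [name for name, var in PROVIDER_KEYS.items() if env.get(var)]

print("Keys found for:", configured_providers() or "none (Ollama needs no key)")
```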


📈 Understanding The Metrics

In Chunks Tab

📊 5 chunks | Avg: 180 chars | Total: 900 chars | ~225 tokens
  • 5 chunks: Your document was split into 5 pieces
  • Avg: 180 chars: Each chunk is ~180 characters
  • Total: 900 chars: Your whole document size
  • ~225 tokens: Estimated token count (chars ÷ 4)
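
The arithmetic behind that status line is simple enough to reproduce. This is a sketch and not the app's code: `chunk_stats` is a hypothetical helper, and it assumes no overlap, so the chunk lengths add up to the document size.

```python
def chunk_stats(chunks):
    """Summarise a list of chunk strings in the style of the status line."""
    total_chars = sum(len(c) for c in chunks)
    avg_chars = total_chars // len(chunks) if chunks else 0
    est_tokens = total_chars // 4  # rough heuristic: ~4 characters per token
    return (f"{len(chunks)} chunks | Avg: {avg_chars} chars | "
            f"Total: {total_chars} chars | ~{est_tokens} tokens")

# Five chunks of 180 characters each, as in the example above.
print(chunk_stats(["x" * 180] * 5))
```

The chars ÷ 4 estimate is a rule of thumb for English text; a real tokenizer will give different numbers for code or other languages.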

In Search Tab

#1 ████████░░ 0.89  "The refund policy states..."
#2 ██████░░░░ 0.72  "For returns within 30 days..."
#3 ████░░░░░░ 0.45  "Our customer service team..."
  • #1, #2, #3: Ranking by relevance
  • ████████░░: Visual similarity bar
  • 0.89: Cosine similarity score (0-1, higher = better)
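
For a feel of where those scores come from, here is a brute-force sketch of cosine-similarity ranking. Hand-made 2-D vectors stand in for real embeddings, and `cosine_similarity` and `rank_chunks` are illustrative names; the app's own search goes through a vector index, so treat this as the idea and not the implementation.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def rank_chunks(query_vec, chunk_vecs, top_k=3):
    """Return (score, chunk_index) pairs, best match first."""
    scored = [(cosine_similarity(query_vec, v), i) for i, v in enumerate(chunk_vecs)]
    return sorted(scored, reverse=True)[:top_k]

query = [1.0, 0.0]
chunks = [[0.0, 1.0], [1.0, 1.0], [1.0, 0.0]]  # toy "embeddings"
for rank, (score, index) in enumerate(rank_chunks(query, chunks), start=1):
    print(f"#{rank} chunk {index}: {score:.2f}")
```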

In Batch Tab

Hit Rate (>0.5): 78%
Avg Top Score: 0.72
  • Hit Rate: % of queries where top result scored > 0.5
  • Avg Top Score: Average of all top-1 scores

As a rule of thumb:

  • Hit Rate > 80% = Great
  • Hit Rate 60-80% = Acceptable
  • Hit Rate < 60% = Time to tune
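
Both batch metrics fall out of the list of top-1 scores. Here is a minimal sketch with made-up scores. `batch_metrics` and `verdict` are hypothetical helpers that mirror the definitions and the rule of thumb above.

```python
def batch_metrics(top_scores, threshold=0.5):
    """Compute hit rate (in percent) and average top-1 score for a batch."""
    if not top_scores:
        return 0.0, 0.0
    hits = sum(1 for s in top_scores if s > threshold)
    hit_rate = 100 * hits / len(top_scores)
    avg_top = sum(top_scores) / len(top_scores)
    return hit_rate, avg_top

def verdict(hit_rate):
    """Map a hit rate to the rule-of-thumb labels."""
    if hit_rate > 80:
        return "Great"
    if hit_rate >= 60:
        return "Acceptable"
    return "Time to tune"

scores = [0.89, 0.72, 0.45, 0.61, 0.30]  # one top-1 score per query
rate, avg = batch_metrics(scores)
print(f"Hit Rate (>0.5): {rate:.0f}% | Avg Top Score: {avg:.2f} | {verdict(rate)}")
```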

🐛 Troubleshooting

"Ollama not available"

# Is Ollama running?
ollama serve

# Did you pull the models?
ollama pull nomic-embed-text
ollama pull llama3.2:1b

"No chunks"

  • Is your text too short?
  • Is chunk size bigger than your text?
  • Try lowering chunk size to 50

"Search returns garbage"

  • Check if embeddings were created (needs Ollama/OpenAI)
  • Try a different chunking strategy
  • Lower chunk size for more precision

"App looks weird"

# Reset terminal
reset

# Try a different terminal (iTerm2, Warp, etc.)

🎓 Chunking 101 (The Theory)

Why Chunk At All?

LLMs have context limits. Your document is bigger than the limit. So we split it up, find the relevant parts, and only send those.

Your 50-page PDF → Split into 100 chunks → Search → Top 3 sent to LLM → Answer!

The Size-Precision Tradeoff

| Chunk Size (tokens) | Precision | Context | Best For |
|---|---|---|---|
| Small (50-100) | High ✅ | Low ❌ | Specific facts |
| Medium (200-400) | Medium | Medium | General Q&A |
| Large (500-1000) | Low ❌ | High ✅ | Summaries |

The Overlap Question

Overlap = how many tokens chunks share at boundaries.

  • 0% overlap: Chunks are completely separate (risk: losing context at boundaries)
  • 10-20% overlap: Goldilocks zone (recommended)
  • 50% overlap: Lots of redundancy (wastes tokens but very safe)
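
To put numbers on "wastes tokens", here is a back-of-the-envelope sketch of how overlap changes the chunk count. It assumes a plain sliding window and an imaginary 10,000-token document. `chunk_count` is a hypothetical helper, and the real count depends on the strategy you pick.

```python
import math

def chunk_count(total_tokens, chunk_size, overlap_percent):
    """Estimate how many sliding-window chunks a document produces."""
    if not 0 <= overlap_percent < 100:
        raise ValueError("overlap_percent must be in [0, 100)")
    overlap = chunk_size * overlap_percent // 100
    step = chunk_size - overlap  # how far the window advances each time
    if total_tokens <= chunk_size:
        return 1
    return 1 + math.ceil((total_tokens - chunk_size) / step)

for percent in (0, 10, 50):
    n = chunk_count(10_000, 200, percent)
    print(f"{percent}% overlap: {n} chunks, up to {n * 200} tokens to embed")
```

Going from 0% to 50% overlap roughly doubles the number of chunks you embed and store, which is the price of the extra safety.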

📦 Programmatic Usage

Don't want the TUI? Use the library directly:

from rag_tui.core import ChunkingEngine, StrategyType

# Create engine
engine = ChunkingEngine()
engine.set_strategy(StrategyType.SENTENCE)

# Chunk some text
chunks = engine.chunk_text(
    "Your document here...",
    chunk_size=200,
    overlap=20
)

for text, start, end in chunks:
    print(f"[{start}:{end}] {text[:50]}...")

Use Providers Directly

import asyncio

from rag_tui.core.providers import get_provider, ProviderType

async def main():
    # Get Ollama provider
    provider = get_provider(ProviderType.OLLAMA)

    # Check connection
    if await provider.check_connection():
        # Embed text
        embedding = await provider.embed("Hello world")

        # Generate response
        response = await provider.generate("What is RAG?")
        print(response)

# await only works inside a coroutine, so run everything through asyncio
asyncio.run(main())

๐Ÿค Contributing

Found a bug? Have an idea? Want to add support for Claude/Anthropic?

  1. Fork the repo
  2. Create a branch
  3. Make your changes
  4. Submit a PR
  5. Get famous (in our small community)

📜 License

MIT License - Do whatever you want, just don't blame us if your RAG app becomes sentient.


๐Ÿ™ Credits

Built with:

  • Textual - The TUI framework that makes terminals beautiful
  • Chonkie - Token-based chunking
  • Usearch - Blazing fast vector search
  • Ollama - Local LLM inference

💭 Final Words

RAG is hard. Chunking is an art. But with RAG-TUI, at least you can see what you're doing wrong.

Now go forth and chunk responsibly! 🎯


Made with ❤️ and too much ☕ for RAG developers everywhere

"May your chunks be small and your retrieval be accurate."
