Interactive terminal UI for visualizing, debugging, and tuning RAG chunking pipelines
RAG-TUI v0.0.2 Beta
The Terminal App That Makes Chunking Actually Fun
"I used to stare at my RAG pipeline wondering why it sucked. Then I found RAG-TUI and realized my chunks were the size of War and Peace." - A Developer, Probably
What Even Is This?
RAG-TUI is a beautiful terminal-based debugger for your Retrieval-Augmented Generation (RAG) pipelines. It's like having X-ray vision for your text chunking.
You know that feeling when your LLM hallucinates because your retrieval returned garbage? Yeah, this fixes that.
┌─────────────────────────────────────────────────────────────────┐
│                       RAG-TUI v0.0.2 Beta                       │
├─────────────────────────────────────────────────────────────────┤
│  Input   Chunks   Search   Batch   Settings                     │
│                                                                 │
│  Your text, but now with ✨ colors ✨ and metrics               │
│                                                                 │
│  Chunk Size: [◀] [200] [▶]  ━━━━━━━━━━━━━━  tokens              │
│  Overlap:    [◀] [10]  [▶]  ━━━━━━━━━━━━━━  %                   │
│                                                                 │
│  5 chunks                                                       │
└─────────────────────────────────────────────────────────────────┘
Screenshots
[Screenshots: Input Tab, Chunks Tab, Search Tab, Chat Tab]
Why Should I Care?
The Problem
You're building a RAG app. You chunk your documents. You embed them. You search. And then...
User: "What's the company's refund policy?"
LLM: "Based on the context, your refrigerator appears to be running."
The Solution
See exactly how your text is being chunked. Tweak parameters in real-time. Test queries. Export settings. Actually understand what's going on.
Quick Start (30 Seconds, I Promise)
Install
pip install rag-tui
Run
rag-tui
That's It
No really. You're done. Press L to load sample text and start playing.
Features (The Good Stuff)
1. Six Chunking Strategies
Because one size definitely does NOT fit all.
| Strategy | Best For | Vibe |
|---|---|---|
| Token | General text | "I count tokens for breakfast" |
| Sentence | Articles, docs | "Periods are sacred" |
| Paragraph | Structured text | "Double newline gang" |
| Recursive | Code, mixed | "I'll try everything" |
| Fixed | Speed demons | "Just cut every 500 chars lol" |
| Custom | You, apparently | "I know better" (you might!) |
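To make two of these strategies concrete, here is a minimal sketch of what "Fixed" and "Paragraph" chunking could look like. The function names are made up for illustration and this is not RAG-TUI's actual implementation; the `(text, start, end)` tuple shape is borrowed from the programmatic usage example in this README.

```python
def fixed_chunks(text, size=500):
    """Cut every `size` characters, ignoring word and sentence boundaries."""
    return [(text[i:i + size], i, min(i + size, len(text)))
            for i in range(0, len(text), size)]


def paragraph_chunks(text):
    """Split on blank lines, keeping each paragraph's character offsets."""
    chunks, pos = [], 0
    for para in text.split("\n\n"):
        if para.strip():
            chunks.append((para, pos, pos + len(para)))
        pos += len(para) + 2  # skip the paragraph and its "\n\n" separator
    return chunks


doc = "First paragraph about refunds.\n\nSecond paragraph about shipping."
print(fixed_chunks(doc, size=20))   # cuts mid-word, but fast and predictable
print(paragraph_chunks(doc))        # respects the author's own structure
```

Fixed chunking happily slices words in half, which is why it sits in the "speed demons" row. Paragraph chunking keeps meaning intact but gives you chunks of wildly different sizes.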
2. Four LLM Providers
Switch between providers like you switch between tabs (too often).
# Ollama (Free! Local! Private!)
ollama serve
rag-tui
# OpenAI (When you need that GPT juice)
export OPENAI_API_KEY="sk-..."
rag-tui
# Groq (FAST. LIKE, REALLY FAST.)
export GROQ_API_KEY="gsk_..."
rag-tui
# Google Gemini (Free tier FTW)
export GOOGLE_API_KEY="AI..."
rag-tui
3. Load Any File
PDFs? ✅
Markdown? ✅
Python? ✅
That random .txt file from 2019? ✅
Supported: .txt, .md, .py, .js, .json, .yaml, .pdf, and 10 more!
4. Batch Testing
Test 50 queries at once. See which ones fail. Cry. Fix. Repeat.
Batch Test Results
──────────────────
Total Queries: 50
Hit Rate (>0.5): 78%
Avg Top Score: 0.72
You're doing better than average!
(The average is made up, but still, congrats!)
5. Built-in Presets
Don't know what settings to use? We got you.
| Preset | Size | Overlap | For |
|---|---|---|---|
| Q&A Retrieval | 200 | 15% | Chatbots, search |
| Document Summary | 500 | 5% | Long docs |
| Code Analysis | 300 | 20% | Source code |
| Long Context | 800 | 10% | GPT-4-128k users |
| High Precision | 100 | 25% | When you NEED accuracy |
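The presets give overlap as a percentage, while many splitter libraries want an absolute overlap. Here is a rough sketch of the conversion. The `PRESETS` dict and `overlap_units` helper are hypothetical names, and integer rounding down is an assumption, not confirmed RAG-TUI behavior. The export example in this README pairs `chunk_size=800` with `chunk_overlap=80`, which is at least consistent with Long Context's 10%.

```python
# Preset values copied from the table above: (chunk size, overlap percent).
PRESETS = {
    "Q&A Retrieval": (200, 15),
    "Document Summary": (500, 5),
    "Code Analysis": (300, 20),
    "Long Context": (800, 10),
    "High Precision": (100, 25),
}


def overlap_units(size, overlap_percent):
    """Convert a percentage overlap into the same unit as `size` (rounds down)."""
    return size * overlap_percent // 100


for name, (size, pct) in PRESETS.items():
    print(f"{name}: chunk_size={size}, chunk_overlap={overlap_units(size, pct)}")
```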
6. Export Settings
Take your carefully tuned settings and use them in production.
# LangChain Export
from langchain.text_splitter import RecursiveCharacterTextSplitter
splitter = RecursiveCharacterTextSplitter(
    chunk_size=800,
    chunk_overlap=80,
)

# LlamaIndex Export
from llama_index.core.node_parser import SentenceSplitter
parser = SentenceSplitter(
    chunk_size=800,
    chunk_overlap=80,
)
How To Use It
The Interface
┌─────────────────────────────────────────────────────────────────┐
│ Strategy: [Token ▼]  │  File: [path...]  │  [Load]              │
├─────────────────────────────────────────────────────────────────┤
│ Input │ Chunks │ Search │ Batch │ Settings │ Chat               │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│                       (Your content here)                       │
│                                                                 │
├─────────────────────────────────────────────────────────────────┤
│ Chunk Size: [◀] [200] [▶]  │  Overlap: [◀] [10] [▶] %           │
│ 5 chunks                                                        │
└─────────────────────────────────────────────────────────────────┘
The Tabs
| Tab | What It Does | When To Use It |
|---|---|---|
| Input | Paste or load your document | First |
| Chunks | See colorful chunk cards | To see the magic |
| Search | Query and see what comes back | Testing retrieval |
| Batch | Test many queries at once | Before production |
| Settings | Export config, custom code | When you're done |
| Chat | Talk to your chunks | For fun |
Keyboard Shortcuts
| Key | Action | Pro Tip |
|---|---|---|
| Q | Quit | When you're done procrastinating |
| L | Load sample | Start here if confused |
| R | Rechunk | After changing params |
| D | Dark/Light mode | We default to dark (obviously) |
| E | Export config | Save your precious settings |
| F1 | Help | When this README isn't enough |
| Tab | Next tab | Navigate like a pro |
The Workflow (AKA What To Actually Do)
Step 1: Load Your Document
Either:
- Type/paste in the Input tab
- Enter a file path and click Load
- Press L for sample text (recommended for newbies)
Step 2: Pick Your Strategy
Use the dropdown at the top. If unsure:
- Text/articles? → Sentence
- Code? → Recursive
- Don't know? → Token (it's the safe choice)
Step 3: Adjust Parameters
In the Chunks tab, use the sliders:
- Chunk Size: How big each chunk should be (in tokens)
- Overlap: How much chunks should share (prevents context loss)
The Golden Rule: Smaller chunks = more precise, less context. Bigger chunks = more context, less precise.
Step 4: Test Your Queries
Go to the Search tab:
- Type a question
- Click Search
- See what chunks come back
- Cry or celebrate accordingly
Step 5: Batch Test (The Pro Move)
Go to the Batch tab:
- Enter multiple queries (one per line)
- Click Run Batch Test
- See your hit rate
- Adjust until it's good enough™
Step 6: Export
Go to the Settings tab:
- Click JSON, LangChain, or LlamaIndex
- Copy the generated code
- Paste in your project
- Deploy
- Profit???
Custom Chunking (For The Brave)
Don't like our strategies? Roll your own!
Go to the Settings tab, paste a function like:
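def chunk_by_headers(text, chunk_size, overlap):
    """Split on markdown headers; return (chunk, start, end) tuples."""
    import re
    # Cut points: every newline that is followed by a level 1-3 header
    cuts = [m.start() for m in re.finditer(r'\n(?=#{1,3} )', text)]
    # Pair each section's start offset with its end offset in the original text
    bounds = zip([0] + [c + 1 for c in cuts], cuts + [len(text)])
    return [(text[a:b], a, b) for a, b in bounds if text[a:b].strip()]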
Click Apply Custom Chunker and watch the magic.
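Before pasting a custom chunker into the app, it can help to sanity-check it in a plain Python session. The sketch below assumes a chunker takes `(text, chunk_size, overlap)` and returns `(text, start, end)` tuples where `start` and `end` are character offsets into the original text. That offset interpretation is an assumption (this README doesn't say how the app uses them), and both `chunk_by_line` and `check_chunker` are hypothetical helpers, not part of RAG-TUI.

```python
import re


def chunk_by_line(text, chunk_size, overlap):
    """Hypothetical custom chunker: one chunk per non-empty line."""
    return [(m.group(), m.start(), m.end())
            for m in re.finditer(r"[^\n]+", text)]


def check_chunker(chunker, text, chunk_size=200, overlap=10):
    """Return a list of problems in the chunker's output (empty list = OK)."""
    problems = []
    for i, item in enumerate(chunker(text, chunk_size, overlap)):
        if len(item) != 3:
            problems.append(f"chunk {i}: expected a (text, start, end) tuple")
            continue
        chunk, start, end = item
        if not chunk.strip():
            problems.append(f"chunk {i}: empty or whitespace-only")
        if text[start:end] != chunk:
            problems.append(f"chunk {i}: offsets do not match the text")
    return problems


sample = "# Title\nIntro.\n\n## Section\nBody text."
print(check_chunker(chunk_by_line, sample))
```

An empty list means every chunk is non-empty and its offsets point back at the right slice of the document.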
Provider Setup
Ollama (Recommended for Privacy)
# Install Ollama
brew install ollama # macOS
# or download from ollama.ai
# Pull required models
ollama pull nomic-embed-text # For embeddings
ollama pull llama3.2:1b # For chat (small & fast)
# Start the server
ollama serve
OpenAI
export OPENAI_API_KEY="sk-your-key-here"
Uses: text-embedding-3-small + gpt-4o-mini
Groq (Free Tier!)
export GROQ_API_KEY="gsk_your-key-here"
Uses: llama-3.1-8b-instant (NO embeddings - pair with Ollama)
Google Gemini (Also Free Tier!)
export GOOGLE_API_KEY="your-key-here"
Uses: text-embedding-004 + gemini-1.5-flash
Understanding The Metrics
In Chunks Tab
5 chunks | Avg: 180 chars | Total: 900 chars | ~225 tokens
- 5 chunks: Your document was split into 5 pieces
- Avg: 180 chars: Chunks average ~180 characters each
- Total: 900 chars: Your whole document size
- ~225 tokens: Estimated token count (chars ÷ 4)
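For the curious, the status line can be reproduced in a few lines. This is a sketch under the definitions above (total = document length, tokens ≈ chars ÷ 4); `chunk_stats` is a made-up helper, and RAG-TUI may compute its numbers differently.

```python
def chunk_stats(text, chunks):
    """Summarize a document and its chunk texts, mirroring the status line."""
    total_chars = len(text)
    avg_chars = sum(len(c) for c in chunks) // len(chunks) if chunks else 0
    return {
        "chunks": len(chunks),
        "avg_chars": avg_chars,
        "total_chars": total_chars,
        "est_tokens": total_chars // 4,  # rough heuristic: ~4 characters per token
    }


text = "x" * 900
chunks = [text[i:i + 180] for i in range(0, len(text), 180)]  # 5 chunks, no overlap
stats = chunk_stats(text, chunks)
print(f"{stats['chunks']} chunks | Avg: {stats['avg_chars']} chars | "
      f"Total: {stats['total_chars']} chars | ~{stats['est_tokens']} tokens")
```

With this 900-character input it prints `5 chunks | Avg: 180 chars | Total: 900 chars | ~225 tokens`. The ÷ 4 rule is only a ballpark for English prose; code and non-English text can tokenize quite differently.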
In Search Tab
#1 █████████░ 0.89 "The refund policy states..."
#2 ███████░░░ 0.72 "For returns within 30 days..."
#3 ████░░░░░░ 0.45 "Our customer service team..."
- #1, #2, #3: Ranking by relevance
- █████████░: Visual similarity bar
- 0.89: Cosine similarity score (0-1, higher = better)
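If "cosine similarity" is just a phrase you nod along to, here is the formula as a small sketch, plus a toy version of the similarity bar. The `bar` helper and its fill characters are illustrative guesses, not RAG-TUI's rendering code. Strictly speaking, cosine similarity ranges from -1 to 1; scores between text embeddings usually land in the positive range, which is why 0-1 is the practical scale.

```python
import math


def cosine_similarity(a, b):
    """Dot product of a and b divided by the product of their lengths."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def bar(score, width=10):
    """Render a score in [0, 1] as a fixed-width bar."""
    filled = round(max(0.0, min(1.0, score)) * width)
    return "█" * filled + "░" * (width - filled)


query_vec = [0.2, 0.7, 0.1]   # toy 3-dimensional "embeddings"
chunk_vec = [0.25, 0.6, 0.3]
score = cosine_similarity(query_vec, chunk_vec)
print(f"{bar(score)} {score:.2f}")
```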
In Batch Tab
Hit Rate (>0.5): 78%
Avg Top Score: 0.72
- Hit Rate: % of queries where top result scored > 0.5
- Avg Top Score: Average of all top-1 scores
As a rule of thumb:
- Hit Rate > 80% = Great
- Hit Rate 60-80% = Acceptable
- Hit Rate < 60% = Time to tune
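Both batch metrics follow directly from the definitions above. A minimal sketch, assuming you already have the top-1 score for each query; `batch_metrics` and `verdict` are hypothetical names, and treating exactly 60% and 80% as "Acceptable" is a choice the rule of thumb leaves open.

```python
def batch_metrics(top_scores, threshold=0.5):
    """Compute hit rate and average top score from per-query top-1 scores."""
    if not top_scores:
        return {"hit_rate": 0.0, "avg_top_score": 0.0}
    hits = sum(1 for s in top_scores if s > threshold)
    return {
        "hit_rate": hits / len(top_scores),
        "avg_top_score": sum(top_scores) / len(top_scores),
    }


def verdict(hit_rate):
    """Map a hit rate (0.0-1.0) onto the rule of thumb above."""
    if hit_rate > 0.8:
        return "Great"
    if hit_rate >= 0.6:
        return "Acceptable"
    return "Time to tune"


scores = [0.91, 0.72, 0.45, 0.66, 0.38]  # made-up top-1 scores for five queries
m = batch_metrics(scores)
print(f"Hit Rate (>0.5): {m['hit_rate']:.0%}  Avg Top Score: {m['avg_top_score']:.2f}")
print(verdict(m["hit_rate"]))
```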
Troubleshooting
"Ollama not available"
# Is Ollama running?
ollama serve
# Did you pull the models?
ollama pull nomic-embed-text
ollama pull llama3.2:1b
"No chunks"
- Is your text too short?
- Is chunk size bigger than your text?
- Try lowering chunk size to 50
"Search returns garbage"
- Check if embeddings were created (needs a provider with embeddings: Ollama, OpenAI, or Gemini; Groq has none)
- Try a different chunking strategy
- Lower chunk size for more precision
"App looks weird"
# Reset terminal
reset
# Try a different terminal (iTerm2, Warp, etc.)
Chunking 101 (The Theory)
Why Chunk At All?
LLMs have context limits. Your document is bigger than the limit. So we split it up, find the relevant parts, and only send those.
Your 50-page PDF → Split into 100 chunks → Search → Top 3 sent to LLM → Answer!
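The flow above (split, search, send only the best chunks) fits in a toy sketch. To stay dependency-free, it scores chunks by word overlap (Jaccard similarity) as a stand-in for real embedding search, so treat it as an illustration of the shape of the pipeline, not of how RAG-TUI ranks results. All names here are made up.

```python
def words(s):
    """Lowercased set of whitespace-separated words."""
    return set(s.lower().split())


def top_k(query, chunks, k=3):
    """Rank chunks by word overlap with the query (stand-in for embeddings)."""
    q = words(query)
    scored = []
    for chunk in chunks:
        c = words(chunk)
        union = q | c
        score = len(q & c) / len(union) if union else 0.0
        scored.append((score, chunk))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


chunks = [
    "Refunds are issued within 30 days of purchase.",
    "Our office is closed on public holidays.",
    "Shipping takes five business days.",
    "Contact support to start a refund request.",
]
context = [chunk for _, chunk in top_k("how do refunds work", chunks, k=2)]
prompt = "Answer using only this context:\n" + "\n".join(context)
print(prompt)
```

Notice that the "refund request" chunk scores zero because "refund" and "refunds" are different words to this scorer. That kind of miss is exactly what embeddings are for.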
The Size-Precision Tradeoff
| Chunk Size | Precision | Context | Best For |
|---|---|---|---|
| Small (50-100) | High | Low | Specific facts |
| Medium (200-400) | Medium | Medium | General Q&A |
| Large (500-1000) | Low | High | Summaries |
The Overlap Question
Overlap = how many tokens neighboring chunks share at their boundaries, set in the app as a percentage of the chunk size.
- 0% overlap: Chunks are completely separate (risk: losing context at boundaries)
- 10-20% overlap: Goldilocks zone (recommended)
- 50% overlap: Lots of redundancy (wastes tokens but very safe)
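To see what overlap actually costs, here is a sliding-window sketch over a list of tokens. It assumes overlap is a percentage of the chunk size, so each new chunk starts `chunk_size - overlap` tokens after the previous one. `sliding_chunks` is a hypothetical helper, not RAG-TUI's chunker.

```python
def sliding_chunks(tokens, chunk_size, overlap_percent):
    """Split tokens into windows that share `overlap_percent` of their length."""
    overlap = chunk_size * overlap_percent // 100
    step = max(1, chunk_size - overlap)  # guard against a zero or negative step
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this window already reached the end of the document
    return chunks


tokens = [f"t{i}" for i in range(25)]
for pct in (0, 20, 50):
    chunks = sliding_chunks(tokens, chunk_size=10, overlap_percent=pct)
    stored = sum(len(c) for c in chunks)
    print(f"{pct}% overlap: {len(chunks)} chunks, {stored} tokens stored")
```

For this 25-token input, 0% overlap stores 25 tokens, 20% stores 29, and 50% stores 40. That growth is the "wastes tokens" part of the tradeoff.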
Programmatic Usage
Don't want the TUI? Use the library directly:
from rag_tui.core import ChunkingEngine, StrategyType
# Create engine
engine = ChunkingEngine()
engine.set_strategy(StrategyType.SENTENCE)
# Chunk some text
chunks = engine.chunk_text(
    "Your document here...",
    chunk_size=200,
    overlap=20
)
# Each chunk is a (text, start, end) tuple
for text, start, end in chunks:
    print(f"[{start}:{end}] {text[:50]}...")
Use Providers Directly
import asyncio
from rag_tui.core.providers import get_provider, ProviderType

async def main():
    # Get Ollama provider
    provider = get_provider(ProviderType.OLLAMA)
    # Check connection
    if await provider.check_connection():
        # Embed text
        embedding = await provider.embed("Hello world")
        # Generate response
        response = await provider.generate("What is RAG?")

# The provider methods are awaited, so they must run inside an event loop
asyncio.run(main())
Contributing
Found a bug? Have an idea? Want to add support for Claude/Anthropic?
- Fork the repo
- Create a branch
- Make your changes
- Submit a PR
- Get famous (in our small community)
License
MIT License - Do whatever you want, just don't blame us if your RAG app becomes sentient.
Credits
Built with:
- Textual - The TUI framework that makes terminals beautiful
- Chonkie - Token-based chunking
- Usearch - Blazing fast vector search
- Ollama - Local LLM inference
Final Words
RAG is hard. Chunking is an art. But with RAG-TUI, at least you can see what you're doing wrong.
Now go forth and chunk responsibly!
Made with ❤️ and too much ☕ for RAG developers everywhere
"May your chunks be small and your retrieval be accurate."