A visual sandbox for experimenting with RAG configurations

Unravel

A visual sandbox for experimenting with RAG (Retrieval-Augmented Generation) configurations.

Overview

Unravel helps developers understand and optimize their RAG pipelines through interactive visualizations. Experiment with document parsing, chunking strategies, embedding models, and retrieval configurations, all running locally on your machine.

Installation

From PyPI

# Using pip (the distribution is published on PyPI as unravel-rag)
pip install unravel-rag

# Or using uv (faster)
uv pip install unravel-rag

From Source

git clone https://github.com/unravel/unravel.git
cd unravel

# Create virtual environment
uv venv

# Activate virtual environment
# Windows
.venv\Scripts\activate
# macOS/Linux
source .venv/bin/activate

# Install in editable mode
uv sync

Development Dependencies

For contributing to the project, install development tools:

uv sync --all-extras

Note: This project uses uv for fast, reliable Python package management. To install uv on Windows, run powershell -c "irm https://astral.sh/uv/install.ps1 | iex"; for other platforms, see the uv installation docs.

Usage

Run the CLI command to launch the Streamlit app:

unravel

This opens the app in your browser at http://localhost:8501.

API Key Setup

To use LLM features (query generation with OpenAI, Anthropic, etc.), configure your API keys:

1. Run unravel once to create the configuration directory.
2. Navigate to ~/.unravel/ (or %USERPROFILE%\.unravel\ on Windows).
3. Edit the .env file and add your API key:

# For OpenAI
OPENAI_API_KEY=sk-your-key-here

# For Anthropic
ANTHROPIC_API_KEY=sk-ant-your-key-here

4. Save the file and refresh the app.

Note: API keys are kept in the local .env file and are never written to session files. For local models (Ollama, LM Studio), no API key is required.
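
To check which keys are set before launching the app, a minimal sketch like the one below can parse the .env file. It assumes the file contains plain KEY=value lines with # comments, as in the example above. The helpers parse_env and configured_providers are hypothetical names for illustration and are not part of Unravel.

```python
from pathlib import Path


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=value lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def configured_providers(env: dict[str, str]) -> list[str]:
    """Return the providers whose API key variable is present and non-empty."""
    names = {"OPENAI_API_KEY": "OpenAI", "ANTHROPIC_API_KEY": "Anthropic"}
    return [label for var, label in names.items() if env.get(var)]


if __name__ == "__main__":
    # Location taken from the setup steps above.
    env_path = Path.home() / ".unravel" / ".env"
    if env_path.exists():
        print(configured_providers(parse_env(env_path.read_text())))
    else:
        print(f"{env_path} not found; run unravel once to create it")
```

An empty list means no hosted provider is configured, which is fine when only local models are used.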

Running as a Python Module

uv run python -m unravel

Features

Unravel guides you through a structured 5-step pipeline for building and testing RAG systems:

Step 1: Document Upload 📄

Advanced multi-format document ingestion with file upload and URL scraping:

📁 File Upload: PDF, DOCX, PPTX, XLSX, HTML, Markdown, TXT, and Images (PNG, JPG, BMP, TIFF)
🌐 URL Scraping: Extract content from web pages with JavaScript rendering support
- Crawl Modes: Single page scraping or multi-page crawling
- Discovery Methods: Crawler (follows internal links), Sitemap, or Feeds (RSS/Atom)
- Extraction Modes: Balanced, Favor Precision, or Favor Recall
- Output Formats: Markdown, TXT, CSV, JSON, HTML, XML, XML-TEI
- Metadata Extraction: Author, publication date, description, tags/categories, site name
- Advanced Options: robots.txt compliance, language filtering, content options (links, images, tables, formatting)
🔍 Advanced Parsing: OCR support for scanned documents and images
🏗️ Structure Preservation: Intelligent table structure extraction and hierarchical layout understanding
🎯 Content Filtering: Selective extraction by element type (headers, footers, code blocks, etc.)
⚙️ Configurable Processing: Thread settings and OCR options for optimal performance
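
As an illustration of the robots.txt compliance option, the sketch below filters candidate URLs using Python's standard urllib.robotparser. How Unravel performs this check internally is not documented here, so treat this as one possible approach under that assumption. The allowed_urls helper and the example.com URLs are hypothetical.

```python
from urllib.robotparser import RobotFileParser


def allowed_urls(robots_txt: str, urls: list[str], user_agent: str = "*") -> list[str]:
    """Keep only the URLs that the given robots.txt rules permit."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if parser.can_fetch(user_agent, url)]


# Example rules: everything under /private/ is off limits to all crawlers.
ROBOTS = """\
User-agent: *
Disallow: /private/
"""

urls = [
    "https://example.com/docs/intro",
    "https://example.com/private/notes",
]
print(allowed_urls(ROBOTS, urls))  # ['https://example.com/docs/intro']
```

In a real crawl the rules would be fetched from the site's /robots.txt before any page is requested.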

Step 2: Chunk Visualization ✂️

Flexible text splitting with transparent, visual chunk inspection:

🔀 Two Chunking Strategies:
- Hierarchical Chunker: One chunk per document element, preserving semantic structure
- Hybrid Chunker: Token-aware splitting with configurable limits and overlap for consistency
🎚️ Token-Aware Configuration: Set maximum chunk size (default: 512 tokens) and overlap percentage
🏷️ Rich Metadata: Attach configurable metadata to each chunk (section hierarchy, element type, token count, heading text, page numbers)
🎴 Visual Chunk Cards: Display full text with section breadcrumbs, metadata badges, and overlap highlighting
👁️ Context Preview: Expandable previews showing how chunks overlap with neighbors for seamless retrieval
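
The interaction between maximum chunk size and overlap percentage can be sketched with a simple sliding window. This is not the Hybrid Chunker itself: it assumes pre-split tokens (whitespace words stand in for model tokens) and ignores document structure. The chunk_tokens function and its parameters are hypothetical.

```python
def chunk_tokens(
    tokens: list[str], max_tokens: int = 512, overlap_pct: float = 0.1
) -> list[list[str]]:
    """Split tokens into windows of at most max_tokens with fractional overlap."""
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    if not 0 <= overlap_pct < 1:
        raise ValueError("overlap_pct must be in [0, 1)")

    overlap = int(max_tokens * overlap_pct)  # tokens shared with the previous chunk
    step = max_tokens - overlap              # always >= 1 because overlap_pct < 1

    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + max_tokens])
        if start + max_tokens >= len(tokens):
            break  # the last window already reaches the end of the text
    return chunks


words = "one two three four five six seven eight nine ten".split()
for chunk in chunk_tokens(words, max_tokens=4, overlap_pct=0.5):
    print(chunk)
```

With max_tokens=4 and 50% overlap, each chunk repeats the last two tokens of its predecessor, which is the region the overlap highlighting would mark.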

Step 3: Embedding Explorer 🧭

Visualize and analyze your document embeddings:

🤖 10+ Embedding Models: Choose from sentence-transformers models (all-MiniLM, all-mpnet, paraphrase variants) as well as multilingual, QA-optimized, and BGE embedding models
⚡ Fast Startup: Lazy model loading ensures snappy UI responsiveness
🚀 GPU Acceleration: Automatic CUDA detection and acceleration when available
📊 3D UMAP Visualization: Interactive 3D scatter plot showing embedding space with cluster analysis
🎨 Color Coding Options: Visualize by KMeans clustering to identify semantic groupings
🔴 Outlier Detection: Identify and analyze outlier chunks in the embedding space
🔬 Detailed Inspection: Hover over points to preview chunks, click to view full details
🔎 Semantic Search: Test similarity search within embeddings with visual query point projection
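
Semantic search over embeddings comes down to ranking chunks by vector similarity to the query. The sketch below uses cosine similarity on toy three-dimensional vectors in pure Python. Real embeddings come from the selected model and have hundreds of dimensions, and the chunk IDs and vectors here are made up for illustration.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(
    query: list[float], chunk_vectors: dict[str, list[float]], k: int = 3
) -> list[tuple[str, float]]:
    """Return the k chunk IDs most similar to the query, best first."""
    scored = [(cid, cosine(query, vec)) for cid, vec in chunk_vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy embeddings (hypothetical values).
vectors = {
    "intro": [1.0, 0.0, 0.0],
    "install": [0.0, 1.0, 0.0],
    "usage": [0.9, 0.1, 0.0],
}
print(top_k([1.0, 0.0, 0.0], vectors, k=2))
```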

Step 4: Query Testing 🔍

End-to-end RAG testing with multiple retrieval strategies:

🎯 Three Retrieval Methods:
- Dense (Qdrant): Vector similarity search using embeddings for semantic matching
- Sparse (BM25): Keyword-based search for exact term matching
- Hybrid: Combines dense and sparse with configurable fusion methods (weighted sum or reciprocal rank fusion)
🔄 Query Expansion: Generate multiple query variations with an LLM to improve retrieval coverage; results from the variations are merged with Reciprocal Rank Fusion
⚙️ Retrieval Configuration: Adjust Top K results and minimum similarity score thresholds (strategy-specific defaults)
📈 Reranking (Optional): Cross-encoder reranking to re-score and improve retrieval relevance
🤖 LLM Answer Generation: Integrate OpenAI, Anthropic, or local models (Ollama, LM Studio) to generate answers from retrieved chunks
🎛️ Flexible Configuration: Customize temperature, max tokens, system prompts, and API keys
📋 Detailed Results: View ranked chunks with similarity scores, source locations, and generated answers with full transparency
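
Reciprocal rank fusion, mentioned above for both hybrid retrieval and query expansion, merges several ranked lists by summing 1 / (k + rank) for each item. The sketch below shows the general technique. The constant k=60 is a commonly used default and an assumption here, not a documented Unravel setting, and the chunk IDs are hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(
    rankings: list[list[str]], k: int = 60
) -> list[tuple[str, float]]:
    """Fuse ranked ID lists; items ranked high in several lists score best."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


dense_hits = ["c3", "c1", "c7"]   # e.g. from vector similarity search
sparse_hits = ["c1", "c9", "c3"]  # e.g. from BM25 keyword search

for chunk_id, score in reciprocal_rank_fusion([dense_hits, sparse_hits]):
    print(chunk_id, round(score, 5))
```

Because only ranks are used, the method needs no score normalization between dense and sparse results, unlike a weighted sum.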

Step 5: Code Export 💾

Generate production-ready Python code capturing your exact configuration:

📝 Complete Implementation Code: Exports working Python snippets for parsing, chunking, embedding, retrieval, and reranking
🎯 Exact Configuration Preservation: Every parameter and choice is captured in the generated code
📦 Dependencies Management: Complete requirements.txt with all necessary libraries and pinned versions
✂️ Copy-Paste Ready: Integration-ready code for immediate deployment to your application
✨ Supports All Features: Includes code for parsing options, chunking strategies, embedding models, retrieval methods, and LLM configuration

Storage

All data is stored locally in ~/.unravel/:

~/.unravel/
├── documents/          # Uploaded raw documents
├── chunks/             # Processed chunk data
├── embeddings/         # Cached embeddings
├── indices/            # Vector indices (Qdrant storage)
├── session_state.json  # UI state persistence
├── llm_config.json     # LLM configuration
└── rag_config.json     # RAG pipeline settings

No data is transmitted to external servers except when you use a hosted LLM API (for example, for query expansion or answer generation).
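
To inspect this directory from a script, a small sketch like the following reports which of the entries listed above exist and loads one of the JSON files. The helper names are hypothetical, and the structure of the JSON files is not documented here, so the loader returns whatever it finds without assuming any keys.

```python
import json
from pathlib import Path

# Entry names taken from the directory layout above.
EXPECTED = [
    "documents",
    "chunks",
    "embeddings",
    "indices",
    "session_state.json",
    "llm_config.json",
    "rag_config.json",
]


def storage_report(base: Path) -> dict[str, bool]:
    """Map each expected entry to whether it currently exists under base."""
    return {name: (base / name).exists() for name in EXPECTED}


def load_json_config(base: Path, name: str) -> dict:
    """Load a JSON file from the storage directory, or {} if it is missing."""
    path = base / name
    if not path.exists():
        return {}
    return json.loads(path.read_text(encoding="utf-8"))


if __name__ == "__main__":
    base = Path.home() / ".unravel"
    for name, present in storage_report(base).items():
        print(f"{name:20} {'ok' if present else 'missing'}")
    print(load_json_config(base, "rag_config.json"))
```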

Development

Setup

# Clone the repository
git clone https://github.com/unravel/unravel.git
cd unravel

# Create virtual environment
uv venv

# Activate virtual environment (Windows)
.venv\Scripts\activate
# Or macOS/Linux: source .venv/bin/activate

# Install in development mode with all dependencies
uv sync --all-extras

Running Tests

uv run pytest

Code Formatting

uv run black unravel
uv run ruff check unravel

Contributing

Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.

Development Guidelines

See CLAUDE.md for coding standards and development philosophy.

License

MIT License - see LICENSE file for details.

Version 0.1.0, released Feb 13, 2026.

Download files

Source Distribution

unravel_rag-0.1.0.tar.gz (717.7 kB), uploaded Feb 13, 2026, source

Built Distribution

unravel_rag-0.1.0-py3-none-any.whl (130.1 kB), uploaded Feb 13, 2026, Python 3

File details

Details for the file unravel_rag-0.1.0.tar.gz.

File metadata

Download URL: unravel_rag-0.1.0.tar.gz
Upload date: Feb 13, 2026
Size: 717.7 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.9

File hashes

Hashes for unravel_rag-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`c8762d66cbc73abcab7720ee8f1ea232658300b6d51686fd9f958b938b1ea481`
MD5	`d2d179f34f5b3e7dfabddded494ae6df`
BLAKE2b-256	`895ba493cce4988f392876774080c44af1b0816d337599403eac9c94788fcbb0`

File details

Details for the file unravel_rag-0.1.0-py3-none-any.whl.

File metadata

Download URL: unravel_rag-0.1.0-py3-none-any.whl
Upload date: Feb 13, 2026
Size: 130.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.9

File hashes

Hashes for unravel_rag-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`891ba87c80980e9df9eeb902650eae25a9c6f726e879329a1c59e0a27a2b0518`
MD5	`2621cd29425181930d6f9b3d074c9b2c`
BLAKE2b-256	`4f81fb20a9db86f0ec2d4116945a08e9c81e68ed514261585bf230085ad768b1`
