RLM-REPL
Recursive Language Model with REPL Inference Strategy
A Python library that enables any language model to manage unlimited context using SQL-based retrieval with DuckDB.
Overview
RLM-REPL implements a human-like reading strategy for processing large documents:
- Overview - Read the beginning to understand document structure
- Search - Find relevant sections using keyword search
- Deep Read - Extract detailed information from located sections
- Synthesize - Combine findings into a comprehensive answer
This approach allows small context window models to effectively work with documents of any size.
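The search and deep-read steps can be illustrated with a small toy function. This is a sketch of the idea only, not the library's internals; the function name and parameters are invented for illustration:

```python
def gather_context(keywords, doc_lines, window=3, max_hits=5):
    """Toy version of the search + deep-read phases: find lines matching
    any keyword, then return a window of surrounding lines for each hit."""
    hits = [i for i, line in enumerate(doc_lines)
            if any(k.lower() in line.lower() for k in keywords)]
    passages = []
    for i in hits[:max_hits]:
        lo, hi = max(0, i - window), min(len(doc_lines), i + window + 1)
        passages.append("\n".join(doc_lines[lo:hi]))
    return passages

doc = ["Chapter 1", "The theme of memory runs throughout.", "More prose."]
print(gather_context(["theme"], doc, window=1))
```

The key point is that the model never sees the whole document at once: it only receives small, relevant windows, which is what lets a small-context model handle arbitrarily large inputs.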
Features
- Two-sided architecture: Separate Data and Inference layers
- In-memory default: Fast DuckDB in-memory database (no setup required)
- Persistent option: Optional persistent database for caching
- CLI tool: Instant testing from command line
- Python API: Full programmatic control
- Streaming events: Real-time progress tracking
- Configurable verbosity: Control output detail level
- OpenAI-compatible: Works with any OpenAI-compatible API
Installation
pip install rlm-repl
Or install from source:
git clone https://github.com/labKnowledge/rlm-repl-sql.git
cd rlm-repl-sql
pip install -e .
Quick Start
CLI Usage
# Interactive mode
rlm-repl document.txt
# With custom model
rlm-repl document.txt --base-url http://localhost:11434/v1 --model qwen3-coder
# Single question mode
rlm-repl document.txt --question "What is the main topic?"
# Quiet mode
rlm-repl document.txt -q --question "Summarize the document"
Python API
from rlm_repl import RLMREPL, RLMConfig
# Configure for Ollama (local)
config = RLMConfig(
    base_url="http://localhost:11434/v1",
    api_key="ollama",
    model="qwen3-coder",
)
# Create REPL and load document
with RLMREPL(config) as repl:
    repl.load_document("large_book.txt")
    result = repl.ask("What are the main themes?")
    print(result.answer)
    print(f"Read {result.total_words} words in {result.elapsed_time:.1f}s")
Documentation
Comprehensive documentation is available in the docs/ directory:
- Getting Started - Installation, setup, and first steps
- API Reference - Complete API documentation
- Configuration - All configuration options
- Examples - Detailed usage examples
- Architecture - How the system works
- Troubleshooting - Common issues and solutions
Supported Models
Any OpenAI-compatible API:
- Ollama (local): llama3, qwen3, mistral, etc.
- OpenAI: gpt-4, gpt-3.5-turbo
- vLLM: Any hosted model
- LMStudio: Local models
- Together AI, Groq, etc.
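Switching providers only means changing the `RLMConfig` fields shown in the Quick Start; the values below are illustrative placeholders, not tested endpoints:

```python
from rlm_repl import RLMConfig

# Hosted OpenAI endpoint (model name and key are placeholders)
openai_config = RLMConfig(
    base_url="https://api.openai.com/v1",
    api_key="sk-...",       # your OpenAI API key
    model="gpt-4",
)

# Local LMStudio server (default LMStudio port; adjust to your setup)
lmstudio_config = RLMConfig(
    base_url="http://localhost:1234/v1",
    api_key="lmstudio",     # local servers usually accept any key
    model="local-model",
)
```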
Examples
See the examples/ directory for complete examples:
- basic_usage.py - Simple document Q&A
- streaming_events.py - Real-time progress tracking
- persistent_database.py - Caching documents
- api_usage.py - Building applications with RLM-REPL
How It Works
- Document Loading: Text is parsed into lines with metadata (headers, code blocks, list items)
- SQL Storage: Lines are stored in DuckDB with indexes for efficient querying
- Reading Strategy: LLM decides what to read using SQL queries
- Iterative Reading: Multiple passes gather relevant information
- Answer Synthesis: Final answer is generated from gathered context
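The storage and search steps can be sketched with a few lines of SQL. The snippet below uses the standard library's sqlite3 so it runs anywhere; RLM-REPL itself uses DuckDB, whose SQL for these statements is essentially the same, and the schema here is an illustrative guess, not the library's actual one:

```python
import sqlite3

# In-memory database standing in for DuckDB
con = sqlite3.connect(":memory:")
con.execute("""
    CREATE TABLE lines (
        line_no   INTEGER PRIMARY KEY,
        text      TEXT,
        is_header INTEGER   -- example metadata flag; real schema may differ
    )
""")

# Document loading: each line stored with simple metadata
doc = ["# Chapter 1", "The theme of memory runs throughout.", "More prose."]
con.executemany(
    "INSERT INTO lines VALUES (?, ?, ?)",
    [(i + 1, t, int(t.startswith("#"))) for i, t in enumerate(doc)],
)

# The search phase issues keyword queries like this one:
rows = con.execute(
    "SELECT line_no, text FROM lines WHERE text LIKE ?", ("%theme%",)
).fetchall()
print(rows)  # [(2, 'The theme of memory runs throughout.')]
```

Because the model only emits (or triggers) short SQL queries rather than arbitrary code, each reading pass is cheap, predictable, and easy to validate.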
Reading Modes
- overview: Read document beginning (lines 1-100)
- search: Find keywords with LIKE '%term%'
- read: Focused reading (20-50 lines)
- deep_read: Detailed analysis (50-100 lines)
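To make the modes concrete, here is the kind of SQL each one might issue against a `lines(line_no, text)` table. These templates are assumptions inferred from the descriptions above, not the library's exact queries:

```python
# Illustrative query templates, one per reading mode.
# Line ranges for read/deep_read are example windows the model would pick.
MODE_QUERIES = {
    "overview":  "SELECT line_no, text FROM lines WHERE line_no BETWEEN 1 AND 100",
    "search":    "SELECT line_no, text FROM lines WHERE text LIKE '%term%'",
    "read":      "SELECT line_no, text FROM lines WHERE line_no BETWEEN 200 AND 240",
    "deep_read": "SELECT line_no, text FROM lines WHERE line_no BETWEEN 300 AND 399",
}
```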
Development
# Install with dev dependencies
pip install -e ".[dev]"
# Run tests
pytest
# Format code
ruff format rlm_repl
# Type checking
mypy rlm_repl
License
MIT License - see LICENSE file for details.
Contributing
Contributions welcome! Please open an issue or submit a pull request.
Background & History
RLM-REPL was created by Remy Gakwaya after reading the MIT paper on Recursive Language Models. The initial implementation attempted to use a REPL approach where the LLM would generate Python functions to process documents. However, this approach proved challenging, especially with smaller language models that struggled to create complex Python functions reliably.
After hundreds of iterations and experiments, Remy developed the RLM-REPL v8 concept - a human-like reading strategy designed specifically to work with small, local language models on limited computational resources. The philosophy was simple: if it could work reliably with small, low-capability models on limited hardware, it would perform exceptionally well when powered by leading LLMs.
The library evolved to use SQL-based retrieval instead of LLM-generated Python functions, leveraging DuckDB for efficient document storage and querying. This approach:
- Works reliably with models of all sizes, from small local models to leading cloud-based LLMs
- Provides a structured, predictable interface (SQL) that even smaller models can handle
- Enables efficient querying with database indexes
- Implements the human-like reading strategy (overview → search → deep read → synthesize) developed in v8
Acknowledgments
Author: Remy Gakwaya
Inspiration: Based on the MIT paper on Recursive Language Models
Innovation: The RLM-REPL v8 concept - human-like reading strategies for LLM document processing - was developed by Remy after extensive experimentation (hundreds of iterations) to create a solution that works reliably with local, smaller language models.
Evolution: This implementation uses SQL-based retrieval instead of LLM-generated Python functions, making it more reliable and accessible for smaller language models while maintaining the proven v8 reading strategy.
File details: rlm_repl-0.1.0.tar.gz (source distribution)
- Size: 1.4 MB
- Uploaded via: twine/6.2.0 CPython/3.10.18
- Uploaded using Trusted Publishing? No

| Algorithm | Hash digest |
|---|---|
| SHA256 | 8d82469c6e38b1b429ab6e580de8f6a201fde80a1f6f0365caf93003d65107a5 |
| MD5 | bba37c8099ba9cdce00eba6812cabc22 |
| BLAKE2b-256 | 82deb26d1b92df3bff7d86ee3145c695760eb92e6b59c566f354365d4dea62d9 |
File details: rlm_repl-0.1.0-py3-none-any.whl (built distribution, Python 3)
- Size: 31.7 kB
- Uploaded via: twine/6.2.0 CPython/3.10.18
- Uploaded using Trusted Publishing? No

| Algorithm | Hash digest |
|---|---|
| SHA256 | c7c084550819f093c8c5662d68789efab636eae26b17ee31a25152f7d46fff9f |
| MD5 | 2ff17de25ad3415eb71e134b6d18e607 |
| BLAKE2b-256 | bcdab21ef141b3a64325138404f51539f025a7bb23f5e6e1a1a8d66ee1da3964 |