Recursive Language Model with REPL Inference Strategy - Unlimited context management using SQL-based retrieval

These details have not been verified by PyPI

Project links

Project description

RLM-REPL

Recursive Language Model with REPL Inference Strategy

A Python library that enables any language model to manage unlimited context using SQL-based retrieval with DuckDB.

Overview

RLM-REPL implements a human-like reading strategy for processing large documents:

Overview - Read the beginning to understand document structure
Search - Find relevant sections using keyword search
Deep Read - Extract detailed information from located sections
Synthesize - Combine findings into a comprehensive answer

This approach allows small context window models to effectively work with documents of any size.

Features

Two-sided architecture: Separate Data and Inference layers
In-memory default: Fast DuckDB in-memory database (no setup required)
Persistent option: Optional persistent database for caching
CLI tool: Instant testing from command line
Python API: Full programmatic control
Streaming events: Real-time progress tracking
Configurable verbosity: Control output detail level
OpenAI-compatible: Works with any OpenAI-compatible API

Installation

pip install rlm-repl

Or install from source:

git clone https://github.com/labKnowledge/rlm-repl-sql.git
cd rlm-repl-sql
pip install -e .

Quick Start

CLI Usage

# Interactive mode
rlm-repl document.txt

# With custom model
rlm-repl document.txt --base-url http://localhost:11434/v1 --model qwen3-coder

# Single question mode
rlm-repl document.txt --question "What is the main topic?"

# Quiet mode
rlm-repl document.txt -q --question "Summarize the document"

Python API

from rlm_repl import RLMREPL, RLMConfig

# Configure for Ollama (local)
config = RLMConfig(
    base_url="http://localhost:11434/v1",
    api_key="ollama",
    model="qwen3-coder",
)

# Create REPL and load document
with RLMREPL(config) as repl:
    repl.load_document("large_book.txt")
    
    result = repl.ask("What are the main themes?")
    print(result.answer)
    print(f"Read {result.total_words} words in {result.elapsed_time:.1f}s")

Documentation

Comprehensive documentation is available in the docs/ directory:

Getting Started - Installation, setup, and first steps
API Reference - Complete API documentation
Configuration - All configuration options
Examples - Detailed usage examples
Architecture - How the system works
Troubleshooting - Common issues and solutions

Supported Models

Any OpenAI-compatible API:

Ollama (local): llama3, qwen3, mistral, etc.
OpenAI: gpt-4, gpt-3.5-turbo
vLLM: Any hosted model
LMStudio: Local models
Together AI, Groq, etc.

Examples

See the examples/ directory for complete examples:

basic_usage.py - Simple document Q&A
streaming_events.py - Real-time progress tracking
persistent_database.py - Caching documents
api_usage.py - Building applications with RLM-REPL

How It Works

Document Loading: Text is parsed into lines with metadata (headers, code blocks, list items)
SQL Storage: Lines are stored in DuckDB with indexes for efficient querying
Reading Strategy: LLM decides what to read using SQL queries
Iterative Reading: Multiple passes gather relevant information
Answer Synthesis: Final answer is generated from gathered context

Reading Modes

overview: Read document beginning (lines 1-100)
search: Find keywords with LIKE '%term%'
read: Focused reading (20-50 lines)
deep_read: Detailed analysis (50-100 lines)

Development

# Install with dev dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Format code
ruff format rlm_repl

# Type checking
mypy rlm_repl

License

MIT License - see LICENSE file for details.

Contributing

Contributions welcome! Please open an issue or submit a pull request.

Background & History

RLM-REPL was created by Remy Gakwaya after reading the MIT paper on Recursive Language Models. The initial implementation attempted to use a REPL approach where the LLM would generate Python functions to process documents. However, this approach proved challenging, especially with smaller language models that struggled to create complex Python functions reliably.

After hundreds of iterations and experiments, Remy developed the RLM-REPL v8 concept - a human-like reading strategy specifically designed to work with local, smaller language models on limited computational resources. The philosophy was simple: if it can work reliably with poor and small models in limited computation, it would perform exceptionally well when powered by leading LLMs.

The library evolved to use SQL-based retrieval instead of LLM-generated Python functions, leveraging DuckDB for efficient document storage and querying. This approach:

Works reliably with models of all sizes, from small local models to leading cloud-based LLMs
Provides a structured, predictable interface (SQL) that even smaller models can handle
Enables efficient querying with database indexes
Implements the human-like reading strategy (overview → search → deep read → synthesize) developed in v8

Acknowledgments

Author: Remy Gakwaya

Inspiration: Based on the MIT paper on Recursive Language Models

Innovation: The RLM-REPL v8 concept - human-like reading strategies for LLM document processing - was developed by Remy after extensive experimentation (hundreds of iterations) to create a solution that works reliably with local, smaller language models.

Evolution: This implementation uses SQL-based retrieval instead of LLM-generated Python functions, making it more reliable and accessible for smaller language models while maintaining the proven v8 reading strategy.

Project details

These details have not been verified by PyPI

Project links

Release history Release notifications | RSS feed

This version

0.1.0

Jan 29, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

rlm_repl-0.1.0.tar.gz (1.4 MB view details)

Uploaded Jan 29, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

rlm_repl-0.1.0-py3-none-any.whl (31.7 kB view details)

Uploaded Jan 29, 2026 Python 3

File details

Details for the file rlm_repl-0.1.0.tar.gz.

File metadata

Download URL: rlm_repl-0.1.0.tar.gz
Upload date: Jan 29, 2026
Size: 1.4 MB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.10.18

File hashes

Hashes for rlm_repl-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`8d82469c6e38b1b429ab6e580de8f6a201fde80a1f6f0365caf93003d65107a5`
MD5	`bba37c8099ba9cdce00eba6812cabc22`
BLAKE2b-256	`82deb26d1b92df3bff7d86ee3145c695760eb92e6b59c566f354365d4dea62d9`

See more details on using hashes here.

File details

Details for the file rlm_repl-0.1.0-py3-none-any.whl.

File metadata

Download URL: rlm_repl-0.1.0-py3-none-any.whl
Upload date: Jan 29, 2026
Size: 31.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.10.18

File hashes

Hashes for rlm_repl-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`c7c084550819f093c8c5662d68789efab636eae26b17ee31a25152f7d46fff9f`
MD5	`2ff17de25ad3415eb71e134b6d18e607`
BLAKE2b-256	`bcdab21ef141b3a64325138404f51539f025a7bb23f5e6e1a1a8d66ee1da3964`

See more details on using hashes here.

rlm-repl 0.1.0

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

RLM-REPL

Overview

Features

Installation

Quick Start

CLI Usage

Python API

Documentation

Supported Models

Examples

How It Works

Reading Modes

Development

License

Contributing

Background & History

Acknowledgments

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes