Recursive reasoning engine for AI agents and vector databases, powered by RLM.
DeepRecall
Recursive reasoning over your data. Plug into any vector DB or agent framework.
Standard RAG retrieves documents once and stuffs them into a prompt. DeepRecall uses MIT's Recursive Language Models to let your LLM search, reason, search again, and repeat -- until it actually has enough information to answer properly.
The LLM gets a search_db() function injected into a sandboxed Python REPL. It decides what to search for, analyzes results with code, refines its queries based on what it found, and synthesizes a final answer. This is not a fixed pipeline -- the LLM drives the retrieval strategy.
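To make the search, reason, search-again idea concrete, here is a toy sketch of such a loop. It is not DeepRecall's implementation: `search_db` below is a keyword-overlap stand-in over a hypothetical in-memory corpus, and `choose_next_query` is a hard-coded stub for the decision the LLM would normally make.

```python
# Hypothetical corpus; in DeepRecall the documents live in your vector store.
CORPUS = [
    "Invoices are archived after 90 days.",
    "Archived invoices move to cold storage in region eu-west.",
    "Cold storage retrieval takes up to 12 hours.",
]


def search_db(query, top_k=2):
    """Toy stand-in: rank documents by how many query words they contain."""
    words = set(query.lower().split())
    scored = [
        (len(words & set(doc.lower().rstrip(".").split())), doc) for doc in CORPUS
    ]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in ranked[:top_k] if score > 0]


def choose_next_query(notes):
    """Stub for the LLM's decision: return a refined query, or None when done."""
    text = " ".join(notes).lower()
    if "cold storage" in text and "hours" not in text:
        return "cold storage retrieval"
    return None


def recursive_search(question, max_rounds=3):
    """Search, inspect what came back, refine the query, and repeat."""
    notes, query = [], question
    for _ in range(max_rounds):
        for doc in search_db(query):
            if doc not in notes:
                notes.append(doc)
        query = choose_next_query(notes)
        if query is None:
            break
    return notes


notes = recursive_search("where do archived invoices go")
print(len(notes))  # 3: the second round surfaces the retrieval-time document
```

A single search for the original question never returns the third document; it only appears after the first round's results suggest a follow-up query.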
Install
pip install deeprecall[chroma] # ChromaDB (local, zero-config)
pip install deeprecall[milvus] # Milvus
pip install deeprecall[qdrant] # Qdrant
pip install deeprecall[pinecone] # Pinecone
pip install deeprecall[all] # Everything
Usage
from deeprecall import DeepRecall
from deeprecall.vectorstores import ChromaStore
store = ChromaStore(collection_name="my_docs")
store.add_documents(["doc 1 text...", "doc 2 text...", "doc 3 text..."])
engine = DeepRecall(
    vectorstore=store,
    backend="openai",
    backend_kwargs={"model_name": "gpt-4o-mini", "api_key": "sk-..."},
)
result = engine.query("What are the key themes across these documents?")
print(result.answer)
print(result.sources)
print(result.execution_time)
What happens when you call .query()
- A lightweight HTTP server wraps your vector store on a random port
- A search_db(query, top_k) function is injected into the RLM's sandboxed REPL
- The LLM enters a recursive loop -- it can search, write Python, call sub-LLMs, and search again
- When it has enough info, it returns a FINAL() answer
- You get back the answer, every source document accessed, token usage, and execution time
RLM itself is not modified; it is used as a regular pip dependency. The bridge between RLM and your vector store consists of setup_code plus a custom system prompt.
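The first two steps above can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not DeepRecall's actual server: the `/search` route, the `query` and `top_k` parameter names on the wire, the JSON response shape, and the helper names `start_search_server` and `make_search_db` are all hypothetical, and a substring match stands in for real vector search.

```python
import json
import threading
import urllib.parse
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

DOCS = ["alpha beta", "beta gamma", "delta"]  # stand-in for a vector store


class SearchHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        params = urllib.parse.parse_qs(urllib.parse.urlparse(self.path).query)
        query = params.get("query", [""])[0]
        top_k = int(params.get("top_k", ["5"])[0])
        hits = [d for d in DOCS if query in d][:top_k]  # substring match only
        body = json.dumps(hits).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def start_search_server():
    """Bind to port 0 so the OS picks a free port, then serve in the background."""
    server = HTTPServer(("127.0.0.1", 0), SearchHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def make_search_db(port):
    """Build the function that would be injected into the REPL namespace."""
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))

    def search_db(query, top_k=5):
        qs = urllib.parse.urlencode({"query": query, "top_k": top_k})
        with opener.open(f"http://127.0.0.1:{port}/search?{qs}") as resp:
            return json.loads(resp.read())

    return search_db


server = start_search_server()
search_db = make_search_db(server.server_address[1])
print(search_db("beta", top_k=2))  # ['alpha beta', 'beta gamma']
server.shutdown()
server.server_close()
```

Putting the store behind HTTP means the sandboxed code needs nothing but a port number and a small client function, which is one plausible reason for this design.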
Vector Stores
| Store | Install | Needs embedding_fn? |
|---|---|---|
| ChromaDB | deeprecall[chroma] | No (built-in) |
| Milvus | deeprecall[milvus] | Yes |
| Qdrant | deeprecall[qdrant] | Yes |
| Pinecone | deeprecall[pinecone] | Yes |
from deeprecall.vectorstores import ChromaStore, MilvusStore, QdrantStore, PineconeStore
# ChromaDB -- no embedding function needed
store = ChromaStore(collection_name="docs")
# Milvus / Qdrant / Pinecone -- pass your own embedding function
store = MilvusStore(collection_name="docs", uri="http://localhost:19530", embedding_fn=my_fn)
All stores implement the same interface: add_documents(), search(), delete(), count().
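As an illustration of that shared interface and of what an embedding function might look like, here is a small in-memory sketch. The method signatures, return types, ID format, and the shape of `embedding_fn` (a list of strings in, a list of vectors out) are assumptions made for this example; check the deeprecall source for the real ones. `letter_counts` is a toy embedding, not something you would use in practice.

```python
import math

Vector = list[float]


def letter_counts(texts: list[str]) -> list[Vector]:
    """Toy embedding_fn: one 26-dimensional letter-count vector per text."""
    vectors = []
    for text in texts:
        vec = [0.0] * 26
        for ch in text.lower():
            if "a" <= ch <= "z":
                vec[ord(ch) - ord("a")] += 1.0
        vectors.append(vec)
    return vectors


def cosine(a: Vector, b: Vector) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class InMemoryStore:
    """Hypothetical store exposing add_documents, search, delete, and count."""

    def __init__(self, embedding_fn):
        self._embed = embedding_fn
        self._docs: dict[str, tuple[str, Vector]] = {}
        self._next_id = 0

    def add_documents(self, documents: list[str]) -> list[str]:
        ids = []
        for text, vec in zip(documents, self._embed(documents)):
            doc_id = f"doc-{self._next_id}"
            self._next_id += 1
            self._docs[doc_id] = (text, vec)
            ids.append(doc_id)
        return ids

    def search(self, query: str, top_k: int = 5) -> list[str]:
        qvec = self._embed([query])[0]
        ranked = sorted(
            self._docs.values(), key=lambda item: cosine(qvec, item[1]), reverse=True
        )
        return [text for text, _ in ranked[:top_k]]

    def delete(self, ids: list[str]) -> None:
        for doc_id in ids:
            self._docs.pop(doc_id, None)

    def count(self) -> int:
        return len(self._docs)


store = InMemoryStore(embedding_fn=letter_counts)
ids = store.add_documents(["aaaa", "bbbb", "abab"])
print(store.search("aaa", top_k=1))  # ['aaaa']
```

Because every store answers the same four calls, the engine can stay agnostic about which database sits behind it.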
Framework Adapters
LangChain
from deeprecall.adapters.langchain import DeepRecallRetriever, DeepRecallChatModel
retriever = DeepRecallRetriever(engine=engine)
docs = retriever.invoke("question")
llm = DeepRecallChatModel(engine=engine)
response = llm.invoke("question")
LlamaIndex
from deeprecall.adapters.llamaindex import DeepRecallQueryEngine
query_engine = DeepRecallQueryEngine(engine=engine)
response = query_engine.query("question")
OpenAI-compatible API -- works with any client that speaks the OpenAI protocol:
deeprecall serve --vectorstore chroma --collection my_docs --port 8000
from openai import OpenAI
client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")
response = client.chat.completions.create(
    model="deeprecall",
    messages=[{"role": "user", "content": "question"}],
)
CLI
deeprecall ingest --path ./documents/ --vectorstore chroma --collection my_docs
deeprecall query "What is the conclusion?" --vectorstore chroma --collection my_docs
deeprecall serve --vectorstore chroma --collection my_docs --port 8000
Configuration
from deeprecall import DeepRecall, DeepRecallConfig
config = DeepRecallConfig(
    backend="openai",
    backend_kwargs={"model_name": "gpt-4o", "api_key": "sk-..."},
    max_iterations=15,
    max_depth=1,
    top_k=5,
    verbose=True,
)
engine = DeepRecall(vectorstore=store, config=config)
Supported backends: openai, anthropic, azure_openai, gemini, vllm, litellm, portkey, openrouter.
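The examples above inline the API key for brevity. One way to keep it out of source files is to build backend_kwargs from an environment variable. The helper below is a hypothetical convenience, not part of deeprecall; it only produces the same two dictionary keys used in the configuration example.

```python
import os


def backend_kwargs_from_env(model_name="gpt-4o-mini", env_var="OPENAI_API_KEY"):
    """Return a backend_kwargs dict, reading the API key from the environment."""
    api_key = os.environ.get(env_var)
    if not api_key:
        raise RuntimeError(f"Set {env_var} before creating the engine")
    return {"model_name": model_name, "api_key": api_key}
```

The result can then be passed as backend_kwargs=backend_kwargs_from_env() to DeepRecallConfig or DeepRecall.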
Project Structure
deeprecall/
├── core/ # Engine, config, types, search server
├── vectorstores/ # ChromaDB, Milvus, Qdrant, Pinecone adapters
├── adapters/ # LangChain, LlamaIndex, OpenAI-compatible server
├── prompts/ # System prompts for the RLM
└── cli.py # CLI entry point
Contributing
git clone https://github.com/kothapavan/deeprecall.git
cd deeprecall
pip install -e ".[all,dev,test]"
make check
See CONTRIBUTING.md.
Citation
Built on Recursive Language Models by Zhang, Kraska, and Khattab (MIT).
License
MIT
File details
Details for the file deeprecall-0.1.0.tar.gz.
File metadata
- Download URL: deeprecall-0.1.0.tar.gz
- Upload date:
- Size: 27.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 73588bc37a305208335646c4b07aefa869ddfac731ba1e32d8d72db6f9edcc8e |
| MD5 | 415cdb6edf6d41306823076d87a10ef6 |
| BLAKE2b-256 | a70828f7d83f2f25fb79120672126d702a5e2c717d2e363b51c8099b2f6740b2 |
File details
Details for the file deeprecall-0.1.0-py3-none-any.whl.
File metadata
- Download URL: deeprecall-0.1.0-py3-none-any.whl
- Upload date:
- Size: 31.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e7a0effede5640960461406b10487ff42918a439d53a391bd02910bc279a21b4 |
| MD5 | 5ca34bcf6579c498ba19b9b36837d543 |
| BLAKE2b-256 | 6c71248f249d264174cffa8498336319635fb724f80842faaa07599d6b7852b9 |