Local Deep Research

AI-powered research assistant that performs deep, iterative research using multiple LLMs and search engines, with proper citations.

▶️ Watch the review by The Art Of The Terminal

🚀 What is Local Deep Research?

An AI research assistant you control. Run it locally for privacy, use any LLM, and build your own searchable knowledge base. You own your data and can see exactly how it works.
⚡ Quick Start
Option 1: Docker Run (Linux)
```shell
# Step 1: Pull and run Ollama
docker run -d -p 11434:11434 --name ollama ollama/ollama
docker exec ollama ollama pull gpt-oss:20b

# Step 2: Pull and run SearXNG for optimal search results
docker run -d -p 8080:8080 --name searxng searxng/searxng

# Step 3: Pull and run Local Deep Research
docker run -d -p 5000:5000 --network host \
  --name local-deep-research \
  --volume "deep-research:/data" \
  -e LDR_DATA_DIR=/data \
  localdeepresearch/local-deep-research
```
Option 2: Docker Compose (Recommended)
CPU-only (all platforms):
```shell
curl -O https://raw.githubusercontent.com/LearningCircuit/local-deep-research/main/docker-compose.yml && docker compose up -d
```
With NVIDIA GPU (Linux):
```shell
curl -O https://raw.githubusercontent.com/LearningCircuit/local-deep-research/main/docker-compose.yml && \
curl -O https://raw.githubusercontent.com/LearningCircuit/local-deep-research/main/docker-compose.gpu.override.yml && \
docker compose -f docker-compose.yml -f docker-compose.gpu.override.yml up -d
```
Open http://localhost:5000 after ~30 seconds. For GPU setup, environment variables, and more, see the Docker Compose Guide.
Option 3: pip install
pip install local-deep-research
Works on Windows, macOS, and Linux. SQLCipher encryption is included via pre-built wheels, so no compilation is needed. PDF export on Windows requires Pango (see the setup guide). If you encounter issues with encryption, set the following to use standard SQLite instead:

```shell
export LDR_BOOTSTRAP_ALLOW_UNENCRYPTED=true
```
🏗️ How It Works
Research
You ask a complex question. LDR:
- Does the research for you automatically
- Searches across web, academic papers, and your own documents
- Synthesizes everything into a report with proper citations
Choose from 20+ research strategies for quick facts, deep analysis, or academic research.
New: LangGraph Agent Strategy — An autonomous agentic research mode where the LLM decides what to search, which specialized engines to use (arXiv, PubMed, Semantic Scholar, etc.), and when to synthesize. Early results are promising — it adaptively switches between search engines based on what it finds and collects significantly more sources than pipeline-based strategies. Select langgraph-agent in Settings to try it.
Build Your Knowledge Base
```mermaid
flowchart LR
    R[Research] --> D[Download Sources]
    D --> L[(Library)]
    L --> I[Index & Embed]
    I --> S[Search Your Docs]
    S -.-> R
```
Every research session finds valuable sources. Download them directly into your encrypted library—academic papers from ArXiv, PubMed articles, web pages. LDR extracts text, indexes everything, and makes it searchable. Next time you research, ask questions across your own documents and the live web together. Your knowledge compounds over time.
🛡️ Security
```mermaid
flowchart LR
    U1[User A] --> D1[(Encrypted DB)]
    U2[User B] --> D2[(Encrypted DB)]
```
Your data stays yours. Each user gets their own isolated SQLCipher database encrypted with AES-256 (Signal-level security). No password recovery means true zero-knowledge—even server admins can't read your data. Run fully local with Ollama + SearXNG and nothing ever leaves your machine.
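To illustrate the per-user encryption model, here is a minimal sketch of how a per-user database key can be derived from a password with PBKDF2. This is illustrative only, not LDR's actual implementation; SQLCipher performs an equivalent key derivation internally.

```python
import hashlib
import os


def derive_db_key(password: str, salt: bytes, iterations: int = 256_000) -> bytes:
    """Derive a 256-bit database key from a user's password (illustrative only)."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)


# Each user gets an independent random salt, so identical passwords
# still yield different database keys.
salt_a, salt_b = os.urandom(16), os.urandom(16)
key_a = derive_db_key("hunter2", salt_a)
key_b = derive_db_key("hunter2", salt_b)
assert len(key_a) == 32 and key_a != key_b
```

Because the key exists only as a function of the password, losing the password means the data is unrecoverable, which is exactly the zero-knowledge property described above.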
In-memory credentials: Like all applications that use secrets at runtime — including password managers, browsers, and API clients — credentials are held in plain text in process memory during active sessions. This is an industry-wide accepted reality, not specific to LDR: if an attacker can read process memory, they can also read any in-process decryption key. We mitigate this with session-scoped credential lifetimes and core dump exclusion. Ideas for further improvements are always welcome via GitHub Issues. See our Security Policy for details.
Supply Chain Security: Docker images are signed with Cosign, include SLSA provenance attestations, and attach SBOMs. Verify with:

```shell
cosign verify localdeepresearch/local-deep-research:latest
```

Note that recent Cosign releases (2.x) also require `--certificate-identity` (or `--certificate-identity-regexp`) and `--certificate-oidc-issuer` for keyless verification.
Security Transparency: Scanner suppressions are documented with justifications in Security Alerts Assessment, Scorecard Compliance, Container CVE Suppressions, and SAST Rule Rationale. Some alerts (Dependabot, code scanning) can only be dismissed or are very difficult to suppress outside the GitHub Security tab, so the files above do not cover every dismissed finding.
Detailed Architecture → | Security Policy → | Security Review Process →
🔒 Privacy & Data
Local Deep Research contains no telemetry, no analytics, and no tracking. We do not collect, transmit, or store any data about you or your usage. No analytics SDKs, no phone-home calls, no crash reporting, no external scripts. Usage metrics stay in your local encrypted database.
The only network calls LDR makes are ones you initiate: search queries (to engines you configure), LLM API calls (to your chosen provider), and notifications (only if you set up Apprise).
Since we don't collect any usage data, we rely on you to tell us what works, what's broken, and what you'd like to see next — bug reports, feature ideas, and even which features you love or never use all help us improve LDR.
📊 Performance
~95% accuracy on SimpleQA benchmark (preliminary results)
- Tested with GPT-4.1-mini + SearXNG + focused-iteration strategy
- Comparable to state-of-the-art AI research systems
- Local models can achieve similar performance with proper configuration
- Join our community benchmarking effort →
✨ Key Features
🔍 Research Modes
- Quick Summary - Get answers in 30 seconds to 3 minutes with citations
- Detailed Research - Comprehensive analysis with structured findings
- Report Generation - Professional reports with sections and table of contents
- Document Analysis - Search your private documents with AI
🛠️ Advanced Capabilities
- LangChain Integration - Use any vector store as a search engine
- REST API - Authenticated HTTP access with per-user databases
- Benchmarking - Test and optimize your configuration
- Analytics Dashboard - Track costs, performance, and usage metrics
- Real-time Updates - WebSocket support for live research progress
- Export Options - Download results as PDF or Markdown
- Research History - Save, search, and revisit past research
- Adaptive Rate Limiting - Intelligent retry system that learns optimal wait times
- Keyboard Shortcuts - Navigate efficiently (ESC, Ctrl+Shift+1-5)
- Per-User Encrypted Databases - Secure, isolated data storage for each user
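The core idea behind the adaptive rate limiter can be sketched in a few lines. This is an illustrative simplification, not LDR's actual implementation: the wait time shrinks after successes and grows after rate-limit errors, converging toward the shortest interval an engine tolerates.

```python
class AdaptiveRateLimiter:
    """Learn a per-engine wait time: back off on failures, speed up on successes."""

    def __init__(self, initial_wait: float = 1.0,
                 min_wait: float = 0.1, max_wait: float = 60.0):
        self.wait = initial_wait
        self.min_wait = min_wait
        self.max_wait = max_wait

    def on_success(self) -> None:
        # Gently probe for a shorter wait after each successful request.
        self.wait = max(self.min_wait, self.wait * 0.9)

    def on_rate_limited(self) -> None:
        # Back off quickly when the engine pushes back.
        self.wait = min(self.max_wait, self.wait * 2.0)


limiter = AdaptiveRateLimiter()
limiter.on_rate_limited()  # wait grows: 1.0 -> 2.0
limiter.on_success()       # wait shrinks again: 2.0 -> 1.8
```

The multiplicative-increase / multiplicative-decrease pattern here is a common design for rate limiting; the real implementation also persists learned wait times per engine.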
📰 News & Research Subscriptions
- Automated Research Digests - Subscribe to topics and receive AI-powered research summaries
- Customizable Frequency - Daily, weekly, or custom schedules for research updates
- Smart Filtering - AI filters and summarizes only the most relevant developments
- Multi-format Delivery - Get updates as markdown reports or structured summaries
- Topic & Query Support - Track specific searches or broad research areas
🌐 Search Sources
Free Search Engines
- Academic: arXiv, PubMed, Semantic Scholar
- General: Wikipedia, SearXNG
- Technical: GitHub, Elasticsearch
- Historical: Wayback Machine
- News: The Guardian, Wikinews
Premium Search Engines
- Tavily - AI-powered search
- Google - Via SerpAPI or Programmable Search Engine
- Brave Search - Privacy-focused web search
Custom Sources
- Local Documents - Search your files with AI
- LangChain Retrievers - Any vector store or database
- Meta Search - Combine multiple engines intelligently
📦 Installation Options
For most users, the Quick Start above is all you need.
| Method | Best for | Guide |
|---|---|---|
| Docker Compose | Most users (recommended) | Docker Compose Guide |
| Docker | Minimal setup | Installation Guide |
| pip | Developers, Python integration | pip Guide |
| Unraid | Unraid servers | Unraid Guide |
💻 Usage Examples
Python API
```python
from local_deep_research.api import LDRClient, quick_query

# Option 1: Simplest - one-line research
summary = quick_query("username", "password", "What is quantum computing?")
print(summary)

# Option 2: Client for multiple operations
client = LDRClient()
client.login("username", "password")
result = client.quick_research("What are the latest advances in quantum computing?")
print(result["summary"])
```
HTTP API
The example below shows the basic API structure; for complete working examples, see the link below.

```python
import requests
from bs4 import BeautifulSoup

# Create a session and read the CSRF token from the login page
session = requests.Session()
login_page = session.get("http://localhost:5000/auth/login")
soup = BeautifulSoup(login_page.text, "html.parser")
login_csrf = soup.find("input", {"name": "csrf_token"}).get("value")

# Log in, then fetch the API CSRF token
session.post(
    "http://localhost:5000/auth/login",
    data={"username": "user", "password": "pass", "csrf_token": login_csrf},
)
csrf = session.get("http://localhost:5000/auth/csrf-token").json()["csrf_token"]

# Start a research request
response = session.post(
    "http://localhost:5000/api/start_research",
    json={"query": "Your research question"},
    headers={"X-CSRF-Token": csrf},
)
```
🚀 Ready-to-use HTTP API Examples → examples/api_usage/http/
- ✅ Automatic user creation - works out of the box
- ✅ Complete authentication with CSRF handling
- ✅ Result retry logic - waits until research completes
- ✅ Progress monitoring and error handling
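The retry logic in those examples boils down to polling until the research completes. Here is a minimal, generic polling helper; the commented usage below it assumes a hypothetical status endpoint and payload shape, not LDR's documented API.

```python
import time


def poll_until(check, timeout: float = 600.0, interval: float = 2.0):
    """Call `check()` repeatedly until it returns a non-None result or `timeout` elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        result = check()
        if result is not None:
            return result
        time.sleep(interval)
    raise TimeoutError("research did not complete in time")


# Illustrative usage with the authenticated session from the example above
# (the endpoint path and "status" field are assumptions):
#
# def check_done():
#     status = session.get(f"http://localhost:5000/api/research/{research_id}/status").json()
#     return status if status.get("status") == "completed" else None
#
# result = poll_until(check_done, timeout=900)
```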
Command Line Tools
```shell
# Run benchmarks from CLI
python -m local_deep_research.benchmarks --dataset simpleqa --examples 50

# Manage rate limiting
python -m local_deep_research.web_search_engines.rate_limiting status
python -m local_deep_research.web_search_engines.rate_limiting reset
```
🔗 Enterprise Integration
Connect LDR to your existing knowledge base:
```python
from local_deep_research.api import quick_summary

# Use your existing LangChain retriever
result = quick_summary(
    query="What are our deployment procedures?",
    retrievers={"company_kb": your_retriever},
    search_tool="company_kb",
)
```
Works with: FAISS, Chroma, Pinecone, Weaviate, Elasticsearch, and any LangChain-compatible retriever.
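You don't need a full vector store to experiment: any object exposing the LangChain retriever interface can be passed in. Below is a toy duck-typed retriever as a sketch; LDR's exact retriever requirements may differ (LangChain retrievers normally return `Document` objects, simplified here to strings).

```python
class KeywordRetriever:
    """Toy retriever: returns documents whose text contains any query term."""

    def __init__(self, documents: list[str]):
        self.documents = documents

    def get_relevant_documents(self, query: str) -> list[str]:
        terms = query.lower().split()
        return [d for d in self.documents if any(t in d.lower() for t in terms)]


kb = KeywordRetriever([
    "Deployments run through the staging pipeline first.",
    "All database backups are taken nightly.",
])
print(kb.get_relevant_documents("deployment pipeline"))
```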
🔌 MCP Server (Claude Integration)
LDR provides an MCP (Model Context Protocol) server that allows AI assistants like Claude Desktop and Claude Code to perform deep research.
⚠️ Security Note: This MCP server is designed for local use only via STDIO transport (e.g., Claude Desktop). It has no built-in authentication or rate limiting. Do not expose over a network without implementing proper security controls. See the MCP Security Guide for network deployment requirements.
Installation
```shell
# Install with MCP extras
pip install "local-deep-research[mcp]"
```
Claude Desktop Configuration
Add to your claude_desktop_config.json:
```json
{
  "mcpServers": {
    "local-deep-research": {
      "command": "ldr-mcp",
      "env": {
        "LDR_LLM_PROVIDER": "openai",
        "LDR_LLM_OPENAI_API_KEY": "sk-..."
      }
    }
  }
}
```
Claude Code Configuration
Add to your .mcp.json (project-level) or ~/.claude/mcp.json (global):
```json
{
  "mcpServers": {
    "local-deep-research": {
      "command": "ldr-mcp",
      "env": {
        "LDR_LLM_PROVIDER": "ollama",
        "LDR_LLM_OLLAMA_URL": "http://localhost:11434"
      }
    }
  }
}
```
Available Tools
| Tool | Description | Duration | LLM Cost |
|---|---|---|---|
| `search` | Raw results from a specific engine (arxiv, pubmed, wikipedia, ...) | 5-30 s | None |
| `quick_research` | Fast research summary | 1-5 min | Yes |
| `detailed_research` | Comprehensive analysis | 5-15 min | Yes |
| `generate_report` | Full markdown report | 10-30 min | Yes |
| `analyze_documents` | Search local collections | 30 s-2 min | Yes |
| `list_search_engines` | List available search engines | Instant | None |
| `list_strategies` | List research strategies | Instant | None |
| `get_configuration` | Get current config | Instant | None |
Individual Search Engines
The search tool lets you query specific search engines directly and get raw results (title, link, snippet) — no LLM processing, no cost, fast. This is especially useful for monitoring and subscriptions where you want to check for new content regularly without burning LLM tokens.
```python
# Search arXiv for recent papers
search(query="transformer architecture improvements", engine="arxiv")

# Search PubMed for medical literature
search(query="CRISPR clinical trials 2024", engine="pubmed")

# Search Wikipedia for quick facts
search(query="quantum error correction", engine="wikipedia")

# Search OpenClaw for legal case law
search(query="copyright fair use precedents", engine="openclaw")

# Use list_search_engines() to see all available engines
```
Example Usage
"Use quick_research to find information about quantum computing applications"
"Search arxiv for recent papers on diffusion models"
"Generate a detailed research report on renewable energy trends"
📊 Performance & Analytics
Benchmark Results
Early experiments on small SimpleQA dataset samples:
| Configuration | Accuracy | Notes |
|---|---|---|
| gpt-4.1-mini + SearXNG + focused_iteration | 90-95% | Limited sample size |
| gpt-4.1-mini + Tavily + focused_iteration | 90-95% | Limited sample size |
| gemini-2.0-flash-001 + SearXNG | 82% | Single test run |
Note: These are preliminary results from initial testing. Performance varies significantly based on query types, model versions, and configurations. Run your own benchmarks →
Built-in Analytics Dashboard
Track costs, performance, and usage with detailed metrics. Learn more →
🤖 Supported LLMs
Local Models (via Ollama)
- Llama 3, Mistral, Gemma, DeepSeek
- LLM processing stays local (search queries still go to web)
- No API costs
Cloud Models
- OpenAI (GPT-4, GPT-3.5)
- Anthropic (Claude 3)
- Google (Gemini)
- 100+ models via OpenRouter
📚 Documentation
Getting Started
- Installation Guide
- Frequently Asked Questions
- API Quickstart
- Configuration Guide
- Full Configuration Reference
Core Features
Advanced Features
Development
Examples & Tutorials
📰 Featured In
"Local Deep Research deserves special mention for those who prioritize privacy... tuned to use open-source LLMs that can run on consumer GPUs or even CPUs. Journalists, researchers, or companies with sensitive topics can investigate information without queries ever hitting an external server."
News & Articles
- Korben.info - French tech blog ("Sherlock Holmes numérique")
- Roboto.fr - "L'alternative open-source gratuite à Deep Research d'OpenAI"
- KDJingPai AI Tools - AI productivity tools coverage
- AI Sharing Circle - AI resources coverage
Community Discussions
- Hacker News - 190+ points, community discussion
- LangChain Twitter/X - Official LangChain promotion
- LangChain LinkedIn - 400+ likes
International Coverage
🇨🇳 Chinese
- Juejin (掘金) - Developer community
- Cnblogs (博客园) - Developer blogs
- GitHubDaily (Twitter/X) - Influential tech account
- Zhihu (知乎) - Tech community
- A姐分享 - AI resources
- CSDN - Installation guide
- NetEase (网易) - Tech news portal
🇯🇵 Japanese
- note.com: 調査革命:Local Deep Research徹底活用法 - Comprehensive tutorial
- Qiita: Local Deep Researchを試す - Docker setup guide
- LangChainJP (Twitter/X) - Japanese LangChain community
🇰🇷 Korean
- PyTorch Korea Forum - Korean ML community
- GeekNews (Hada.io) - Korean tech news
Reviews & Analysis
- BSAIL Lab: How useful is Deep Research in Academia? - Academic review by contributor @djpetti
- The Art Of The Terminal: Use Local LLMs Already! - Comprehensive review of local AI tools, featuring LDR's research capabilities (embeddings now work!)
Related Projects
- SearXNG LDR-Academic - Academic-focused SearXNG fork with 12 research engines (arXiv, Google Scholar, PubMed, etc.) designed for LDR
- DeepWiki Documentation - Third-party documentation and guides
Note: Third-party projects and articles are independently maintained. We link to them as useful resources but cannot guarantee their code quality or security.
🤝 Community & Support
- Discord - Get help and share research techniques
- Reddit - Updates and showcases
- GitHub Issues - Bug reports
🚀 Contributing
We welcome contributions of all sizes — from typo fixes to new features. The key rule: keep PRs small and atomic (one change per PR). For larger changes, please open an issue or start a discussion first — we want to protect your time and make sure your effort leads to a successful merge rather than a misaligned PR. See our Contributing Guide to get started.
📄 License
MIT License - see LICENSE file.
Dependencies: All third-party packages use permissive licenses (MIT, Apache-2.0, BSD, etc.) - see allowlist
Built with: LangChain, Ollama, SearXNG, FAISS
Support Free Knowledge: Consider donating to Wikipedia, arXiv, or PubMed.
Download files
File details

Details for the file local_deep_research-1.5.3.tar.gz:
- Size: 7.0 MB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | 2c37e80a50ee0a9173a1a3cd9a77c0463cf4beec5b169f2faa314257de25c196 |
| MD5 | 8e35746a000bbc0041da2387dc1beb39 |
| BLAKE2b-256 | 7e5edb1bb52e805549998656c8c9ad2fcc6c9a73fb5b63bd72753a260d9a2b95 |
Provenance

The following attestation bundle was made for local_deep_research-1.5.3.tar.gz:

Publisher: publish.yml on LearningCircuit/local-deep-research
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: local_deep_research-1.5.3.tar.gz
- Subject digest: 2c37e80a50ee0a9173a1a3cd9a77c0463cf4beec5b169f2faa314257de25c196
- Sigstore transparency entry: 1208932311
- Permalink: LearningCircuit/local-deep-research@198969cd4d41c8bfb898ec22cfa474aff6c0cd93
- Branch / Tag: refs/heads/main
- Owner: https://github.com/LearningCircuit
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@198969cd4d41c8bfb898ec22cfa474aff6c0cd93
- Trigger Event: repository_dispatch
File details

Details for the file local_deep_research-1.5.3-py3-none-any.whl:
- Size: 3.5 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | 3efcccfabb807ece0b027622e5d275c74579917c75735f43ea800de13621f86c |
| MD5 | eb881ef8e77f5938992c2b97a911c635 |
| BLAKE2b-256 | de5a0c9c4e5c3bb2248400476f4b140e9647e094fc2e73199439c79494e6aa48 |

Provenance

The following attestation bundle was made for local_deep_research-1.5.3-py3-none-any.whl:

Publisher: publish.yml on LearningCircuit/local-deep-research
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: local_deep_research-1.5.3-py3-none-any.whl
- Subject digest: 3efcccfabb807ece0b027622e5d275c74579917c75735f43ea800de13621f86c
- Sigstore transparency entry: 1208932348
- Permalink: LearningCircuit/local-deep-research@198969cd4d41c8bfb898ec22cfa474aff6c0cd93
- Branch / Tag: refs/heads/main
- Owner: https://github.com/LearningCircuit
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@198969cd4d41c8bfb898ec22cfa474aff6c0cd93
- Trigger Event: repository_dispatch