FLAMEHAVEN FileSearch - Open source semantic document search with multi-provider LLM support (Gemini, OpenAI, Claude, Ollama)
FLAMEHAVEN FileSearch
Self-hosted RAG search engine. Production-ready in 3 minutes.
Quick Start • Features • Documentation • API Reference • Contributing
🎯 Why FLAMEHAVEN FileSearch?
Stop sending your sensitive documents to third-party services. FLAMEHAVEN FileSearch is a production-grade RAG search engine — BM25+hybrid retrieval, 34 file formats, multi-LLM (Gemini, OpenAI, Claude, Ollama) — running self-hosted in minutes, not days.
# Gemini (cloud) — one command, three minutes
docker run -d -p 8000:8000 -e GEMINI_API_KEY="your_key" flamehaven-filesearch:1.6.1
# Ollama — fully local, zero API cost (Gemma, Llama, Mistral, Qwen, Phi …)
# Step 1: pull a model → ollama pull gemma4:27b
docker run -d -p 8000:8000 \
-e LLM_PROVIDER=ollama \
-e LOCAL_MODEL=gemma4:27b \
-e OLLAMA_BASE_URL=http://host.docker.internal:11434 \
flamehaven-filesearch:1.6.1
🚀 **Fast**: Production deployment in 3 minutes |
🔒 **Private**: 100% self-hosted |
💰 **Cost-Effective**: Free tier of 1,500 queries/month |
Features ✨
Core Capabilities
| Capability | Detail |
|---|---|
| Search Modes | Keyword, semantic, and hybrid (BM25+RRF) with automatic typo correction |
| Quality Gate | Confidence-scored hybrid results (PASS/FORGE/INHIBIT). FORGE augments with keyword fallback; INHIBIT flags low_confidence. Self-adapting BM25 pool via EMA meta-learner. Zero new dependencies. |
| 34 File Formats | PDF, DOCX/DOC, XLSX, PPTX, RTF, HTML, CSV, LaTeX, WebVTT, images + plain text — see Document Parsing |
| RAG Pipeline | Structure-aware chunking, KnowledgeAtom 2-level indexing, sliding-window context enrichment, mtime parse cache |
| Ultra-Fast Vectors | DSP v2.0 generates embeddings in <1ms — no ML frameworks required |
| Source Attribution | Every answer links back to the originating document and chunk |
| Framework SDKs | LangChain, LlamaIndex, Haystack, CrewAI adapters out of the box |
| Enterprise Auth | API key hashing (SHA256+salt), OAuth2/OIDC, fine-grained permissions |
| Admin Dashboard | Real-time metrics, quota management, batch processing (1–100 queries) |
| Flexible Storage | SQLite (default) · PostgreSQL + pgvector · Redis cache (optional) |
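The hybrid mode in the table above fuses BM25 keyword rankings with vector rankings via Reciprocal Rank Fusion (RRF). A minimal sketch of the RRF idea — illustrative only, not FLAMEHAVEN's internal implementation; `k=60` is the conventional constant from the RRF literature:

```python
def rrf_merge(rankings, k=60):
    """Fuse multiple ranked lists with Reciprocal Rank Fusion.

    Each ranking is a list of doc ids, best first. A document's fused
    score is the sum of 1/(k + rank) over every list it appears in, so
    items ranked highly by several retrievers rise to the top.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

bm25_hits = ["doc_a", "doc_b", "doc_c"]   # keyword ranking
vector_hits = ["doc_b", "doc_a", "doc_d"] # semantic ranking
print(rrf_merge([bm25_hits, vector_hits]))
```

Documents that both retrievers rank highly (`doc_a`, `doc_b`) dominate the fused list, which is what makes hybrid search robust to typos (BM25 misses) and vocabulary mismatch (vector misses) at the same time.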
What changed in each release? See CHANGELOG.md for the full version history.
Quick Start 🚀
Option 1: Docker (Recommended)
The fastest path to production:
docker run -d \
-p 8000:8000 \
-e GEMINI_API_KEY="your_gemini_api_key" \
-e FLAMEHAVEN_ADMIN_KEY="secure_admin_password" \
-v $(pwd)/data:/app/data \
flamehaven-filesearch:1.6.1
✅ Server running at http://localhost:8000
Option 2: Python SDK
Perfect for integrating into existing applications:
from flamehaven_filesearch import FlamehavenFileSearch, FileSearchConfig
# Initialize
config = FileSearchConfig(google_api_key="your_gemini_key")
fs = FlamehavenFileSearch(config)
# Upload and search
fs.upload_file("company_handbook.pdf", store="docs")
result = fs.search("What is our remote work policy?", store="docs")
print(result['answer'])
# Output: "Employees can work remotely up to 3 days per week..."
Option 3: REST API
For language-agnostic integration:
# 1. Generate API key
curl -X POST http://localhost:8000/api/admin/keys \
-H "X-Admin-Key: your_admin_key" \
-d '{"name":"production","permissions":["upload","search"]}'
# 2. Upload document
curl -X POST http://localhost:8000/api/upload/single \
-H "Authorization: Bearer sk_live_abc123..." \
-F "file=@document.pdf" \
-F "store=my_docs"
# 3. Search
curl -X POST http://localhost:8000/api/search \
-H "Authorization: Bearer sk_live_abc123..." \
-H "Content-Type: application/json" \
  -d '{
    "query": "What are the main findings?",
    "store": "my_docs",
    "search_mode": "hybrid"
  }'
📦 Installation
# Core package (HTML, CSV, LaTeX, WebVTT, plain-text parsing included — zero extra deps)
pip install flamehaven-filesearch
# + Document parsers: PDF (pymupdf/pypdf), DOCX, XLSX, PPTX, RTF
pip install flamehaven-filesearch[parsers]
# + Image OCR (Pillow + pytesseract; requires Tesseract system binary)
pip install flamehaven-filesearch[vision]
# + Google Gemini API
pip install flamehaven-filesearch[google]
# + REST API server (FastAPI + uvicorn)
pip install flamehaven-filesearch[api]
# + HNSW vector index
pip install flamehaven-filesearch[vector]
# + PostgreSQL backend
pip install flamehaven-filesearch[postgres]
# Everything
pip install flamehaven-filesearch[all]
# Build from source
git clone https://github.com/flamehaven01/Flamehaven-Filesearch.git
cd Flamehaven-Filesearch
docker build -t flamehaven-filesearch:1.6.1 .
Framework Integrations
Framework SDKs (LangChain, LlamaIndex, etc.) are imported lazily — install only what you need:
# LangChain (pip install langchain-core)
from flamehaven_filesearch.integrations import FlamehavenLangChainLoader
docs = FlamehavenLangChainLoader("report.pdf", chunk=True).load()
# LlamaIndex (pip install llama-index-core)
from flamehaven_filesearch.integrations import FlamehavenLlamaIndexReader
nodes = FlamehavenLlamaIndexReader(chunk=True).load_data(["report.pdf", "slides.pptx"])
# Haystack (pip install haystack-ai)
from flamehaven_filesearch.integrations import FlamehavenHaystackConverter
result = FlamehavenHaystackConverter().run(sources=["report.pdf"])
# CrewAI (pip install crewai)
from flamehaven_filesearch.integrations import FlamehavenCrewAITool
tool = FlamehavenCrewAITool() # pass to your agent's tools list
Configuration ⚙️
LLM Provider Selection
FLAMEHAVEN supports four LLM backends (plus an OpenAI-compatible passthrough) — switch with a single env var:

| `LLM_PROVIDER` | Required variables | Notes |
|---|---|---|
| `gemini` (default) | `GEMINI_API_KEY` | Google Gemini file-search API |
| `ollama` | `LOCAL_MODEL`, `OLLAMA_BASE_URL` | Local inference via Ollama — Gemma 4/3, Llama 3.2, Qwen 2.5, Mistral, Phi-4, … |
| `openai` | `OPENAI_API_KEY` | OpenAI or any OpenAI-compatible endpoint |
| `anthropic` | `ANTHROPIC_API_KEY` | Anthropic Claude |
| `openai_compatible` | `OPENAI_API_KEY`, `OPENAI_BASE_URL` | vLLM, LM Studio, Kimi, etc. |
# Gemini (default)
export GEMINI_API_KEY="your_google_gemini_api_key"
# Ollama (fully local)
export LLM_PROVIDER=ollama
export LOCAL_MODEL=gemma4:27b # or gemma4:4b, qwen2.5:7b, llama3.2 …
export OLLAMA_BASE_URL=http://localhost:11434
# OpenAI
export LLM_PROVIDER=openai
export OPENAI_API_KEY="sk-..."
export DEFAULT_MODEL=gpt-4o-mini # optional override
# Anthropic
export LLM_PROVIDER=anthropic
export ANTHROPIC_API_KEY="sk-ant-..."
Required Environment Variables
export FLAMEHAVEN_ADMIN_KEY="your_secure_admin_password"
# Plus the provider credentials above (at least one provider)
Optional Configuration
export HOST="0.0.0.0" # Bind address
export PORT="8000" # Server port
export REDIS_HOST="localhost" # Distributed caching
export REDIS_PORT="6379" # Redis port
export MAX_OUTPUT_TOKENS="1024" # Max answer tokens
export TEMPERATURE="0.5" # Model temperature (0.0–1.0)
export MAX_SOURCES="5" # Max source documents per answer
Advanced Configuration
Create a config.yaml for fine-tuned control:
vector_store:
quantization: int8
compression: gravitas_pack
search:
default_mode: hybrid
typo_correction: true
max_results: 10
security:
rate_limit: 100 # requests per minute
max_file_size: 52428800 # 50MB
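Once `config.yaml` is loaded into a dict (e.g. with PyYAML), a few sanity checks catch typos before the server starts. This validator is purely illustrative — field names come from the example above, but FLAMEHAVEN's real loader may enforce different rules:

```python
def validate_config(cfg: dict) -> dict:
    """Check a few invariants on the config.yaml structure shown above.

    Hypothetical helper for illustration; raises ValueError on the
    first invalid field so misconfigurations fail fast at startup.
    """
    search = cfg.get("search", {})
    if search.get("default_mode") not in ("keyword", "semantic", "hybrid"):
        raise ValueError("search.default_mode must be keyword, semantic, or hybrid")
    security = cfg.get("security", {})
    if security.get("max_file_size", 0) <= 0:
        raise ValueError("security.max_file_size must be a positive byte count")
    return cfg

validate_config({
    "vector_store": {"quantization": "int8", "compression": "gravitas_pack"},
    "search": {"default_mode": "hybrid", "typo_correction": True, "max_results": 10},
    "security": {"rate_limit": 100, "max_file_size": 52428800},
})
```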
📊 Performance
| Metric | Value | Notes |
|---|---|---|
| Vector Generation | <1ms | DSP v2.0, zero ML dependencies |
| Memory Footprint | 75% reduced | Int8 quantization vs float32 |
| Metadata Size | 90% smaller | Gravitas-Pack compression |
| Test Suite | 476 tests | All passing (pytest) |
| Cold Start | 3 seconds | Docker container ready |
Real-World Benchmarks
Environment: Docker on Apple M1 Mac, 16GB RAM
Document Set: 500 PDFs, ~2GB total
Health Check: 8ms
Search (cache hit): 9ms
Search (cache miss): 1,250ms (includes Gemini API call)
Batch Search (10): 2,500ms (parallel processing)
Upload (50MB file): 3,200ms (with indexing)
Architecture 🏗️
flowchart TD
Client(["Client\n(HTTP / SDK)"])
subgraph API["REST API Layer (FastAPI)"]
Upload["/api/upload"]
Search["/api/search"]
Admin["/api/admin"]
end
subgraph Engine["Engine Layer"]
FP["FileParser\n+ BackendRegistry\n(34 formats)"]
Cache["ParseCache\n(mtime-based)"]
Chunker["TextChunker\n+ KnowledgeAtom\n(chunk atoms)"]
DSP["DSP v2.0\nEmbedding Generator\n(<1ms, zero-ML)"]
BM25["BM25 + RRF\nHybrid Search\n(v1.6.0)"]
Scorer["SemanticScorer\n+ TypoCorrector"]
end
subgraph Storage["Storage Layer"]
SQLite[("SQLite\nMetadata Store")]
Vec[("Vector Store\n(local / pgvector)")]
Redis[("Redis Cache\n(optional)")]
end
subgraph LLM["LLM Provider (env: LLM_PROVIDER)"]
Gemini["Gemini\n(cloud)"]
Ollama["Ollama\n(local)"]
OAI["OpenAI /\nAnthropic /\nCompatible"]
end
Metrics["Metrics Logger"]
Client --> Upload & Search & Admin
Upload --> FP
FP <-->|"cache hit/miss"| Cache
FP --> Chunker
Chunker --> DSP
DSP --> Vec
FP --> SQLite
Search --> Scorer
Scorer --> DSP
DSP --> Vec
Scorer -->|"gemini"| Gemini
Scorer -->|"ollama"| Ollama
Scorer -->|"openai/anthropic"| OAI
LLM --> Client
Admin --> Metrics
Admin --> SQLite
Storage <-->|"read / write"| Redis
Full layer detail: Architecture.md
Security 🔒
FLAMEHAVEN takes security seriously:
- ✅ API Key Hashing - SHA256 with salt
- ✅ Rate Limiting - Per-key quotas (default: 100/min)
- ✅ Permission System - Granular access control
- ✅ Audit Logging - Complete request history
- ✅ OWASP Headers - Security headers enabled by default
- ✅ Input Validation - Strict file type and size checks
Security Best Practices
# Use strong admin keys
export FLAMEHAVEN_ADMIN_KEY=$(openssl rand -base64 32)
# Enable HTTPS in production
# (use nginx/traefik as reverse proxy)
# Rotate API keys regularly
curl -X DELETE http://localhost:8000/api/admin/keys/old_key_id \
-H "X-Admin-Key: $FLAMEHAVEN_ADMIN_KEY"
Roadmap 🗺️
Full roadmap: ROADMAP.md
v1.4.x (Completed)
- Multimodal search (image + text)
- HNSW vector indexing for faster search
- OAuth2/OIDC integration
- PostgreSQL backend (metadata + pgvector)
- Usage-budget controls and reporting
- pgvector tuning and reliability hardening
- CI/CD — ruff replaces flake8; pipelines fully green
v1.5.x (Completed)
- Universal Document Parser — 34 formats, zero doc-AI dependency (v1.5.0)
- Internal text chunker — structure-aware + token-aware, zero ML deps (v1.5.0)
- Framework integrations — LangChain, LlamaIndex, Haystack, CrewAI (v1.5.0)
- Backend Plugin Architecture — `AbstractFormatBackend` + `BackendRegistry` (v1.5.2)
- Parse cache — mtime-based, `extract_text(use_cache=True)` (v1.5.2)
- ContextExtractor — sliding-window RAG chunk enrichment (v1.5.2)
- Multi-provider LLM support — OpenAI, Claude, Ollama, Gemini (v1.5.3)
v1.6.0 (Completed)
- BM25 + RRF hybrid search — Korean+English tokenizer, lazy per-store index
- KnowledgeAtom 2-level indexing — chunk atoms with fragment URIs
- Stable URI scheme — `local://<store>/<quote(abs_path)>`, collision-free
- core.py mixin segmentation — 1258 → 221 lines, 3 focused modules
- Fix: `search_stream` double intent-refine bug
v1.6.1 (Completed)
- CC reduction — `seek_vector_resonance` CC 8→2, `_get_admin_user` CC 10→1
- Dispatch table pattern — `_transform_dict` unifies GravitasPacker compress/decompress
- `_record_upload_failure` helper — eliminates 2× duplicated metrics blocks in api.py
- `/health` exposes `llm_provider` + `llm_model` — frontend can detect the active backend
- `config.to_dict()` exposes `llm_provider`, `local_model`, `ollama_base_url`
- Frontend: provider-aware model selector (Gemini dropdown ↔ local model badge)
- Frontend: upload accept list expanded to all 34 supported formats
- Frontend: store datalist auto-populated from `/api/metrics`
- Frontend: version badge synced to v1.6.1 across all 6 dashboard pages
- Ruff F401/F841 — 5 lint errors resolved, CI green
- Admin: Stores tab — create / list / delete stores (`POST|GET|DELETE /api/stores`)
- Admin: Ops tab — usage stats (`GET /api/admin/usage`) + vector ops (stats / reindex / vacuum)
- Landing: "Manage" deep-link to `admin.html#stores` with hash-based tab routing
v1.6.2 (Completed)
- `engine/quality_gate.py` — `SearchQualityGate` (PASS/FORGE/INHIBIT), `SearchMetaLearner` (EMA alpha adaptation), `compute_search_confidence` (Jaccard rank divergence, zero new deps)
- Hybrid search: confidence-scored results with FORGE keyword augmentation and INHIBIT flag
- `search_confidence` + `low_confidence` fields in the search response schema
- BM25 pool size self-adapts via meta-learner alpha (keyword-dominant → larger pool)
- 25 tests, 99% coverage on `quality_gate.py`
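The quality-gate idea can be sketched simply: measure how much the keyword and vector rankings agree (Jaccard overlap of their top-k result sets) and route on that confidence. The thresholds and function names below are illustrative assumptions, not FLAMEHAVEN's actual `quality_gate.py` internals:

```python
def jaccard(a, b):
    """Overlap of two result-id sets: 0 = disjoint, 1 = identical."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 1.0

def quality_gate(keyword_ids, vector_ids, pass_threshold=0.5, inhibit_threshold=0.2):
    """Return (decision, confidence) based on ranking agreement.

    High agreement     -> PASS the hybrid results as-is.
    Moderate divergence -> FORGE: augment with a keyword fallback.
    Strong divergence   -> INHIBIT: flag the answer as low_confidence.
    """
    confidence = jaccard(keyword_ids, vector_ids)
    if confidence >= pass_threshold:
        return "PASS", confidence
    if confidence >= inhibit_threshold:
        return "FORGE", confidence
    return "INHIBIT", confidence

# Two of three top hits agree across retrievers -> high confidence
print(quality_gate(["d1", "d2", "d3"], ["d2", "d1", "d4"]))
```

When the two retrievers disagree badly, no amount of score fusion fixes it; surfacing `low_confidence` to the caller is the honest failure mode.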
v2.0.0 (Q3 2026)
- Multi-language support (15+ languages) — multilingual stopwords + jieba
- Kubernetes Helm charts
- Distributed indexing
Troubleshooting 🐛
❌ 401 Unauthorized Error
Problem: API returns 401 when making requests.
Solutions:
- Verify the `FLAMEHAVEN_ADMIN_KEY` environment variable is set
- Check the `Authorization: Bearer sk_live_...` header format
- Ensure the API key hasn't expired (check the admin dashboard)
# Debug: Check if admin key is set
echo $FLAMEHAVEN_ADMIN_KEY
# Regenerate API key
curl -X POST http://localhost:8000/api/admin/keys \
-H "X-Admin-Key: $FLAMEHAVEN_ADMIN_KEY" \
-d '{"name":"debug","permissions":["search"]}'
🐌 Slow Search Performance
Problem: Searches taking >5 seconds.
Solutions:
- Check the cache hit rate: `FLAMEHAVEN_METRICS_ENABLED=1 curl http://localhost:8000/metrics`
- Enable Redis for distributed caching
- Verify Gemini API latency (should be <1.5s)
# Enable Redis caching
docker run -d --name redis redis:7-alpine
export REDIS_HOST=localhost
💾 High Memory Usage
Problem: Container using >2GB RAM.
Solutions:
- Enable Redis with LRU eviction policy
- Reduce max file size in config
- Monitor with Prometheus endpoint
# Configure Redis memory limit
docker run -d \
-p 6379:6379 \
redis:7-alpine \
--maxmemory 512mb \
--maxmemory-policy allkeys-lru
More solutions in our Wiki Troubleshooting Guide.
Documentation 📚
Documentation Hub
Use the links below to jump to the most relevant guide.
| Topic | Description |
|---|---|
| Document Parsing | Supported formats, internal parsers, RAG chunking |
| Hybrid Search | BM25+RRF, KnowledgeAtom indexing, stable URI scheme (v1.6.0) |
| Framework Integrations | LangChain, LlamaIndex, Haystack, CrewAI adapters |
| API Reference | REST endpoints, payloads, rate limits |
| Architecture | How all layers fit together (v1.6.0) |
| Configuration Reference | Full list of environment variables and config fields |
| Production Deployment | Docker, systemd, reverse proxy, scaling tips |
| Troubleshooting | Step-by-step debugging playbook |
| Benchmarks | Performance measurements and methodology |
These Markdown files live inside the repository so they stay versioned alongside the code. Feel free to contribute improvements via pull requests.
Additional Resources
- Interactive API Docs - OpenAPI/Swagger interface (when server is running)
- CHANGELOG - Version history and breaking changes
- CONTRIBUTING - How to contribute code
- Examples - Sample integrations and use cases
Contributing 🤝
We love contributions! FLAMEHAVEN is better because of developers like you.
Good First Issues
- 🟢 [Easy] Add dark mode to admin dashboard (1-2 hours)
- 🟡 [Medium] PostgreSQL backend for usage tracker (multi-instance deployments)
- 🔴 [Advanced] Kubernetes Helm charts for production deployment
See CONTRIBUTING.md for development setup and guidelines.
Contributors
Community & Support 💬
- 💬 Discussions: GitHub Discussions
- 🐛 Bug Reports: GitHub Issues
- 🔒 Security: security@flamehaven.space
- 📧 General: info@flamehaven.space
License 📄
Distributed under the MIT License. See LICENSE for more information.
🙏 Acknowledgments
Built with amazing open source tools:
- FastAPI - Modern Python web framework
- Google Gemini - Semantic understanding and reasoning
- SQLite - Lightweight, embedded database
- Redis - In-memory caching (optional)
⭐ Star us on GitHub • 📖 Read the Docs • 🚀 Deploy Now
Built with 🔥 by the Flamehaven Core Team
Last updated: April 23, 2026 • Version 1.6.2
Project details
Download files
File details
Details for the file flamehaven_filesearch-1.6.2.tar.gz.
File metadata
- Download URL: flamehaven_filesearch-1.6.2.tar.gz
- Upload date:
- Size: 129.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `3dae4c0672e5919803e9749bfd85b53cae0fcc0359160f5190049b30d0435ab2` |
| MD5 | `55cdce4679ea5d693432893c9cca1e24` |
| BLAKE2b-256 | `8ca6a98b5a8f46b50f632c95e794c949607b73e1d4340a6e463ca77df54e3ed7` |
Provenance
The following attestation bundles were made for flamehaven_filesearch-1.6.2.tar.gz:
Publisher: publish.yml on flamehaven01/Flamehaven-Filesearch
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: flamehaven_filesearch-1.6.2.tar.gz
- Subject digest: 3dae4c0672e5919803e9749bfd85b53cae0fcc0359160f5190049b30d0435ab2
- Sigstore transparency entry: 1362217233
- Sigstore integration time:
- Permalink: flamehaven01/Flamehaven-Filesearch@5424077a53c3f25f6bba063c9db0a8fc25a086f5
- Branch / Tag: refs/tags/v1.6.2
- Owner: https://github.com/flamehaven01
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@5424077a53c3f25f6bba063c9db0a8fc25a086f5
- Trigger Event: workflow_dispatch
File details
Details for the file flamehaven_filesearch-1.6.2-py3-none-any.whl.
File metadata
- Download URL: flamehaven_filesearch-1.6.2-py3-none-any.whl
- Upload date:
- Size: 139.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `c84df048a6e7a5c24358f0230f516ba24bcf7c584ced8a77274378c5c347f544` |
| MD5 | `867cf0d83d762141af45db1eabd9f41c` |
| BLAKE2b-256 | `aab29f4270b116c600eb536f8f263e5096e967705e1611d9a008e2069af3652b` |
Provenance
The following attestation bundles were made for flamehaven_filesearch-1.6.2-py3-none-any.whl:
Publisher: publish.yml on flamehaven01/Flamehaven-Filesearch
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: flamehaven_filesearch-1.6.2-py3-none-any.whl
- Subject digest: c84df048a6e7a5c24358f0230f516ba24bcf7c584ced8a77274378c5c347f544
- Sigstore transparency entry: 1362217302
- Sigstore integration time:
- Permalink: flamehaven01/Flamehaven-Filesearch@5424077a53c3f25f6bba063c9db0a8fc25a086f5
- Branch / Tag: refs/tags/v1.6.2
- Owner: https://github.com/flamehaven01
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@5424077a53c3f25f6bba063c9db0a8fc25a086f5
- Trigger Event: workflow_dispatch