Production-ready RAG + AI Agent platform with FastAPI, LangGraph, and OpenRouter
rag-agent
Production-ready RAG + AI Agent platform — FastAPI · LangGraph · OpenRouter · ChromaDB
Quickstart
cp .env.example .env # add your OPENROUTER_API_KEY
make install # install deps + pre-commit hooks
make up # start all services (Docker)
make migrate # create DB schema
uv run rag-agent create-key mykey # create first API key
make dev # FastAPI on :8000 → /docs for Swagger
Architecture
graph TB
Client -->|X-API-Key| Auth
subgraph API ["FastAPI :8000"]
Auth["Auth\n(api_keys table)"]
Chat["/chat"]
Agent["/agent\nLangGraph"]
ReAct["/agent/run\nReAct + SSE"]
Ingest["/ingest"]
OCR["/ocr"]
end
Auth --> Chat & Agent & ReAct & Ingest & OCR
subgraph Services
Guard["Guardrails\nPII · Toxicity"]
Cache["Semantic Cache\n(Redis, sim > 0.92)"]
Retriever["Hybrid Retriever\nDense + BM25 + RRF\n+ Cross-encoder"]
LLM["LLM Client\n(OpenRouter)"]
Langfuse["Langfuse Tracing"]
end
Chat --> Guard --> Cache
Cache -->|miss| Retriever
Retriever --> LLM --> Langfuse
Agent --> Retriever
ReAct -->|tool_call| WebSearch & RAGSearch
Ingest -->|async| Celery
subgraph Storage
PG[("PostgreSQL\napi_keys · documents")]
Redis[("Redis\ncache · sessions · Celery")]
Chroma[("ChromaDB\nvectors")]
MinIO[("MinIO\nraw files")]
end
Auth --> PG
Cache --> Redis
Retriever --> Chroma
Celery --> Chroma & MinIO
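Two nodes in the diagram, the semantic cache (`sim > 0.92`) and the hybrid retriever's RRF step, can be illustrated with a minimal sketch. It is not the project's implementation: the function names, the use of cosine similarity as the cache metric, the in-memory list standing in for Redis, and the RRF constant `k = 60` are all assumptions made for illustration.

```python
from __future__ import annotations

import math
from collections import defaultdict


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def cache_lookup(
    query_vec: list[float],
    entries: list[tuple[list[float], str]],
    threshold: float = 0.92,  # threshold taken from the diagram; metric is assumed
) -> str | None:
    """Return the cached answer most similar to the query, or None on a miss."""
    best_answer, best_sim = None, threshold
    for embedding, answer in entries:
        sim = cosine_similarity(query_vec, embedding)
        if sim > best_sim:  # strictly greater, matching "sim > 0.92"
            best_answer, best_sim = answer, sim
    return best_answer


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of document ids with Reciprocal Rank Fusion.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with ranks starting at 1. Ties are broken by document id for determinism.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


if __name__ == "__main__":
    dense_hits = ["a", "b", "c"]  # hypothetical dense-retriever order
    bm25_hits = ["b", "c", "d"]   # hypothetical BM25 order
    print(reciprocal_rank_fusion([dense_hits, bm25_hits]))
```

In a pipeline shaped like the diagram, the fused list would then be passed to the cross-encoder for reranking before the LLM call.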
API
All endpoints require the X-API-Key header. See docs/api.md for the full request/response reference.
| Endpoint | Method | Description |
|---|---|---|
| /api/v1/chat | POST | RAG question answering |
| /api/v1/chat/stream | GET | Streaming SSE tokens |
| /api/v1/agent | POST | LangGraph agent (grade → web fallback → hallucination check) |
| /api/v1/agent/run | POST | ReAct multi-step agent (sync) |
| /api/v1/agent/run/stream | GET | ReAct agent with SSE step-by-step |
| /api/v1/agent/run/sessions/{id} | GET | Session history |
| /api/v1/agent/run/sessions/{id} | DELETE | Clear session |
| /api/v1/ingest/file | POST | Upload PDF/DOCX/TXT (async, max 50 MB) |
| /api/v1/ingest/text | POST | Ingest raw text |
| /api/v1/jobs/{id} | GET | Celery task status |
| /api/v1/ocr/extract | POST | Image → structured JSON extraction |
| /api/v1/ocr/extract/url | POST | OCR from URL |
| /api/v1/ocr/schemas | GET | List supported document types |
| /api/v1/keys | POST | Create API key |
| /api/v1/keys | GET | List active keys |
| /api/v1/keys/{id} | DELETE | Revoke key |
| /health | GET | Health check |
| /metrics | GET | Prometheus metrics |
| /docs | GET | Swagger UI (dev only) |
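As a rough illustration of calling these endpoints, the sketch below builds an authenticated `POST /api/v1/chat` request and parses `data:` lines from an SSE stream such as `/api/v1/chat/stream`. It uses only the standard library. The JSON field name `question` and the helper names are assumptions; the actual request and response schemas are in docs/api.md.

```python
import json
import urllib.request
from typing import Iterable, Iterator


def build_chat_request(
    question: str, api_key: str, base_url: str = "http://localhost:8000"
) -> urllib.request.Request:
    """Build (but do not send) a POST /api/v1/chat request with the API key header."""
    # "question" is a hypothetical field name; check docs/api.md for the real schema.
    payload = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/v1/chat",
        data=payload,
        method="POST",
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode a JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each server-sent event.

    An event ends at a blank line; multiple data lines in one event are
    joined with newlines. Other fields (event:, id:, retry:) are ignored,
    and an unterminated event at end of input is dropped.
    """
    buffer: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            if buffer:
                yield "\n".join(buffer)
                buffer = []
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # A single leading space after the colon is not part of the value.
            buffer.append(value[1:] if value.startswith(" ") else value)


if __name__ == "__main__":
    req = build_chat_request("What is RRF?", api_key="your-api-key")
    print(req.get_method(), req.full_url)
```

With the stack running, `send(req)` would perform the call; the key itself comes from `uv run rag-agent create-key` in the Quickstart.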
Services
| Service | URL | Credentials |
|---|---|---|
| FastAPI | http://localhost:8000 | — |
| ChromaDB | http://localhost:8001 | — |
| MinIO Console | http://localhost:9001 | minioadmin / minioadmin |
| Langfuse | http://localhost:3000 | — |
| Grafana | http://localhost:3001 | admin / admin |
| n8n | http://localhost:5678 | admin / admin |
| Prometheus | http://localhost:9090 | — |
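After `make up`, a quick way to see which of these services are listening is a plain TCP connection check. The sketch below is an assumption-laden convenience script, not part of the project: the `SERVICES` mapping simply copies the URLs from the table, and the helper names are hypothetical.

```python
import socket
from urllib.parse import urlsplit

# URLs copied from the table above.
SERVICES = {
    "FastAPI": "http://localhost:8000",
    "ChromaDB": "http://localhost:8001",
    "MinIO Console": "http://localhost:9001",
    "Langfuse": "http://localhost:3000",
    "Grafana": "http://localhost:3001",
    "n8n": "http://localhost:5678",
    "Prometheus": "http://localhost:9090",
}


def host_port(url: str) -> tuple[str, int]:
    """Extract (host, port) from a URL that carries an explicit port."""
    parts = urlsplit(url)
    if parts.hostname is None or parts.port is None:
        raise ValueError(f"URL needs an explicit host and port: {url}")
    return parts.hostname, parts.port


def is_listening(url: str, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to the URL's host and port succeeds."""
    try:
        with socket.create_connection(host_port(url), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    for name, url in SERVICES.items():
        status = "up" if is_listening(url) else "down"
        print(f"{name:<14} {url:<24} {status}")
```

A successful connection only shows that the port is open; it does not confirm that the service behind it is healthy.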
Key commands
make test-unit # fast unit tests (no Docker)
make test # full suite with coverage (min 80%)
make lint # ruff + mypy strict
make format # ruff format + autofix
make eval # Ragas quality evaluation (requires qa_dataset.json)
make eval-ocr # OCR accuracy eval → reports/ocr_eval_latest.json
make load # Locust load test (10 users, 30s)
make worker # Celery worker (required for async ingest)
make dashboard # Streamlit admin UI on :8501
make clean # remove __pycache__, caches, htmlcov
Documentation
- docs/api.md — full API reference with request/response examples
- docs/finetune.md — fine-tuning guide (LoRA/QLoRA via Unsloth)
- docs/bruno/ — Bruno API collection for manual endpoint testing
File details
Details for the file rag_agent_2_0-0.1.0.tar.gz.
File metadata
- Download URL: rag_agent_2_0-0.1.0.tar.gz
- Upload date:
- Size: 455.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | aa3d8ca63b8d2dc69da633cc024b2bcf0ba65f82c3c7b3066c60d69c5766869f |
| MD5 | 425f084599b49f1f867049f45d3f7236 |
| BLAKE2b-256 | 3faeecbdcbd853c835f95f167992c5f5fc5e4a53196dccff7a2f96d32423541e |
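A downloaded archive can be checked against the published SHA256 digest with the standard library. This is a minimal sketch: the function names and the local file path are hypothetical, and only the digest value is taken from the listing for rag_agent_2_0-0.1.0.tar.gz.

```python
import hashlib
import hmac
from pathlib import Path

# SHA256 digest published for rag_agent_2_0-0.1.0.tar.gz.
EXPECTED_SHA256 = "aa3d8ca63b8d2dc69da633cc024b2bcf0ba65f82c3c7b3066c60d69c5766869f"


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path: str, expected_hex: str) -> bool:
    """Compare the file's digest with the expected value (case-insensitive)."""
    return hmac.compare_digest(sha256_of(path), expected_hex.lower())


if __name__ == "__main__":
    archive = "rag_agent_2_0-0.1.0.tar.gz"  # assumed to be in the current directory
    if Path(archive).exists():
        print("digest OK" if matches_digest(archive, EXPECTED_SHA256) else "digest MISMATCH")
    else:
        print(f"{archive} not found")
```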
Provenance
The following attestation bundles were made for rag_agent_2_0-0.1.0.tar.gz:
Publisher: cd.yml on SeydinaBANE/rag-agent
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: rag_agent_2_0-0.1.0.tar.gz
- Subject digest: aa3d8ca63b8d2dc69da633cc024b2bcf0ba65f82c3c7b3066c60d69c5766869f
- Sigstore transparency entry: 1633689222
- Sigstore integration time:
- Permalink: SeydinaBANE/rag-agent@c12286378bc61eb97d71b6bf2b8c4bf50eef7e1a
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/SeydinaBANE
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: cd.yml@c12286378bc61eb97d71b6bf2b8c4bf50eef7e1a
- Trigger Event: push
File details
Details for the file rag_agent_2_0-0.1.0-py3-none-any.whl.
File metadata
- Download URL: rag_agent_2_0-0.1.0-py3-none-any.whl
- Upload date:
- Size: 62.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e628dac94cca9dbfde5ceabf10e1a05bc397a46609668f0c35136f8a0ba4fe40 |
| MD5 | 25ad46e76ad5c560d733b60ae1f66cd7 |
| BLAKE2b-256 | ec098205ebafce4bb6332784b4d1e28ff7e0a2b11852d55d81a9ba01106a3e86 |
Provenance
The following attestation bundles were made for rag_agent_2_0-0.1.0-py3-none-any.whl:
Publisher: cd.yml on SeydinaBANE/rag-agent
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: rag_agent_2_0-0.1.0-py3-none-any.whl
- Subject digest: e628dac94cca9dbfde5ceabf10e1a05bc397a46609668f0c35136f8a0ba4fe40
- Sigstore transparency entry: 1633689241
- Sigstore integration time:
- Permalink: SeydinaBANE/rag-agent@c12286378bc61eb97d71b6bf2b8c4bf50eef7e1a
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/SeydinaBANE
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: cd.yml@c12286378bc61eb97d71b6bf2b8c4bf50eef7e1a
- Trigger Event: push