Agent Brain RAG Server
Agent Brain (formerly doc-serve) is an intelligent document indexing and semantic search system designed to give AI agents long-term memory.
AI agents need persistent memory to be truly useful. Agent Brain provides the retrieval infrastructure that enables context-aware, knowledge-grounded AI interactions.
Installation
```bash
pip install agent-brain-rag
```
Quick Start
1. Set environment variables:

```bash
export OPENAI_API_KEY=your-key
export ANTHROPIC_API_KEY=your-key
```

2. Start the server:

```bash
agent-brain-serve
```
The server will start at http://127.0.0.1:8000.
Note: The legacy command `doc-serve` is still available but deprecated. Please use `agent-brain-serve` for new installations.
Search Capabilities
Agent Brain provides multiple search strategies to match your retrieval needs:
| Search Type | Description | Best For |
|---|---|---|
| Semantic Search | Natural language queries using OpenAI embeddings (text-embedding-3-large) | Conceptual questions, finding related content |
| Keyword Search (BM25) | Traditional keyword matching with TF-IDF ranking | Exact matches, technical terms, code identifiers |
| Hybrid Search | Combines vector + BM25 for best of both approaches | General-purpose queries, balanced recall/precision |
| GraphRAG | Knowledge graph-based retrieval for relationship-aware queries | Understanding connections, multi-hop reasoning |
Features
- Document Indexing: Load and index documents from folders (PDF, Markdown, TXT, DOCX, HTML)
- AST-Aware Code Ingestion: Smart parsing for Python, TypeScript, JavaScript, Java, Go, Rust, C, C++
- Multi-Strategy Retrieval: Semantic, keyword, hybrid, and graph-based search
- OpenAI Embeddings: Uses `text-embedding-3-large` for high-quality embeddings
- Claude Summarization: AI-powered code summaries for better context
- Chroma Vector Store: Persistent, thread-safe vector database
- FastAPI: Modern, high-performance REST API with OpenAPI documentation
Prerequisites
- Python 3.10+
- OpenAI API key (for embeddings)
- Anthropic API key (for summarization)
GraphRAG Configuration (Feature 113)
Agent Brain supports optional GraphRAG (Graph-based Retrieval-Augmented Generation) for enhanced relationship-aware queries.
Enabling GraphRAG
Set the environment variable to enable graph indexing:
```bash
export ENABLE_GRAPH_INDEX=true
```
Configuration Options
| Variable | Default | Description |
|---|---|---|
| GRAPH_STORE_TYPE | simple | Graph backend: simple (JSON) or kuzu (embedded DB) |
| GRAPH_MAX_TRIPLETS_PER_CHUNK | 10 | Maximum entities to extract per document chunk |
| GRAPH_USE_CODE_METADATA | true | Extract relationships from code AST metadata |
| GRAPH_USE_LLM_EXTRACTION | true | Use LLM for entity extraction from documents |
| GRAPH_TRAVERSAL_DEPTH | 2 | Default traversal depth for graph queries |
| ENABLE_GRAPH_INDEX | false | Enable/disable GraphRAG features |
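These variables might be consumed along the following lines. The defaults mirror the table above, but the helper `load_graph_settings` is an illustrative sketch, not Agent Brain's actual configuration code:

```python
import os

def load_graph_settings(env=os.environ):
    """Read GraphRAG settings from environment variables (illustrative)."""
    def flag(name, default):
        # Treat "1", "true", "yes" (any case) as enabled
        return env.get(name, str(default)).strip().lower() in ("1", "true", "yes")
    return {
        "enable_graph_index": flag("ENABLE_GRAPH_INDEX", False),
        "graph_store_type": env.get("GRAPH_STORE_TYPE", "simple"),
        "max_triplets_per_chunk": int(env.get("GRAPH_MAX_TRIPLETS_PER_CHUNK", "10")),
        "use_code_metadata": flag("GRAPH_USE_CODE_METADATA", True),
        "use_llm_extraction": flag("GRAPH_USE_LLM_EXTRACTION", True),
        "traversal_depth": int(env.get("GRAPH_TRAVERSAL_DEPTH", "2")),
    }

settings = load_graph_settings({"ENABLE_GRAPH_INDEX": "true", "GRAPH_STORE_TYPE": "kuzu"})
```

Unset variables fall back to the documented defaults, so a bare `export ENABLE_GRAPH_INDEX=true` is enough to turn the feature on.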
Query Modes
With GraphRAG enabled, you have access to additional query modes:
- `graph`: Query using only the knowledge graph (entity relationships)
- `multi`: Combines vector search, BM25, and graph results using RRF fusion
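The Reciprocal Rank Fusion behind `multi` mode can be sketched as follows. The constant `k = 60` is the value commonly used in the RRF literature, and both the function and the toy result lists are illustrations rather than Agent Brain's implementation:

```python
from collections import defaultdict

def rrf_fuse(rank_lists, k=60):
    """Reciprocal Rank Fusion: merge several best-first ranked lists.

    Each document earns 1 / (k + rank) from every list it appears in,
    so items ranked well by multiple retrievers float to the top.
    """
    scores = defaultdict(float)
    for ranking in rank_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical result lists from the vector, BM25, and graph retrievers
vector = ["auth.py", "login.md", "session.py"]
bm25 = ["login.md", "auth.py", "tokens.md"]
graph = ["session.py", "auth.py"]
fused = rrf_fuse([vector, bm25, graph])  # "auth.py" wins: it appears in all three lists
```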
Example: Graph Query
```bash
# CLI
agent-brain query "authentication service" --mode graph

# API
curl -X POST http://localhost:8000/query \
  -H "Content-Type: application/json" \
  -d '{"query": "authentication service", "mode": "graph", "top_k": 10}'
```
Example: Multi-Mode Query
```bash
# CLI
agent-brain query "user login flow" --mode multi

# API
curl -X POST http://localhost:8000/query \
  -H "Content-Type: application/json" \
  -d '{"query": "user login flow", "mode": "multi", "top_k": 5}'
```
Rebuilding the Graph Index
To rebuild only the graph index without re-indexing documents:
```bash
curl -X POST "http://localhost:8000/index?rebuild_graph=true" \
  -H "Content-Type: application/json" \
  -d '{"folder_path": "."}'
```
Optional Dependencies
For enhanced GraphRAG features, install optional dependency groups:
```bash
# For Kuzu graph store (production workloads)
poetry install --with graphrag-kuzu

# For enhanced entity extraction
poetry install --with graphrag
```
Two-Stage Reranking (Feature 123)
Agent Brain supports optional two-stage retrieval with reranking for improved search precision. When enabled, the system:
- Stage 1: Retrieves more candidates than requested (e.g., 50 candidates for top_k=5)
- Stage 2: Reranks candidates using a cross-encoder model for more accurate relevance scoring
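The two stages can be sketched as below. `two_stage_query` and the toy retriever/scorer are hypothetical stand-ins, with the Stage 1 candidate budget computed as `min(top_k * multiplier, max_candidates)` per the defaults described in the configuration section:

```python
def two_stage_query(retrieve, cross_encoder_score, query, top_k,
                    multiplier=10, max_candidates=100):
    """Sketch of two-stage retrieval with cross-encoder reranking."""
    # Stage 1: over-retrieve candidates (top_k * multiplier, capped)
    n_candidates = min(top_k * multiplier, max_candidates)
    candidates = retrieve(query, n_candidates)
    # Stage 2: rescore every candidate with the cross-encoder, keep the best top_k
    reranked = sorted(candidates,
                      key=lambda doc: cross_encoder_score(query, doc),
                      reverse=True)
    return reranked[:top_k]

# Toy stand-ins: a list-slicing "retriever" and a length-based "scorer"
docs = [f"doc{i}" for i in range(100)]
retrieve = lambda q, n: docs[:n]
score = lambda q, d: len(d)  # longer ids score higher (purely illustrative)
top = two_stage_query(retrieve, score, "auth", top_k=5)
```

In a real deployment the scorer would be a cross-encoder such as `ms-marco-MiniLM-L-6-v2`, which sees the query and document together and so ranks more accurately than the Stage 1 embedding distance.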
Enabling Reranking
Set the following environment variables:
```bash
# Enable two-stage reranking (default: false)
ENABLE_RERANKING=true

# Choose provider (default: sentence-transformers)
RERANKER_PROVIDER=sentence-transformers  # or "ollama"

# Choose model (default: cross-encoder/ms-marco-MiniLM-L-6-v2)
RERANKER_MODEL=cross-encoder/ms-marco-MiniLM-L-6-v2

# Stage 1 retrieval multiplier (default: 10)
RERANKER_TOP_K_MULTIPLIER=10

# Maximum candidates for Stage 1 (default: 100)
RERANKER_MAX_CANDIDATES=100

# Batch size for reranking inference (default: 32)
RERANKER_BATCH_SIZE=32
```
Provider Options
| Provider | Model | Latency | Description |
|---|---|---|---|
| sentence-transformers | cross-encoder/ms-marco-MiniLM-L-6-v2 | ~50ms | Recommended. Fast, accurate cross-encoder. |
| sentence-transformers | cross-encoder/ms-marco-MiniLM-L-12-v2 | ~100ms | Slower but more accurate. |
| ollama | llama3.2:1b | ~500ms | Fully local, no HuggingFace download. |
YAML Configuration
You can also configure reranking in `config.yaml`:

```yaml
reranker:
  provider: sentence-transformers
  model: cross-encoder/ms-marco-MiniLM-L-6-v2
  params:
    batch_size: 32
```
Graceful Degradation
If the reranker fails (model unavailable, timeout, etc.), the system automatically falls back to Stage 1 results. This ensures queries never fail due to reranking issues.
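A minimal sketch of this fallback behavior (the names here are illustrative, not the actual service code):

```python
def rerank_with_fallback(candidates, query, rerank_fn, top_k):
    """Try to rerank; on any failure, serve the Stage 1 order instead."""
    try:
        return rerank_fn(query, candidates)[:top_k]
    except Exception:
        # Model unavailable, timeout, etc.: fall back to Stage 1 results as-is
        return candidates[:top_k]

def broken_reranker(query, docs):
    raise TimeoutError("cross-encoder timed out")

# Stage 1 order survives even though the reranker blew up
results = rerank_with_fallback(["a", "b", "c", "d"], "q", broken_reranker, top_k=2)
```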
Response Fields
When reranking is enabled, query results include additional fields:
- `rerank_score`: The cross-encoder relevance score
- `original_rank`: The position before reranking (1-indexed)
Example response:
```json
{
  "results": [
    {
      "text": "Document content...",
      "source": "docs/guide.md",
      "score": 0.95,
      "rerank_score": 0.95,
      "original_rank": 5,
      "chunk_id": "chunk_abc123"
    }
  ]
}
```
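Because `original_rank` is 1-indexed, comparing it with a hit's new position shows how far reranking moved it. An illustrative snippet over a response shaped like the example above:

```python
# A hypothetical /query response with reranking enabled
response = {
    "results": [
        {"source": "docs/guide.md", "rerank_score": 0.95, "original_rank": 5},
        {"source": "docs/auth.md", "rerank_score": 0.90, "original_rank": 1},
    ]
}

# new_rank is the 1-indexed position after reranking; a positive delta
# means the cross-encoder promoted the chunk past its Stage 1 position
for new_rank, hit in enumerate(response["results"], start=1):
    delta = hit["original_rank"] - new_rank
    print(f'{hit["source"]}: moved {delta:+d} places')
```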
Development Installation
```bash
cd agent-brain-server
poetry install
```
Configuration
Copy the environment template and configure:
```bash
cp ../.env.example .env
# Edit .env with your API keys
```
Required environment variables:
- `OPENAI_API_KEY`: Your OpenAI API key for embeddings
- `ANTHROPIC_API_KEY`: Your Anthropic API key for summarization
Running the Server
```bash
# Development mode
poetry run uvicorn agent_brain_server.api.main:app --reload

# Or use the entry point
poetry run agent-brain-serve
```
API Documentation
Once running, visit:
- Swagger UI: http://127.0.0.1:8000/docs
- ReDoc: http://127.0.0.1:8000/redoc
- OpenAPI JSON: http://127.0.0.1:8000/openapi.json
API Endpoints
Health
- `GET /health` - Server health status
- `GET /health/status` - Detailed indexing status
Indexing
- `POST /index` - Start indexing documents from a folder
- `POST /index/add` - Add documents to an existing index
- `DELETE /index` - Reset the index
Querying
- `POST /query` - Semantic search query
- `GET /query/count` - Get indexed document count
Example Usage
Index Documents
```bash
curl -X POST http://localhost:8000/index \
  -H "Content-Type: application/json" \
  -d '{"folder_path": "/path/to/docs"}'
```
Query Documents
```bash
curl -X POST http://localhost:8000/query \
  -H "Content-Type: application/json" \
  -d '{"query": "How do I configure authentication?", "top_k": 5}'
```
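The same endpoint can be called from Python with only the standard library. `query_agent_brain` is a name of my choosing, and the default `mode` value is an assumption based on the search types above; the endpoint and body fields come from the curl examples:

```python
import json
import urllib.request

def build_query_request(query, top_k=5, mode="hybrid"):
    """Build the JSON body the /query endpoint expects (per the curl examples)."""
    return {"query": query, "top_k": top_k, "mode": mode}

def query_agent_brain(query, top_k=5, mode="hybrid", base_url="http://localhost:8000"):
    """POST the query to a running Agent Brain server and return the parsed response."""
    body = json.dumps(build_query_request(query, top_k, mode)).encode()
    req = urllib.request.Request(
        f"{base_url}/query",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Needs a running server, e.g.:
# hits = query_agent_brain("How do I configure authentication?", top_k=5)
```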
Architecture
```
agent_brain_server/
├── api/
│   ├── main.py              # FastAPI application
│   └── routers/             # Endpoint handlers
├── config/
│   └── settings.py          # Configuration management
├── models/                  # Pydantic request/response models
├── indexing/
│   ├── document_loader.py   # Document loading
│   ├── chunking.py          # Text chunking
│   └── embedding.py         # Embedding generation
├── services/
│   ├── indexing_service.py  # Indexing orchestration
│   └── query_service.py     # Query execution
└── storage/
    └── vector_store.py      # Chroma vector store
```
Development
Running Tests
```bash
poetry run pytest
```
Code Formatting
```bash
poetry run black agent_brain_server/
poetry run ruff check agent_brain_server/
```
Type Checking
```bash
poetry run mypy agent_brain_server/
```
Documentation
- User Guide - Getting started and usage
- Developer Guide - Contributing and development
- API Reference - Full API documentation
Release Information
- Current Version: See pyproject.toml
- Release Notes: GitHub Releases
- Changelog: Latest Release
Related Packages
- agent-brain-cli - Command-line interface for Agent Brain
License
MIT