Aquiles-RAG is a high-performance Retrieval-Augmented Generation (RAG) solution built on Redis, Qdrant, or PostgreSQL. It exposes a high-level interface through FastAPI REST APIs.

Aquiles-RAG

Self-hosted RAG infrastructure with MCP Server support
🚀 FastAPI • Redis / Qdrant / PostgreSQL • Async • Embedding-agnostic • MCP Ready

🎯 What is Aquiles-RAG?

Aquiles-RAG is a production-ready RAG (Retrieval-Augmented Generation) API server that brings high-performance vector search to your applications. Choose your backend (Redis, Qdrant, or PostgreSQL), connect your embedding model, and start building intelligent search systems in minutes.

Why Aquiles-RAG?

Challenge	Aquiles-RAG Solution
💸 Expensive vector databases	Use Redis, Qdrant, or PostgreSQL you already have
🔒 Data leaves your infrastructure	Everything runs on your servers
🔧 Complex RAG setup	Interactive wizard configures everything
🐌 Slow integrations	Async clients, batch operations, optimized pipelines
🚫 Vendor lock-in	Switch backends without changing code

Key Features

🔌 Backend Flexibility - Redis HNSW, Qdrant, or PostgreSQL pgvector
⚡ High Performance - Async operations, batch processing, optimized search
🤖 MCP Server Built-in - Native Model Context Protocol support for AI assistants
🛠️ Interactive Setup - CLI wizard configures your entire stack
🔄 Sync & Async Clients - Python and TypeScript/JavaScript SDKs included
📊 Optional Re-ranking - Improve results with semantic re-scoring

🚀 Quick Start

Installation

pip install aquiles-rag

Interactive Setup

Configure your vector database in seconds:

aquiles-rag configs

The wizard guides you through:

Backend selection (Redis, Qdrant, or PostgreSQL)
Connection settings (host, port, credentials)
TLS/gRPC options
Optional re-ranker configuration

Start Server

aquiles-rag serve --host "0.0.0.0" --port 5500

Your First RAG Query

from aquiles.client import AquilesRAG

client = AquilesRAG(host="http://127.0.0.1:5500", api_key="YOUR_API_KEY")

# Create index
client.create_index("documents", embeddings_dim=768, dtype="FLOAT32")

# Store document with your embedding function
# (your_embedding_model is a placeholder for the model you load yourself)
def get_embedding(text):
    return your_embedding_model.encode(text)

client.send_rag(
    embedding_func=get_embedding,
    index="documents",
    name_chunk="intro",
    raw_text="Your document text here..."
)

# Query: embed the question with the same model, then search
query_embedding = get_embedding("What is this document about?")
results = client.query("documents", query_embedding, top_k=5)
print(results)

That's it! You now have a working RAG system.

🎨 Supported Backends

Backend	Features	Best For
Redis	HNSW indexing, fast in-memory search	Speed-critical applications
Qdrant	HTTP/gRPC, collections, filters	Scalable production systems
PostgreSQL	pgvector extension, SQL integration	Existing Postgres infrastructure

All backends support:

Vector similarity search (cosine, inner product)
Metadata filtering
Batch operations
Optional re-ranking
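
As a rough illustration of the two similarity measures listed above, the following standard-library sketch computes inner product and cosine similarity for plain Python lists. The function names are illustrative and not part of the Aquiles-RAG API; in practice the backends compute these inside the vector store.

```python
import math
from typing import Sequence


def inner_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of element-wise products; sensitive to vector magnitude."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of the two vectors after scaling each to unit length."""
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return inner_product(a, b) / (norm_a * norm_b)


query = [1.0, 0.0]
doc_same_direction = [3.0, 0.0]  # same direction, larger magnitude
doc_orthogonal = [0.0, 2.0]

print(inner_product(query, doc_same_direction))      # 3.0
print(cosine_similarity(query, doc_same_direction))  # 1.0
print(cosine_similarity(query, doc_orthogonal))      # 0.0
```

When embeddings are already normalized to unit length, both measures produce the same ranking, so the choice mainly matters for models that emit unnormalized vectors.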

🤖 MCP Server Integration

Aquiles-RAG includes a built-in Model Context Protocol server for seamless AI assistant integration.

Start MCP Server

aquiles-rag mcp-serve --host "0.0.0.0" --port 5500 --transport "sse"

Example with OpenAI Agent

import asyncio

from agents import Agent, Runner
from agents.mcp import MCPServerSse

async def main():
    # Connect to MCP server
    mcp_server = MCPServerSse({
        "url": "http://localhost:5500/sse",
        "headers": {"X-API-Key": "YOUR_API_KEY"}
    })
    await mcp_server.connect()

    # Create agent with RAG tools
    agent = Agent(
        name="RAG Assistant",
        instructions="You can store and query documents using the vector database.",
        mcp_servers=[mcp_server],
        model="gpt-4"
    )

    # Agent now has access to:
    # - create_index
    # - send_info (store documents)
    # - query_rag (semantic search)
    # - list_indexes
    # - delete_index

    result = await Runner.run(agent, "Store this document and find similar content")
    print(result.final_output)

# await only works inside a coroutine, so run everything through asyncio
asyncio.run(main())

MCP Tools Available:

Index management (create, list, delete)
Document ingestion with automatic chunking
Semantic search with configurable parameters
Metadata filtering
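
The chunking strategy used during ingestion is not described here, so the following is only a minimal sketch of one common approach: fixed-size character windows with overlap. The helper `chunk_text`, its `chunk_size` and `overlap` parameters, and the `name_1`, `name_2`, ... naming scheme are all assumptions made for illustration.

```python
from typing import List, Tuple


def chunk_text(name: str, text: str, chunk_size: int = 200, overlap: int = 50) -> List[Tuple[str, str]]:
    """Split text into overlapping character windows as (chunk_name, chunk_text) pairs."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap
    chunks = []
    for i, start in enumerate(range(0, len(text), step), start=1):
        chunks.append((f"{name}_{i}", text[start:start + chunk_size]))
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks


pieces = chunk_text("intro", "abcdefghij", chunk_size=4, overlap=1)
for chunk_name, chunk in pieces:
    print(chunk_name, chunk)
```

The overlap keeps a little shared context between neighbouring chunks, which can help when a relevant sentence straddles a chunk boundary.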

💻 Client SDKs

Python - Async Client

import asyncio

from aquiles.client import AsyncAquilesRAG

client = AsyncAquilesRAG(host="http://127.0.0.1:5500", api_key="YOUR_API_KEY")

async def main():
    # Create index
    await client.create_index("docs", embeddings_dim=1536)

    # Store documents (parallel chunking)
    # async_get_embedding and long_text are placeholders for your own
    # async embedding function and document text
    await client.send_rag(
        embedding_func=async_get_embedding,
        index="docs",
        name_chunk="document_1",
        raw_text=long_text,
        metadata={
            "author": "John Doe",
            "source": "documentation"
        }
    )

    # Query: embed the question first, then search
    query_embedding = await async_get_embedding("What is this about?")
    results = await client.query("docs", query_embedding, top_k=5)
    print(results)

asyncio.run(main())
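
The "parallel chunking" mentioned in the example above could plausibly work along the following lines; this is a sketch of the general pattern, not the client's actual implementation. `fake_embedding` stands in for a real embedding call, and `embed_chunks` and `max_concurrency` are hypothetical names.

```python
import asyncio
from typing import Awaitable, Callable, List


async def fake_embedding(text: str) -> List[float]:
    """Stand-in for a real async embedding call (hypothetical)."""
    await asyncio.sleep(0)  # yield control, as a network call would
    return [float(len(text)), float(text.count(" "))]


async def embed_chunks(
    chunks: List[str],
    embedding_func: Callable[[str], Awaitable[List[float]]],
    max_concurrency: int = 4,
) -> List[List[float]]:
    """Embed chunks concurrently with at most max_concurrency calls in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def embed_one(chunk: str) -> List[float]:
        async with semaphore:
            return await embedding_func(chunk)

    # gather preserves input order, so vectors[i] belongs to chunks[i]
    return await asyncio.gather(*(embed_one(c) for c in chunks))


vectors = asyncio.run(embed_chunks(["first chunk", "second"], fake_embedding))
print(vectors)
```

Bounding concurrency with a semaphore is a common way to avoid hitting rate limits on a hosted embedding API while still overlapping network waits.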

TypeScript/JavaScript

npm install @aquiles-ai/aquiles-rag-client

import { AsyncAquilesRAG } from '@aquiles-ai/aquiles-rag-client';
import OpenAI from 'openai';

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

async function getEmbedding(text: string): Promise<number[]> {
    const resp = await openai.embeddings.create({
        model: "text-embedding-3-small",
        input: text,
    });
    return resp.data[0].embedding;
}

const client = new AsyncAquilesRAG({
    host: 'http://127.0.0.1:5500',
    apiKey: 'your-api-key',
});

// Create index (1536 dimensions for text-embedding-3-small)
await client.createIndex('my_docs', 1536, 'FLOAT32');

// Store document
await client.sendRAG(
    getEmbedding,
    'my_docs',
    'doc_1',
    'Your document text...',
    {
        embeddingModel: 'text-embedding-3-small',
        metadata: { author: 'John Doe' }
    }
);

// Query
const queryEmb = await getEmbedding('What is this about?');
const results = await client.query('my_docs', queryEmb, { topK: 5 });
console.log(results);

🛠️ Advanced Features

Optional Re-ranking

Improve search results with semantic re-scoring:

# Enable during setup wizard
aquiles-rag configs

Re-ranking refines results after vector search by scoring (query, document) pairs for better relevance.
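
To make the idea of scoring (query, document) pairs concrete, here is a toy sketch. The word-overlap scorer is only a stand-in chosen so the example runs without a model; real re-rankers typically use a learned model, and the names `overlap_score` and `rerank` are hypothetical rather than part of Aquiles-RAG.

```python
from typing import Callable, List


def overlap_score(query: str, document: str) -> float:
    """Toy relevance score: Jaccard overlap of lowercase word sets."""
    q, d = set(query.lower().split()), set(document.lower().split())
    return len(q & d) / len(q | d) if q | d else 0.0


def rerank(
    query: str,
    candidates: List[str],
    score: Callable[[str, str], float] = overlap_score,
) -> List[str]:
    """Re-order vector-search candidates by their (query, document) score, best first."""
    return sorted(candidates, key=lambda doc: score(query, doc), reverse=True)


candidates = [
    "Redis stores data in memory",
    "How to configure the Redis backend",
    "PostgreSQL requires the pgvector extension",
]
print(rerank("configure redis backend", candidates))
```

Because the scorer sees the query and document together, it can promote a candidate that vector search ranked lower, at the cost of one scoring call per candidate.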

Web UI Playground

Access the interactive UI:

http://localhost:5500/ui

Features:

Test index creation and queries
Inspect live configurations
Protected Swagger UI documentation
Real-time request/response monitoring

🏗️ Architecture

┌─────────────────────────────────────────────────────────────┐
│                         Clients                              │
│  HTTP/HTTPS • Python SDK • TypeScript SDK • MCP Server       │
└──────────────────────┬──────────────────────────────────────┘
                       │
┌──────────────────────▼──────────────────────────────────────┐
│                    FastAPI Server                            │
│  • Request validation                                        │
│  • Business logic orchestration                              │
│  • Optional re-ranking                                       │
└──────────────────────┬──────────────────────────────────────┘
                       │
┌──────────────────────▼──────────────────────────────────────┐
│                   Vector Store                               │
│  Redis HNSW  •  Qdrant Collections  •  PostgreSQL pgvector  │
└─────────────────────────────────────────────────────────────┘

Flow:

Client sends embedding + query parameters
Server validates and routes to vector store
Vector store returns top-k candidates
Optional re-ranker refines results
Formatted response returned to client
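
The flow above can be sketched end to end with an in-memory stand-in for the vector store. Everything here is an assumption for illustration: the `STORE` dictionary, the `handle_query` function, the brute-force inner-product search, and the response shape are not taken from the Aquiles-RAG server.

```python
import heapq
from typing import Callable, Dict, List, Optional

# In-memory stand-in for the vector store: chunk name -> (embedding, text)
STORE = {
    "a": ([1.0, 0.0], "alpha"),
    "b": ([0.6, 0.8], "beta"),
    "c": ([0.0, 1.0], "gamma"),
}


def handle_query(
    embedding: List[float],
    top_k: int,
    reranker: Optional[Callable[[List[Dict]], List[Dict]]] = None,
    dim: int = 2,
) -> Dict:
    # Steps 1-2: the client's embedding and parameters arrive and are validated
    if len(embedding) != dim or top_k < 1:
        raise ValueError("invalid embedding dimension or top_k")

    # Step 3: the store returns the top-k candidates (brute-force inner product here)
    scored = (
        {"name_chunk": name, "text": text, "score": sum(x * y for x, y in zip(embedding, vec))}
        for name, (vec, text) in STORE.items()
    )
    candidates = heapq.nlargest(top_k, scored, key=lambda hit: hit["score"])

    # Step 4: optional re-ranking of the candidates
    if reranker is not None:
        candidates = reranker(candidates)

    # Step 5: formatted response
    return {"status": "ok", "total": len(candidates), "results": candidates}


response = handle_query([1.0, 0.0], top_k=2)
print([hit["name_chunk"] for hit in response["results"]])  # ['a', 'b']
```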

🎯 Use Cases

Who	What
🚀 AI Startups	Build RAG features without vendor costs
👨‍💻 Developers	Prototype semantic search quickly
🏢 Enterprises	Private, scalable document search
🔬 Researchers	Experiment with embeddings and retrieval

📋 Requirements

Python 3.9+
One of: Redis, Qdrant, or PostgreSQL with pgvector
pip or uv

Quick Redis Setup (Docker):

docker run -d --name redis-stack -p 6379:6379 redis/redis-stack-server:latest

PostgreSQL Note: Aquiles-RAG doesn't run automatic migrations. Create the pgvector extension and required tables manually before use.

🛠️ Tech Stack

FastAPI - High-performance async API framework
Redis / Qdrant / PostgreSQL - Vector storage backends
NumPy - Efficient array operations
Pydantic - Request/response validation
HTTPX - Async HTTP client
Click - CLI framework

📚 REST API Examples

Create Index

curl -X POST http://localhost:5500/create/index \
  -H "X-API-Key: YOUR_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{
    "indexname": "documents",
    "embeddings_dim": 768,
    "dtype": "FLOAT32"
  }'
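
For callers who prefer Python over curl, the same request can be built with the standard library. The endpoint, header, and payload fields mirror the curl command above; `build_request` and `BASE_URL` are hypothetical helper names, and the network call is left commented out because it needs a running server.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5500"


def build_request(path: str, payload: dict, api_key: str) -> urllib.request.Request:
    """Build a JSON POST request carrying the X-API-Key header."""
    return urllib.request.Request(
        url=f"{BASE_URL}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


request = build_request(
    "/create/index",
    {"indexname": "documents", "embeddings_dim": 768, "dtype": "FLOAT32"},
    api_key="YOUR_API_KEY",
)
print(request.get_method(), request.full_url)

# Sending requires a running server, so it is left commented out:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

The same helper can be pointed at the other endpoints in this section by changing the path and payload.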

Insert Document

curl -X POST http://localhost:5500/rag/create \
  -H "X-API-Key: YOUR_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{
    "index": "documents",
    "name_chunk": "doc1_part1",
    "raw_text": "Document content...",
    "embeddings": [0.12, 0.34, ...]
  }'

Query

curl -X POST http://localhost:5500/rag/query-rag \
  -H "X-API-Key: YOUR_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{
    "index": "documents",
    "embeddings": [0.78, 0.90, ...],
    "top_k": 5,
    "cosine_distance_threshold": 0.6
  }'
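
The exact semantics of `cosine_distance_threshold` are not spelled out here. The sketch below shows one plausible reading, assuming distance is 1 minus cosine similarity and that candidates farther than the threshold are dropped; check the server's behavior before relying on it. The `apply_threshold` helper and the `distance` field are hypothetical.

```python
from typing import Dict, List


def apply_threshold(hits: List[Dict], top_k: int, cosine_distance_threshold: float) -> List[Dict]:
    """Keep hits within the distance threshold, closest first, at most top_k."""
    kept = [hit for hit in hits if hit["distance"] <= cosine_distance_threshold]
    kept.sort(key=lambda hit: hit["distance"])
    return kept[:top_k]


# Hypothetical candidates; "distance" is assumed to be 1 - cosine similarity
hits = [
    {"name_chunk": "doc1_part1", "distance": 0.15},
    {"name_chunk": "doc2_part3", "distance": 0.72},
    {"name_chunk": "doc1_part2", "distance": 0.40},
]
filtered = apply_threshold(hits, top_k=5, cosine_distance_threshold=0.6)
print([hit["name_chunk"] for hit in filtered])  # ['doc1_part1', 'doc1_part2']
```

Under this reading, a query can return fewer than `top_k` results when few stored chunks are close enough to the query embedding.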

⚠️ Backend Notes

Redis:

Fast in-memory HNSW indexing
Full metrics via /status/ram
Supports HASH storage with COSINE search

Qdrant:

HTTP or gRPC connections
Collection-based organization
Limited metrics compared to Redis

PostgreSQL:

Requires manual pgvector setup
No automatic migrations
SQL-native filtering and joins
Check Postgres monitoring for metrics

📖 Documentation

Full Documentation

🤝 Contributing

We welcome contributions! See the test suite in test/ for examples:

Client SDK tests
API endpoint tests
Deployment validation

📄 License

Apache License

⭐ Star this project • 🐛 Report issues

Built with ❤️ for the AI community

Release history

Version	Release date
0.5.5 (this version)	Jan 6, 2026
0.5.0	Nov 29, 2025
0.4.54	Nov 22, 2025
0.4.53	Nov 21, 2025
0.4.52	Nov 21, 2025
0.4.51	Nov 21, 2025
0.4.5	Nov 20, 2025
0.4.2	Nov 15, 2025
0.4.0	Sep 7, 2025
0.3.75	Sep 1, 2025
0.3.72	Aug 29, 2025
0.3.7	Aug 29, 2025
0.3.6	Aug 29, 2025
0.3.4	Aug 28, 2025
0.3.3	Aug 21, 2025
0.3.2	Aug 20, 2025
0.3.1	Aug 18, 2025
0.3.0	Aug 18, 2025
0.2.9	Aug 14, 2025
0.2.8.5	Aug 11, 2025
0.2.8	Aug 9, 2025
0.2.7.1	Aug 6, 2025
0.2.7	Aug 6, 2025
0.2.6.1	Aug 3, 2025
0.2.6	Jul 31, 2025
0.2.5.4	Jul 30, 2025
0.2.5.3	Jul 30, 2025
0.2.5.2	Jul 29, 2025
0.2.5.1	Jul 29, 2025
0.2.5	Jul 27, 2025
0.2.2	Jul 25, 2025
0.2.1	Jul 25, 2025
0.2.0	Jul 24, 2025
0.1.9.1	Jul 24, 2025
0.1.9	Jul 23, 2025
0.1.8	Jul 23, 2025

Download files

Source Distribution

aquiles_rag-0.5.5.tar.gz (601.5 kB)

Uploaded Jan 6, 2026 Source

Built Distribution

aquiles_rag-0.5.5-py3-none-any.whl (585.6 kB)

Uploaded Jan 6, 2026 Python 3

File details

Details for the file aquiles_rag-0.5.5.tar.gz.

File metadata

Download URL: aquiles_rag-0.5.5.tar.gz
Upload date: Jan 6, 2026
Size: 601.5 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.12.3

File hashes

Hashes for aquiles_rag-0.5.5.tar.gz
Algorithm	Hash digest
SHA256	`33be2891f94906b28f31a1efaed90aa5e647ca4dfc1b433f15a4d20ef58e2323`
MD5	`4b63ddeae7675a4ec662968c5a6ff286`
BLAKE2b-256	`1d9ed124ae1822739e822c9fb3b7950c21a333247171d71b314d2033e4e10ce6`

File details

Details for the file aquiles_rag-0.5.5-py3-none-any.whl.

File metadata

Download URL: aquiles_rag-0.5.5-py3-none-any.whl
Upload date: Jan 6, 2026
Size: 585.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.12.3

File hashes

Hashes for aquiles_rag-0.5.5-py3-none-any.whl
Algorithm	Hash digest
SHA256	`98d25afcf64f463aa75314ede1dcd8af6eb9da81092ded34c36e03aab14184d2`
MD5	`10f41eeea318a269e3ecdff50cd69621`
BLAKE2b-256	`c885eb44e60687c7862042c66204bc8a1e4ce29b9f5e0dd31a545ae2af5ff030`
