Aquiles-RAG is a high-performance Retrieval-Augmented Generation (RAG) solution built on Redis. It offers a high-level interface through FastAPI REST APIs.

Aquiles‑RAG

High‑performance Retrieval‑Augmented Generation (RAG) on Redis
🚀 FastAPI • Redis Vector Search • Async • Embedding‑agnostic

📖 Documentation

📑 Table of Contents

  1. Features

  2. Tech Stack

  3. Requirements

  4. Installation

  5. Configuration & Connection Options

  6. Usage

  7. Architecture

  8. License

⭐ Features

  • 📈 High Performance: Redis-powered vector search using HNSW.
  • 🛠️ Simple API: Endpoints for index creation, insertion, and querying.
  • 🔌 Embedding‑agnostic: Works with any embedding model (OpenAI, Llama 3, etc.).
  • 💻 Integrated CLI: Configure and serve with built‑in commands.
  • 🧩 Extensible: Ready to integrate into ML pipelines or microservices.

🛠 Tech Stack

⚙️ Requirements

  1. Redis (standalone or cluster)
  2. Python 3.9+
  3. pip

Optional: Run Redis with Docker:

docker run -d --name redis-stack -p 6379:6379 redis/redis-stack-server:latest

🚀 Installation

Via PyPI

The easiest way is to install directly from PyPI:

pip install aquiles-rag

From Source (optional)

If you’d like to work from the latest code or contribute:

  1. Clone the repository and navigate into it:

    git clone https://github.com/Aquiles-ai/Aquiles-RAG.git
    cd Aquiles-RAG
    
  2. Create a virtual environment and install dependencies:

    python -m venv .venv
    source .venv/bin/activate
    pip install -r requirements.txt
    
  3. (Optional) Install in editable/development mode:

    pip install -e .
    

🔧 Configuration & Connection Options

Aquiles‑RAG stores its configuration in:

~/.local/share/aquiles/aquiles_config.json

By default, it uses:

{
  "local": true,
  "host": "localhost",
  "port": 6379,
  "username": "",
  "password": "",
  "cluster_mode": false,
  "tls_mode": false,
  "ssl_certfile": "",
  "ssl_keyfile": "",
  "ssl_ca_certs": "",
  "allows_api_keys": [""],
  "allows_users": [{"username": "root", "password": "root"}]
}

You can modify the config file manually or use the CLI:

aquiles-rag configs --host redis.example.com --port 6380 --username user --password pass
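For scripted setups, the same file can also be read with the Python standard library. The following is a minimal sketch, not part of the Aquiles‑RAG API: `load_config` and `DEFAULTS` are hypothetical names, and layering stored values over defaults is an assumption made for illustration.

```python
import json
import os
from pathlib import Path

# Location of the config file as given above.
CONFIG_PATH = Path(os.path.expanduser("~/.local/share/aquiles/aquiles_config.json"))

# A subset of the default values listed above.
DEFAULTS = {
    "local": True,
    "host": "localhost",
    "port": 6379,
    "username": "",
    "password": "",
    "cluster_mode": False,
    "tls_mode": False,
}


def load_config(path=CONFIG_PATH):
    """Return stored settings layered over DEFAULTS (hypothetical helper)."""
    path = Path(path)
    stored = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    return {**DEFAULTS, **stored}
```

Calling `load_config()` on a machine without a config file simply returns the defaults; keys present in the file override them.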

Redis Connection Modes

Aquiles‑RAG supports four modes to connect to Redis, based on your config:

  1. Local Cluster (local=true, cluster_mode=true)

    RedisCluster(host=host, port=port, decode_responses=True)
    
  2. Standalone Local (local=true, cluster_mode=false)

    redis.Redis(host=host, port=port, decode_responses=True)
    
  3. Remote with TLS/SSL (local=false, tls_mode=true)

    redis.Redis(
      host=host,
      port=port,
      username=username or None,
      password=password or None,
      ssl=True,
      decode_responses=True,
      ssl_certfile=ssl_certfile,  # if provided
      ssl_keyfile=ssl_keyfile,    # if provided
      ssl_ca_certs=ssl_ca_certs   # if provided
    )
    
  4. Remote without TLS/SSL (local=false, tls_mode=false)

    redis.Redis(
      host=host,
      port=port,
      username=username or None,
      password=password or None,
      decode_responses=True
    )
    

Together, these modes cover local standalone, local cluster, and remote Redis deployments, with or without TLS.
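The selection logic behind the four modes can be sketched as a pure function that maps a config dict to a mode name and the keyword arguments shown above. This is an illustration only: `connection_kwargs` and the mode labels are hypothetical, and the order of the checks (cluster first, then local, then TLS) is an assumption rather than the project's actual code.

```python
def connection_kwargs(cfg):
    """Map a config dict to (mode, client kwargs) for the four modes above."""
    kwargs = {"host": cfg["host"], "port": cfg["port"], "decode_responses": True}

    if cfg.get("local") and cfg.get("cluster_mode"):
        return "local-cluster", kwargs
    if cfg.get("local"):
        return "local-standalone", kwargs

    # Remote modes pass credentials; empty strings become None.
    kwargs["username"] = cfg.get("username") or None
    kwargs["password"] = cfg.get("password") or None

    if cfg.get("tls_mode"):
        kwargs["ssl"] = True
        # Certificate paths are forwarded only when provided.
        for key in ("ssl_certfile", "ssl_keyfile", "ssl_ca_certs"):
            if cfg.get(key):
                kwargs[key] = cfg[key]
        return "remote-tls", kwargs
    return "remote-plain", kwargs
```

The returned kwargs could then be passed to `redis.Redis(**kwargs)` or, for the cluster mode, to `RedisCluster(**kwargs)`.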

📖 Usage

CLI

  • Save configs

    aquiles-rag configs --host "127.0.0.1" --port 6379
    
  • Serve the API

    aquiles-rag serve --host "0.0.0.0" --port 5500
    
  • Deploy custom config

    aquiles-rag deploy --host "0.0.0.0" --port 5500 --workers 4 my_config.py
    

REST API

  1. Create Index

    curl -X POST http://localhost:5500/create/index \
      -H "X-API-Key: YOUR_API_KEY" \
      -H 'Content-Type: application/json' \
      -d '{
        "indexname": "documents",
        "embeddings_dim": 768,
        "dtype": "FLOAT32",
        "delete_the_index_if_it_exists": false
      }'
    
  2. Insert Chunk

    curl -X POST http://localhost:5500/rag/create \
      -H "X-API-Key: YOUR_API_KEY" \
      -H 'Content-Type: application/json' \
      -d '{
        "index": "documents",
        "name_chunk": "doc1_part1",
        "dtype": "FLOAT32",
        "chunk_size": 1024,
        "raw_text": "Text of the chunk...",
        "embeddings": [0.12, 0.34, 0.56, ...]
      }'
    
  3. Query Top‑K

    curl -X POST http://localhost:5500/rag/query-rag \
      -H "X-API-Key: YOUR_API_KEY" \
      -H 'Content-Type: application/json' \
      -d '{
        "index": "documents",
        "embeddings": [0.78, 0.90, ...],
        "dtype": "FLOAT32",
        "top_k": 5,
        "cosine_distance_threshold": 0.6
      }'
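The same calls can be made from Python without extra dependencies. Below is a minimal sketch using `urllib.request`; `build_request` and `post_json` are hypothetical helper names, and the assumption that the server answers with a JSON body is not confirmed by this README.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5500"  # server address used in the curl examples


def build_request(path, payload, api_key):
    """Build a JSON POST request carrying the X-API-Key header."""
    return urllib.request.Request(
        url=BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def post_json(path, payload, api_key):
    """Send the request; assumes (unverified) that the response body is JSON."""
    with urllib.request.urlopen(build_request(path, payload, api_key)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With a server running, `post_json("/create/index", {"indexname": "documents", "embeddings_dim": 768, "dtype": "FLOAT32", "delete_the_index_if_it_exists": False}, "YOUR_API_KEY")` mirrors the first curl command.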
    

Python Client

from aquiles.client import AquilesRAG

client = AquilesRAG(host="http://127.0.0.1:5500", api_key="YOUR_API_KEY")

# Create an index
client.create_index("documents", embeddings_dim=768, dtype="FLOAT32")

# Insert chunks using your embedding function
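def get_embedding(text):
    # embedding_model is a placeholder for your own model (OpenAI, Llama 3, etc.)
    return embedding_model.encode(text)

full_text = "..."  # placeholder: the document text to index

responses = client.send_rag(
    embedding_func=get_embedding,
    index="documents",
    name_chunk="doc1",
    raw_text=full_text
)

# Query the index
query_embedding = get_embedding("...")  # placeholder: embed your question with the same model
results = client.query("documents", query_embedding, top_k=5)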
print(results)
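`send_rag` receives the full text plus an embedding function; how it splits the text internally is not shown here. The general pattern (split into fixed-size chunks, embed each one, build one `/rag/create`-style payload per chunk) might look like the sketch below. `chunk_text`, `build_payloads`, the `_{n}` naming scheme, and reading `chunk_size` as a character count are all assumptions for illustration, not the client's actual behaviour.

```python
def chunk_text(text, chunk_size=1024):
    """Split text into consecutive pieces of at most chunk_size characters."""
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


def build_payloads(index, name_chunk, raw_text, embedding_func, chunk_size=1024):
    """Create one /rag/create-style payload per chunk (illustrative only)."""
    payloads = []
    for n, piece in enumerate(chunk_text(raw_text, chunk_size), start=1):
        payloads.append({
            "index": index,
            "name_chunk": f"{name_chunk}_{n}",  # hypothetical naming scheme
            "dtype": "FLOAT32",
            "chunk_size": chunk_size,
            "raw_text": piece,
            "embeddings": list(embedding_func(piece)),
        })
    return payloads


if __name__ == "__main__":
    # Toy embedding for demonstration: a 3-dimensional vector from the text length.
    def toy_embed(text):
        return [len(text) / 1000.0, 0.0, 1.0]

    for payload in build_payloads("documents", "doc1", "x" * 2500, toy_embed):
        print(payload["name_chunk"], len(payload["raw_text"]))
```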

UI Playground

Access the web UI (with basic auth) at:

http://localhost:5500/ui

Use it to:

  • Edit configurations live
  • Test /create/index, /rag/create, /rag/query-rag
  • Explore protected Swagger UI & ReDoc docs

🚀 Screenshots

  1. Playground Home
  2. Live Configurations
  3. Creating an Index
  4. Adding Data to RAG
  5. Querying RAG Results

🏗 Architecture

The following diagram shows the high‑level architecture of Aquiles‑RAG:

[Figure: Aquiles‑RAG architecture diagram]

  1. Clients (HTTP/HTTPS, Python SDK, or UI Playground) make asynchronous HTTP requests.
  2. FastAPI Server acts as the orchestration and business‑logic layer, validating requests and translating them to vector store commands.
  3. Redis / RedisCluster serves as the RAG vector store (HASH + HNSW/COSINE search).
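With the COSINE metric, hits are ranked by cosine distance, that is, 1 minus the cosine similarity of the query and stored vectors. The sketch below shows the exact (brute-force) version of the search that HNSW approximates. `cosine_distance` and `top_k` are hypothetical names, and treating the threshold as "keep hits at or below this distance" is an assumption about how `cosine_distance_threshold` is applied.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / norm


def top_k(query, chunks, k=5, threshold=None):
    """Exact nearest-neighbour search over {name: vector}; HNSW approximates this."""
    scored = sorted((cosine_distance(query, vec), name) for name, vec in chunks.items())
    if threshold is not None:
        # Assumed semantics: keep only hits at or below the distance threshold.
        scored = [(dist, name) for dist, name in scored if dist <= threshold]
    return [(name, dist) for dist, name in scored[:k]]
```

A distance of 0 means the vectors point in the same direction, 1 means they are orthogonal, so a lower threshold keeps only closer matches.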

Test Suite: See the test/ directory for automated tests:

  • Client tests for the Python SDK
  • API tests for endpoint behavior
  • test_deploy.py for deployment configuration and startup validation

📄 License

MIT License
