Automatic RAG Pattern Optimization Engine

ragit

A Python toolkit for building Retrieval-Augmented Generation (RAG) applications. Ragit provides document loading, chunking, vector search, and LLM integration out of the box, allowing you to build document Q&A systems and code generators with minimal boilerplate.

Table of Contents

  1. Installation
  2. Configuration
  3. Tutorial: Using Ragit
  4. Tutorial: Platform Integration
  5. Advanced: Hyperparameter Optimization
  6. API Reference
  7. License

Installation

pip install ragit

Ragit requires an Ollama-compatible API for embeddings and LLM inference. You can use:

  • A local Ollama instance (https://ollama.ai)
  • A cloud-hosted Ollama API
  • Any OpenAI-compatible API endpoint

Configuration

Ragit reads configuration from environment variables. Create a .env file in your project root:

# LLM API (cloud or local)
OLLAMA_BASE_URL=https://your-ollama-api.com
OLLAMA_API_KEY=your-api-key

# Embedding API (can be different from LLM)
OLLAMA_EMBEDDING_URL=http://localhost:11434

# Default models
RAGIT_DEFAULT_LLM_MODEL=llama3.1:8b
RAGIT_DEFAULT_EMBEDDING_MODEL=mxbai-embed-large

A common setup is to use a cloud API for LLM inference (faster, more capable models) while running embeddings locally (lower latency, no API costs for indexing).
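As an illustration of how the variables above could be consumed, here is a minimal sketch that collects them into a settings object. The `Settings` class, the `load_settings` helper, and the fallback values are assumptions made for this example; they are not ragit's internal API, and ragit's actual defaults may differ.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the variables shown in the .env example."""
    llm_base_url: str
    embedding_url: str
    llm_model: str
    embedding_model: str
    api_key: Optional[str] = None


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from a mapping of environment variables."""
    env = os.environ if env is None else env
    # Assumption: fall back to Ollama's default local port when no URL is set.
    llm_base_url = env.get("OLLAMA_BASE_URL", "http://localhost:11434")
    return Settings(
        llm_base_url=llm_base_url,
        # Assumption: reuse the LLM URL when no separate embedding URL is given.
        embedding_url=env.get("OLLAMA_EMBEDDING_URL", llm_base_url),
        llm_model=env.get("RAGIT_DEFAULT_LLM_MODEL", "llama3.1:8b"),
        embedding_model=env.get("RAGIT_DEFAULT_EMBEDDING_MODEL", "mxbai-embed-large"),
        api_key=env.get("OLLAMA_API_KEY"),
    )
```

Passing an explicit mapping instead of reading `os.environ` directly keeps the helper easy to test.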

Tutorial: Using Ragit

This section covers the core functionality of ragit: loading documents, creating a RAG assistant, and querying your knowledge base.

Loading Documents

Ragit provides several functions for loading and chunking documents.

Loading a single file:

from ragit import load_text

doc = load_text("docs/api-reference.md")
print(doc.id)       # "api-reference"
print(doc.content)  # Full file contents

Loading a directory:

from ragit import load_directory

# Load all markdown files
docs = load_directory("docs/", "*.md")

# Load recursively
docs = load_directory("docs/", "**/*.md", recursive=True)

# Load multiple file types
txt_docs = load_directory("docs/", "*.txt")
rst_docs = load_directory("docs/", "*.rst")
all_docs = txt_docs + rst_docs
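To see which files a given pattern and `recursive` flag would select before loading them, the standard library's `glob` module offers similar semantics. This is only a sketch: `find_files` is a hypothetical helper, and whether `load_directory` uses `glob` internally is not stated here.

```python
import glob
import os


def find_files(root, pattern, recursive=False):
    """Return the files under root that match a glob pattern, sorted by path."""
    # With recursive=True, "**" matches zero or more nested directories.
    paths = glob.glob(os.path.join(root, pattern), recursive=recursive)
    return sorted(p for p in paths if os.path.isfile(p))


if __name__ == "__main__":
    for path in find_files("docs", "**/*.md", recursive=True):
        print(path)
```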

Custom chunking:

For fine-grained control over how documents are split:

from ragit import chunk_text, chunk_by_separator, chunk_rst_sections

# `text` and `rst_content` are plain strings, e.g. doc.content from load_text()

# Fixed-size chunks with overlap
chunks = chunk_text(
    text,
    chunk_size=512,      # Characters per chunk
    chunk_overlap=50,    # Overlap between chunks
    doc_id="my-doc"
)

# Split by paragraph
chunks = chunk_by_separator(text, separator="\n\n")

# Split RST documents by section headers
chunks = chunk_rst_sections(rst_content, doc_id="tutorial")
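To make the `chunk_size` and `chunk_overlap` parameters concrete, here is a minimal pure-Python sketch of sliding-window and separator-based splitting. The function names are hypothetical, and ragit's own implementation may differ (for example, in how it handles the final chunk or attaches metadata).

```python
def sliding_window_chunks(text, chunk_size=512, chunk_overlap=50):
    """Split text into fixed-size character windows that overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


def paragraph_chunks(text, separator="\n\n"):
    """Split on a separator and drop empty pieces."""
    return [part.strip() for part in text.split(separator) if part.strip()]
```

With `chunk_size=4` and `chunk_overlap=1`, each chunk repeats the last character of the previous one, so text cut at a chunk boundary still appears intact in at least one chunk.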

The RAGAssistant Class

The RAGAssistant class is the main interface for RAG operations. It handles document indexing, retrieval, and generation in a single object.

from ragit import RAGAssistant

# Create from a directory
assistant = RAGAssistant("docs/")

# Create from a single file
assistant = RAGAssistant("docs/tutorial.rst")

# Create from Document objects
from ragit import Document

docs = [
    Document(id="intro", content="Introduction to the API..."),
    Document(id="auth", content="Authentication uses JWT tokens..."),
    Document(id="endpoints", content="Available endpoints: /users, /items..."),
]
assistant = RAGAssistant(docs)

Configuration options:

assistant = RAGAssistant(
    "docs/",
    embedding_model="mxbai-embed-large",  # Model for embeddings
    llm_model="llama3.1:70b",             # Model for generation
    chunk_size=512,                        # Characters per chunk
    chunk_overlap=50,                      # Overlap between chunks
)

Asking Questions

The ask() method retrieves relevant context and generates an answer:

assistant = RAGAssistant("docs/")

answer = assistant.ask("How do I authenticate API requests?")
print(answer)

Customizing the query:

answer = assistant.ask(
    "How do I authenticate API requests?",
    top_k=5,                    # Number of chunks to retrieve
    temperature=0.3,            # Lower = more focused answers
    system_prompt="You are a technical documentation assistant. "
                  "Answer concisely and include code examples."
)
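Conceptually, a retrieve-then-generate call combines the retrieved chunks and the question into one prompt. The sketch below shows one way to assemble such a prompt; `build_prompt`, its template, and the `max_chars` budget are assumptions for illustration, not the format ragit uses.

```python
def build_prompt(question, chunks, max_chars=4000):
    """Join retrieved chunks into a numbered context block, then append the question."""
    context_parts = []
    used = 0
    for i, chunk in enumerate(chunks, start=1):
        block = f"[{i}] {chunk}"
        if used + len(block) > max_chars:
            break  # stop once the context budget would be exceeded
        context_parts.append(block)
        used += len(block)
    context = "\n\n".join(context_parts)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
```

A larger `top_k` supplies more context but also produces a longer prompt, which is why a size budget of some kind is usually needed.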

Generating Code

The generate_code() method is optimized for producing clean, runnable code:

assistant = RAGAssistant("framework-docs/")

code = assistant.generate_code(
    "Create a REST API endpoint for user registration",
    language="python"
)
print(code)

The output is clean code without markdown formatting. The assistant uses your documentation as context to generate framework-specific, idiomatic code.
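LLMs often wrap code in markdown fences, so returning "clean code" generally involves a post-processing step. A minimal sketch of such a step follows; `strip_code_fences` is a hypothetical helper, and how ragit cleans its output is not documented here.

```python
import re

# Matches the first fenced block: an opening fence line, the body, a closing fence.
_FENCE_RE = re.compile(r"```[^\n]*\n(.*?)```", re.DOTALL)


def strip_code_fences(text):
    """Return the body of the first fenced code block, or the text itself if none."""
    match = _FENCE_RE.search(text)
    if match:
        return match.group(1).strip("\n")
    return text.strip()
```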

Custom Retrieval

For advanced use cases, you can access the retrieval and generation steps separately:

assistant = RAGAssistant("docs/")

# Step 1: Retrieve relevant chunks
results = assistant.retrieve("authentication", top_k=5)
for chunk, score in results:
    print(f"Score: {score:.3f}")
    print(f"Content: {chunk.content[:200]}...")
    print()

# Step 2: Get formatted context string
context = assistant.get_context("authentication", top_k=3)

# Step 3: Generate with custom prompt
prompt = f"""Based on this documentation:

{context}

Write a Python function that validates a JWT token."""

response = assistant.generate(
    prompt,
    system_prompt="You are an expert Python developer.",
    temperature=0.2
)
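The `(chunk, score)` pairs in step 1 come from ranking chunk embeddings against the query embedding. The sketch below shows that idea with cosine similarity over plain Python lists. The choice of cosine similarity, the function names, and the toy vectors are all assumptions; the metric ragit actually uses is not stated here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_chunks(query_vec, indexed, top_k=3):
    """Rank (text, vector) pairs by similarity to query_vec, highest first."""
    scored = [(text, cosine_similarity(query_vec, vec)) for text, vec in indexed]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    # Toy two-dimensional "embeddings" for illustration only.
    index = [("auth", [1.0, 0.0]), ("billing", [0.0, 1.0]), ("tokens", [1.0, 1.0])]
    for text, score in top_k_chunks([1.0, 0.0], index, top_k=2):
        print(f"{score:.3f}  {text}")
```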

Tutorial: Platform Integration

This section shows how to integrate ragit into web applications and other platforms.

Flask Integration

from flask import Flask, request, jsonify
from ragit import RAGAssistant

app = Flask(__name__)

# Initialize once at startup
assistant = RAGAssistant("docs/")

@app.route("/ask", methods=["POST"])
def ask():
    # silent=True returns None instead of raising on a missing or invalid JSON body
    data = request.get_json(silent=True) or {}
    question = data.get("question", "")

    if not question:
        return jsonify({"error": "question is required"}), 400

    answer = assistant.ask(question, top_k=3)
    return jsonify({"answer": answer})

@app.route("/search", methods=["GET"])
def search():
    query = request.args.get("q", "")
    # type=int falls back to the default when top_k is not a valid integer
    top_k = request.args.get("top_k", 5, type=int)

    results = assistant.retrieve(query, top_k=top_k)
    return jsonify({
        "results": [
            {"content": chunk.content, "score": score}
            for chunk, score in results
        ]
    })

if __name__ == "__main__":
    app.run(debug=True)

FastAPI Integration

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
from ragit import RAGAssistant

app = FastAPI()

# Initialize once at startup
assistant = RAGAssistant("docs/")

class Question(BaseModel):
    question: str
    top_k: int = 3
    temperature: float = 0.7

class Answer(BaseModel):
    answer: str

@app.post("/ask", response_model=Answer)
def ask(q: Question):
    # Plain `def`: FastAPI runs this blocking call in its threadpool
    if not q.question.strip():
        raise HTTPException(status_code=400, detail="question is required")

    answer = assistant.ask(
        q.question,
        top_k=q.top_k,
        temperature=q.temperature
    )
    return Answer(answer=answer)

@app.get("/search")
def search(q: str, top_k: int = 5):
    # Plain `def`: FastAPI runs this blocking call in its threadpool
    results = assistant.retrieve(q, top_k=top_k)
    return {
        "results": [
            {"content": chunk.content, "score": score}
            for chunk, score in results
        ]
    }

Command-Line Tools

Build CLI tools using argparse or click:

#!/usr/bin/env python3
import argparse
from ragit import RAGAssistant

def main():
    parser = argparse.ArgumentParser(description="Query documentation")
    parser.add_argument("question", help="Question to ask")
    parser.add_argument("--docs", default="docs/", help="Documentation path")
    parser.add_argument("--top-k", type=int, default=3, help="Context chunks")
    args = parser.parse_args()

    assistant = RAGAssistant(args.docs)
    answer = assistant.ask(args.question, top_k=args.top_k)
    print(answer)

if __name__ == "__main__":
    main()

Usage:

python ask.py "How do I configure logging?"
python ask.py "What are the API rate limits?" --docs api-docs/ --top-k 5

Batch Processing

Process multiple questions or generate reports:

from ragit import RAGAssistant

assistant = RAGAssistant("docs/")

questions = [
    "What authentication methods are supported?",
    "How do I handle errors?",
    "What are the rate limits?",
]

# Process questions
results = {}
for question in questions:
    results[question] = assistant.ask(question)

# Generate a report
with open("qa-report.md", "w", encoding="utf-8") as f:
    f.write("# Documentation Q&A Report\n\n")
    for question, answer in results.items():
        f.write(f"## {question}\n\n")
        f.write(f"{answer}\n\n")

Advanced: Hyperparameter Optimization

Ragit includes tools that search for the best-performing RAG configuration for your documents and use case, scored against benchmark questions you provide.

from ragit import RagitExperiment, Document, BenchmarkQuestion

# Your documents
documents = [
    Document(id="auth", content="Authentication uses Bearer tokens..."),
    Document(id="api", content="The API supports GET, POST, PUT, DELETE..."),
]

# Benchmark questions with expected answers
benchmark = [
    BenchmarkQuestion(
        question="What authentication method does the API use?",
        ground_truth="The API uses Bearer token authentication."
    ),
    BenchmarkQuestion(
        question="What HTTP methods are supported?",
        ground_truth="GET, POST, PUT, and DELETE methods are supported."
    ),
]

# Run optimization
experiment = RagitExperiment(documents, benchmark)
results = experiment.run(max_configs=20)

# Get the best configuration
best = results[0]
print(f"Best config: chunk_size={best.config.chunk_size}, "
      f"chunk_overlap={best.config.chunk_overlap}, "
      f"top_k={best.config.top_k}")
print(f"Score: {best.score:.3f}")

The experiment tests different combinations of chunk sizes, overlaps, and retrieval parameters to find what works best for your content.
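The search described above can be pictured as a grid search: enumerate parameter combinations, score each, and sort. The sketch below is a generic illustration, not `RagitExperiment`'s implementation. `grid_search` and `toy_score` are hypothetical, and `toy_score` is a made-up stand-in for a real evaluation against benchmark questions.

```python
import itertools


def grid_search(param_grid, score_fn, max_configs=None):
    """Score every combination in param_grid and return (config, score), best first."""
    names = list(param_grid)
    combos = itertools.product(*(param_grid[name] for name in names))
    if max_configs is not None:
        combos = itertools.islice(combos, max_configs)  # cap the number of trials
    results = []
    for values in combos:
        config = dict(zip(names, values))
        results.append((config, score_fn(config)))
    results.sort(key=lambda item: item[1], reverse=True)
    return results


def toy_score(config):
    """Hypothetical score that peaks at chunk_size=512 and top_k=3."""
    return -abs(config["chunk_size"] - 512) - 10 * abs(config["top_k"] - 3)


if __name__ == "__main__":
    grid = {"chunk_size": [256, 512, 1024], "chunk_overlap": [0, 50], "top_k": [3, 5]}
    best_config, best_score = grid_search(grid, toy_score)[0]
    print(best_config, best_score)
```

Because the number of combinations grows multiplicatively with each parameter, a cap such as `max_configs` keeps the run time bounded.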

API Reference

Document Loading

Function                                              Description
load_text(path)                                       Load a single text file as a Document
load_directory(path, pattern, recursive=False)        Load files matching a glob pattern
chunk_text(text, chunk_size, chunk_overlap, doc_id)   Split text into overlapping chunks
chunk_document(doc, chunk_size, chunk_overlap)        Split a Document into chunks
chunk_by_separator(text, separator, doc_id)           Split text by a delimiter
chunk_rst_sections(text, doc_id)                      Split RST by section headers

RAGAssistant

Method                                                 Description
retrieve(query, top_k=3)                               Return list of (Chunk, score) tuples
get_context(query, top_k=3)                            Return formatted context string
generate(prompt, system_prompt, temperature)           Generate text without retrieval
ask(question, system_prompt, top_k, temperature)       Retrieve context and generate answer
generate_code(request, language, top_k, temperature)   Generate clean code

Properties

Property                    Description
assistant.num_documents     Number of loaded documents
assistant.num_chunks        Number of indexed chunks
assistant.embedding_model   Current embedding model
assistant.llm_model         Current LLM model

License

Apache-2.0 - RODMENA LIMITED
