Automatic RAG Pattern Optimization Engine

ragit

A RAG (retrieval-augmented generation) toolkit for Python, covering document loading, chunking, vector search, and LLM integration.

Installation

pip install ragit

# For offline embedding (quote the extra so shells such as zsh do not expand the brackets)
pip install "ragit[transformers]"

Quick Start

You must provide an embedding source: a custom function, the SentenceTransformers provider, or any other provider.

Custom Embedding Function

from ragit import RAGAssistant

def my_embed(text: str) -> list[float]:
    # Call any embedding API here (OpenAI, Cohere, HuggingFace, etc.) and
    # return its vector; `embedding_vector` is a placeholder for that result.
    return embedding_vector

assistant = RAGAssistant("docs/", embed_fn=my_embed)
results = assistant.retrieve("search query")
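To try the wiring without an embedding API, a stand-in function with the same `(str) -> list[float]` signature can help. The sketch below is an assumption for illustration, not part of ragit: `toy_embed` and its `dim` parameter are hypothetical names, and the hashed bag-of-words vectors it produces carry no real semantic meaning.

```python
import hashlib
import math


def toy_embed(text: str, dim: int = 64) -> list[float]:
    """Hash each lowercase token into one of `dim` buckets, then L2-normalize.

    Hypothetical helper for smoke tests only; use a real model for real search.
    """
    vec = [0.0] * dim
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:4], "big") % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    # An empty string has no tokens, so return the zero vector unchanged.
    return [v / norm for v in vec] if norm else vec


vector = toy_embed("how does authentication work")
print(len(vector))
```

Because the function is deterministic and offline, it could be passed as `embed_fn=toy_embed` for a quick end-to-end check, assuming ragit accepts vectors of any fixed dimension.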

With LLM for Q&A

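from ragit import RAGAssistant

def my_embed(text: str) -> list[float]:
    return embedding_vector  # placeholder: result of your embedding API call

def my_generate(prompt: str, system_prompt: str = "") -> str:
    return llm_response  # placeholder: text returned by your LLM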

assistant = RAGAssistant("docs/", embed_fn=my_embed, generate_fn=my_generate)
answer = assistant.ask("How does authentication work?")
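When debugging a Q&A setup, it can be useful to see exactly what reaches the LLM. The sketch below builds a stub with the same `(prompt, system_prompt) -> str` signature that records every call and returns a canned answer. `make_recording_generate` is a hypothetical helper, not a ragit function.

```python
def make_recording_generate(canned_answer: str = "stub answer"):
    """Return a generate function plus the list that records its calls."""
    calls: list[dict[str, str]] = []

    def generate(prompt: str, system_prompt: str = "") -> str:
        # Store the inputs so they can be inspected after the run.
        calls.append({"prompt": prompt, "system_prompt": system_prompt})
        return canned_answer

    return generate, calls


generate, calls = make_recording_generate()
reply = generate("How does authentication work?", system_prompt="Be concise.")
print(reply, len(calls))
```

Passing `generate` as `generate_fn` and inspecting `calls` afterwards would show the prompts produced during a run, without spending tokens on a real model.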

Offline Embedding (SentenceTransformers)

Models are downloaded automatically on first use (about 90 MB for the default model).

from ragit import RAGAssistant
from ragit.providers import SentenceTransformersProvider

# Uses all-MiniLM-L6-v2 by default
assistant = RAGAssistant("docs/", provider=SentenceTransformersProvider())

# Or specify a model
assistant = RAGAssistant(
    "docs/",
    provider=SentenceTransformersProvider(model_name="all-mpnet-base-v2")
)

Available models: all-MiniLM-L6-v2 (384d), all-mpnet-base-v2 (768d), paraphrase-MiniLM-L6-v2 (384d)

Core API

assistant = RAGAssistant(
    documents,           # Path, list of Documents, or list of Chunks
    embed_fn=...,        # Embedding function: (str) -> list[float]
    generate_fn=...,     # LLM function: (prompt, system_prompt) -> str
    provider=...,        # Or use a provider instead of functions
    chunk_size=512,
    chunk_overlap=50
)

results = assistant.retrieve(query, top_k=3)      # [(Chunk, score), ...]
context = assistant.get_context(query, top_k=3)   # Formatted string
answer = assistant.ask(question, top_k=3)         # Requires generate_fn/LLM
code = assistant.generate_code(request)           # Requires generate_fn/LLM
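For intuition about what a `retrieve(query, top_k=3)` call returns, here is a minimal sketch of top-k vector search using cosine similarity. This is a generic illustration under assumptions: the source does not state which similarity metric or index ragit uses, and `cosine` and `top_k` are hypothetical names.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def top_k(query_vec, indexed, k=3):
    """Score every (text, vector) pair against the query and keep the best k."""
    scored = [(text, cosine(query_vec, vec)) for text, vec in indexed]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


index = [
    ("auth chunk", [1.0, 0.0]),
    ("billing chunk", [0.0, 1.0]),
    ("login chunk", [0.9, 0.1]),
]
results = top_k([1.0, 0.0], index, k=2)
print([text for text, _ in results])
```

The result has the same `[(item, score), ...]` shape as the `retrieve` return value noted above, with plain strings standing in for `Chunk` objects.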

Document Loading

from ragit import load_text, load_directory, chunk_text

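doc = load_text("file.md")               # Load a single file
docs = load_directory("docs/", "*.md")   # Load every file matching the pattern

text = "..."  # Any string to split into chunks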
chunks = chunk_text(text, chunk_size=512, chunk_overlap=50, doc_id="id")
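To illustrate how `chunk_size` and `chunk_overlap` interact, the following sketch splits a string into fixed-size windows in which each chunk repeats the tail of the previous one. It assumes sizes are measured in characters; the source does not say whether ragit counts characters or tokens, and `sliding_chunks` is a hypothetical name, not the library's implementation.

```python
def sliding_chunks(text: str, chunk_size: int = 512, chunk_overlap: int = 50) -> list[str]:
    """Split `text` into windows of `chunk_size` characters that overlap by `chunk_overlap`."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


sample = "".join(chr(97 + i % 26) for i in range(1000))
pieces = sliding_chunks(sample, chunk_size=512, chunk_overlap=50)
print(len(pieces), [len(p) for p in pieces])
```

A larger overlap lowers the chance that a sentence is cut in half at a chunk boundary, at the cost of storing and embedding more duplicated text.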

Hyperparameter Optimization

from ragit import RagitExperiment, Document, BenchmarkQuestion

def my_embed(text: str) -> list[float]:
    return embedding_vector

def my_generate(prompt: str, system_prompt: str = "") -> str:
    return llm_response

docs = [Document(id="1", content="...")]
benchmark = [BenchmarkQuestion(question="...", ground_truth="...")]

experiment = RagitExperiment(
    docs, benchmark,
    embed_fn=my_embed,
    generate_fn=my_generate
)
results = experiment.run(max_configs=20)
print(results[0])  # Best config
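Conceptually, a run like this evaluates a number of candidate configurations and ranks them by score. The sketch below shows that idea as a capped grid search. The searched parameters (`chunk_size`, `chunk_overlap`, `top_k`), the helper `rank_configs`, and the scoring function `toy_score` are all assumptions for illustration; the source does not describe which parameters `RagitExperiment` explores or how it scores them.

```python
from itertools import product


def rank_configs(chunk_sizes, chunk_overlaps, top_ks, score_fn, max_configs=20):
    """Build a grid of configs, score at most `max_configs`, return best first."""
    grid = [
        {"chunk_size": cs, "chunk_overlap": co, "top_k": k}
        for cs, co, k in product(chunk_sizes, chunk_overlaps, top_ks)
        if co < cs  # an overlap as large as the chunk would never advance
    ]
    scored = [(score_fn(cfg), cfg) for cfg in grid[:max_configs]]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored


def toy_score(cfg):
    # Hypothetical stand-in for a benchmark score: higher is better.
    return -abs(cfg["chunk_size"] - 512) - abs(cfg["chunk_overlap"] - 50)


ranked = rank_configs([256, 512, 1024], [0, 50, 100], [3], toy_score)
print(ranked[0])
```

In a real experiment the score would come from answering the benchmark questions with each configuration and comparing the answers to the ground truth.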

License

Apache-2.0 - RODMENA LIMITED
