Automatic RAG Pattern Optimization Engine

ragit

RAG toolkit for Python. Document loading, chunking, vector search, LLM integration.

Installation

pip install ragit

# For offline embedding
pip install ragit[transformers]

Quick Start

You must provide an embedding source: a custom function, the SentenceTransformers provider, or any other provider.

Custom Embedding Function

from ragit import RAGAssistant

def my_embed(text: str) -> list[float]:
    # Use any embedding API: OpenAI, Cohere, HuggingFace, etc.
    # `embedding_vector` is a placeholder for the vector that API returns.
    return embedding_vector

assistant = RAGAssistant("docs/", embed_fn=my_embed)
results = assistant.retrieve("search query")
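To try the pipeline before wiring up a real embedding API, any callable with the `(str) -> list[float]` shape can stand in. The sketch below is a dependency-free stand-in, not part of ragit: `toy_embed` and its `dim` parameter are hypothetical names, and the hashed bag-of-words vectors it produces capture word overlap only, not meaning.

```python
import hashlib
import math


def toy_embed(text: str, dim: int = 64) -> list[float]:
    """Hash each lowercase token into one of `dim` buckets, then L2-normalize.

    Hypothetical helper for smoke tests; not a semantic embedding.
    """
    vector = [0.0] * dim
    for token in text.lower().split():
        # md5 is used only as a stable hash, not for security.
        digest = hashlib.md5(token.encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:4], "big") % dim
        vector[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vector))
    # Empty input yields the all-zero vector unchanged.
    return [v / norm for v in vector] if norm else vector
```

Under that assumption it could be passed as `embed_fn=toy_embed` in place of `my_embed` above; swap in a real model before judging retrieval quality.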

With LLM for Q&A

from ragit import RAGAssistant

def my_embed(text: str) -> list[float]:
    # Placeholder: return the vector from your embedding API.
    return embedding_vector

def my_generate(prompt: str, system_prompt: str = "") -> str:
    return llm_response

assistant = RAGAssistant("docs/", embed_fn=my_embed, generate_fn=my_generate)
answer = assistant.ask("How does authentication work?")
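The generation callable only needs to accept a prompt and an optional system prompt and return a string. As a rough illustration of that contract, here is a stand-in that reports what it received instead of calling a model. `echo_generate` is a hypothetical name, and nothing here reflects how ragit builds its prompts.

```python
def echo_generate(prompt: str, system_prompt: str = "") -> str:
    """Stand-in for an LLM call: describe the inputs rather than answer them."""
    header = f"[system: {system_prompt}] " if system_prompt else ""
    return f"{header}received {len(prompt)} prompt characters"
```

A stub like this can help confirm that retrieval and prompt assembly run end to end before any API key is involved.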

Offline Embedding (SentenceTransformers)

Models are downloaded automatically on first use (~90 MB for the default model).

from ragit import RAGAssistant
from ragit.providers import SentenceTransformersProvider

# Uses all-MiniLM-L6-v2 by default
assistant = RAGAssistant("docs/", provider=SentenceTransformersProvider())

# Or specify a model
assistant = RAGAssistant(
    "docs/",
    provider=SentenceTransformersProvider(model_name="all-mpnet-base-v2")
)

Available models: all-MiniLM-L6-v2 (384d), all-mpnet-base-v2 (768d), paraphrase-MiniLM-L6-v2 (384d)

Core API

assistant = RAGAssistant(
    documents,           # Path, list of Documents, or list of Chunks
    embed_fn=...,        # Embedding function: (str) -> list[float]
    generate_fn=...,     # LLM function: (prompt, system_prompt) -> str
    provider=...,        # Or use a provider instead of functions
    chunk_size=512,
    chunk_overlap=50
)

results = assistant.retrieve(query, top_k=3)      # [(Chunk, score), ...]
context = assistant.get_context(query, top_k=3)   # Formatted string
answer = assistant.ask(question, top_k=3)         # Requires generate_fn/LLM
code = assistant.generate_code(request)           # Requires generate_fn/LLM
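For intuition about the `[(Chunk, score), ...]` shape returned by `retrieve`, the following sketch ranks pre-embedded chunks against a query vector. It assumes cosine similarity as the score, which the text above does not confirm, and uses plain strings in place of `Chunk` objects; `cosine` and `top_k_chunks` are hypothetical names.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k_chunks(
    query_vec: list[float],
    embedded_chunks: list[tuple[str, list[float]]],
    k: int = 3,
) -> list[tuple[str, float]]:
    """Return the k (chunk_text, score) pairs with the highest similarity."""
    scored = [(text, cosine(query_vec, vec)) for text, vec in embedded_chunks]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

This brute-force scan is linear in the number of chunks, which is usually fine for small document sets.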

Document Loading

from ragit import load_text, load_directory, chunk_text

doc = load_text("file.md")
docs = load_directory("docs/", "*.md")
chunks = chunk_text(text, chunk_size=512, chunk_overlap=50, doc_id="id")
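To show how `chunk_size` and `chunk_overlap` interact, here is a minimal sliding-window splitter. It assumes both values are measured in characters, which may not match what `chunk_text` does internally; `sliding_chunks` is a hypothetical name used only for this illustration.

```python
def sliding_chunks(text: str, chunk_size: int = 512, chunk_overlap: int = 50) -> list[str]:
    """Split text into windows of chunk_size characters sharing chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks
```

With the defaults, each window advances 462 characters, so a 1000-character text yields three chunks and consecutive chunks share 50 characters.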

Hyperparameter Optimization

from ragit import RagitExperiment, Document, BenchmarkQuestion

def my_embed(text: str) -> list[float]:
    return embedding_vector

def my_generate(prompt: str, system_prompt: str = "") -> str:
    return llm_response

docs = [Document(id="1", content="...")]
benchmark = [BenchmarkQuestion(question="...", ground_truth="...")]

experiment = RagitExperiment(
    docs, benchmark,
    embed_fn=my_embed,
    generate_fn=my_generate
)
results = experiment.run(max_configs=20)
print(results[0])  # Best config
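The text above does not say which parameters the experiment varies. As one plausible reading of `max_configs`, the sketch below enumerates a capped grid over chunking and retrieval settings. `candidate_configs` and the dictionary keys are hypothetical and chosen to mirror the `RAGAssistant` arguments; this is not ragit's search procedure.

```python
from itertools import product


def candidate_configs(chunk_sizes, chunk_overlaps, top_ks, max_configs=20):
    """Enumerate valid (chunk_size, chunk_overlap, top_k) combinations, capped at max_configs."""
    configs = []
    for size, overlap, k in product(chunk_sizes, chunk_overlaps, top_ks):
        if len(configs) >= max_configs:
            break
        if overlap >= size:
            continue  # an overlap as large as the chunk would never advance
        configs.append({"chunk_size": size, "chunk_overlap": overlap, "top_k": k})
    return configs
```

Each configuration would then be scored against the benchmark questions, with the best-scoring one sorted first.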

License

Apache-2.0 - RODMENA LIMITED
