raglib-py

raglib-py is a modular, production-grade Retrieval-Augmented Generation (RAG) library for Python.

This README is the complete user guide: you do not need to go anywhere else to get started.

Important package note:

  • PyPI package name: raglib-py
  • Python import name: raglib

Current Support Counts

  • Implemented RAG strategies: 12
  • Built-in chat LLM providers: 5
  • Custom chat model support: yes (BaseLLMClient subclass or LangChain-style invoke() model)

The 5 built-in chat LLM providers are:

  • openai
  • anthropic
  • groq
  • google
  • ollama

What You Can Do

  • Load documents from files, folders, URLs, or raw text
  • Use one clean entry point: RAG(...)
  • Switch between local and cloud LLM providers
  • Choose embedding provider and vector store backend
  • Run 12 RAG strategies out of the box
  • Use interactive terminal chat mode

Installation

Install core package:

pip install raglib-py

This installs the full runtime dependency set used by core chat, embedding, vector DB, and document loader paths.

Optional extras are available if you only want specific feature groups:

# Local Ollama chat/embedding support
pip install "raglib-py[ollama]"

# Chroma vector DB backend
pip install "raglib-py[chroma]"

# DOCX/PDF/PPTX document loading
pip install "raglib-py[docx,pdf,pptx]"

# All major runtime extras
pip install "raglib-py[all]"

Quick import check:

python -c "from raglib import RAG; print('OK')"

Quickstart (Zero API Keys)

from raglib import RAG

rag = RAG("RAG improves grounded generation using retrieved context.")
result = rag.query("What does RAG improve?")

print(result.answer)
print([doc.id for doc in result.sources])

In this mode, raglib uses offline defaults automatically:

  • chat model: MockLLMClient
  • embeddings: MockEmbedding

Real Local Stack (Ollama)

If you have local models in Ollama, this is a strong default setup:

from raglib import RAG

rag = RAG(
	source=r"C:\path\to\your\document.docx",
	chat_llm="ollama",
	chat_model="gemma3:4b",
	embedding_llm="ollama",
	embedding_model="nomic-embed-text:latest",
	vector_db="chroma",
	rag_type="corrective",
	top_k=5,
)

result = rag.query("What is the core concept of this paper?")
print(result.answer)

Cloud Chat + Local Embeddings Example

from raglib import RAG

rag = RAG(
	source="Service test content.",
	chat_llm="groq",
	chat_api_key="YOUR_GROQ_API_KEY",
	chat_model="qwen/qwen3-32b",
	embedding_llm="ollama",
	embedding_model="nomic-embed-text:latest",
	vector_db="memory",
	rag_type="naive",
)

print(rag.query("Reply in one line that service is available.").answer)

Supported Input Sources

source accepts:

  • File path: .txt, .md, .docx, .pptx, .pdf
  • Folder path: recursive load
  • URL: web page text extraction
  • Raw text string
  • List of any mix of the above
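
A list can mix these types freely. A minimal sketch (the file path and URL below are placeholders; substitute your own):

```python
from raglib import RAG

# Mix a file path, a URL, and raw text in a single source list.
rag = RAG(source=[
    "notes/meeting_notes.md",
    "https://example.com/post",
    "Raw text is indexed directly, no file required.",
])
```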

You can ingest more data later:

rag.add("new_notes.md")
rag.add("https://example.com/post")

RAG API (Main Constructor)

RAG(
	source=None,
	chat_llm=None,
	embedding_llm=None,
	vision_llm=None,
	llm_key=None,
	chat_api_key=None,
	embedding_api_key=None,
	vision_api_key=None,
	rag_type="corrective",
	top_k=5,
	chunk_size=400,
	chunk_overlap=50,
	output_dir=None,
	chat_model=None,
	chat_base_url=None,
	embedding_model=None,
	embedding_base_url=None,
	vision_model=None,
	vision_base_url=None,
	vector_db=None,
	vector_db_kwargs=None,
)

API Keys And Endpoints (Important)

raglib never provides API keys. You must use your own provider credentials.

Use these fields in RAG(...):

  • chat_api_key: key for chat_llm provider
  • embedding_api_key: key for embedding_llm provider
  • vision_api_key: key for vision_llm provider
  • llm_key: one shared fallback key when you do not want to pass separate keys

Endpoint fields:

  • chat_base_url: custom OpenAI-compatible chat endpoint
  • embedding_base_url: custom Ollama embedding endpoint
  • vision_base_url: custom OpenAI-compatible vision endpoint

Provider key mapping:

  • chat_llm="openai" | "anthropic" | "groq" | "google" needs a chat key
  • embedding_llm="openai" | "google" needs an embedding key
  • vision_llm="openai" | "anthropic" | "google" needs a vision key
  • ollama, mock, and local huggingface modes do not require cloud API keys
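
Rather than hard-coding keys, a common pattern (not specific to raglib) is to read them from environment variables. A sketch, assuming a GROQ_API_KEY variable is set:

```python
import os

from raglib import RAG

# GROQ_API_KEY is an assumed variable name; use whichever one you exported.
rag = RAG(
    source="docs/",
    chat_llm="groq",
    chat_api_key=os.environ["GROQ_API_KEY"],
    embedding_llm="ollama",          # local embeddings: no cloud key needed
    embedding_model="nomic-embed-text:latest",
)
```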

Example with separate keys:

from raglib import RAG

rag = RAG(
	source="docs/",
	chat_llm="openai",
	chat_api_key="YOUR_OPENAI_CHAT_KEY",
	embedding_llm="google",
	embedding_api_key="YOUR_GOOGLE_KEY",
	vision_llm="anthropic",
	vision_api_key="YOUR_ANTHROPIC_KEY",
)

Example with one shared key:

rag = RAG(
	source="docs/",
	chat_llm="openai",
	embedding_llm="openai",
	vision_llm="openai",
	llm_key="YOUR_OPENAI_KEY",
)

Key methods:

  • query(question): ask one question and get a GenerationResult
  • add(source): add more documents to the existing index
  • chat(): start an interactive terminal Q&A session
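
A minimal session combining the three methods (the file name is a placeholder):

```python
from raglib import RAG

rag = RAG("RAG grounds answers in retrieved context.")

result = rag.query("What grounds the answers?")  # returns a GenerationResult
print(result.answer)

rag.add("extra_notes.md")  # extend the index after construction
# rag.chat()               # interactive terminal session (blocks until exit)
```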

How Many RAG Strategies Are Included?

raglib currently provides 12 built-in RAG strategies:

  1. naive
  2. advanced
  3. corrective
  4. self
  5. agentic
  6. hybrid
  7. multi_query
  8. multi_hop
  9. routing
  10. memory
  11. web
  12. tool

Use them by setting rag_type:

rag = RAG(source="docs/", rag_type="multi_hop")

When to use what:

  • naive: fastest baseline
  • advanced: better quality via rerank/reduce/dedup
  • corrective: retries when context quality is weak
  • self: decision + reflection-driven behavior
  • agentic: planner-based sub-query execution
  • hybrid: local + web retrieval blending
  • multi_query: query variant expansion
  • multi_hop: multi-step reasoning retrieval
  • routing: automatic retrieval route selection
  • memory: conversation-memory-aware answering
  • web: web-first retrieval
  • tool: retrieval plus tool output injection
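
Since the offline mock defaults require no API keys, one way to get a feel for the strategies is to loop over rag_type values. A sketch (answers come from the mock client, and some strategies such as web, hybrid, and tool may need network access or extra setup):

```python
from raglib import RAG

STRATEGIES = [
    "naive", "advanced", "corrective", "self", "agentic", "hybrid",
    "multi_query", "multi_hop", "routing", "memory", "web", "tool",
]

text = "RAG improves grounded generation using retrieved context."
for rag_type in STRATEGIES:
    rag = RAG(text, rag_type=rag_type)  # mock chat + mock embeddings by default
    result = rag.query("What does RAG improve?")
    print(f"{rag_type:12s} -> {result.answer[:60]}")
```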

Provider Support

Chat providers (5 built-in + custom adapter support):

  • openai
  • anthropic
  • groq
  • google
  • ollama
  • custom BaseLLMClient or LangChain-style invoke() model
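
A LangChain-style model here means any object exposing invoke(). A minimal stand-in (the EchoModel class is hypothetical, purely for illustration):

```python
class EchoModel:
    """Minimal LangChain-style chat model: invoke(prompt) returns a string."""

    def invoke(self, prompt: str) -> str:
        # A real adapter would call an LLM; this one just echoes the prompt.
        return f"echo: {prompt}"

model = EchoModel()
print(model.invoke("hello"))  # prints "echo: hello"

# With raglib installed, an object like this can be supplied as the custom
# chat model, e.g. RAG(source="...", chat_llm=model); exact usage may vary.
```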

Embedding providers:

  • openai
  • google
  • ollama
  • huggingface (aliases: free, local)
  • mock

Vision providers (for scanned PDF fallback):

  • openai
  • anthropic
  • google
  • mock

Vector backends:

  • chroma (default selection)
  • memory
  • custom BaseVectorStore instance

Interactive Terminal Chat

from raglib import RAG

rag = RAG("my_docs/")
rag.chat()

Commands inside session:

  • help
  • history
  • clear
  • exit / quit / q

Output Saving

Set output_dir to save each query result as JSON:

rag = RAG(source="docs/", output_dir="outputs")
rag.query("Summarize this")

Troubleshooting

Import confusion:

  • Install: pip install raglib-py
  • Import: from raglib import RAG

Chroma issues:

  • If Chroma is unavailable or unstable, use vector_db="memory"

Missing provider package:

  • Install needed extra (for example: pip install "raglib-py[groq]")

Complete Local Test Script

A ready-to-run local test script is included at:

  • examples/local_ollama_chroma_test.py

An all-strategy comparison script is included at:

  • examples/check_all_rag_types.py

License

MIT
