raglib-py
raglib-py is a modular, production-grade Retrieval-Augmented Generation (RAG) library for Python.
This README is the complete user guide for PyPI users; you do not need to go anywhere else to get started.
Important package note:
- PyPI package name: raglib-py
- Python import name: raglib
Current Support Counts
- Implemented RAG strategies: 12
- Built-in chat LLM providers: 5
- Custom chat model support: yes (BaseLLMClient or LangChain-style invoke model)
The 5 built-in chat LLM providers are:
- openai
- anthropic
- groq
- google
- ollama
What You Can Do
- Load documents from files, folders, URLs, or raw text
- Use one clean entry point: RAG(...)
- Switch between local and cloud LLM providers
- Choose embedding provider and vector store backend
- Run 12 RAG strategies out of the box
- Use interactive terminal chat mode
Installation
Install core package:
pip install raglib-py
This installs the full runtime dependency set used by core chat, embedding, vector DB, and document loader paths.
Optional extras are also available for explicit feature grouping:
# Local Ollama chat/embedding support
pip install "raglib-py[ollama]"
# Chroma vector DB backend
pip install "raglib-py[chroma]"
# DOCX/PDF/PPTX document loading
pip install "raglib-py[docx,pdf,pptx]"
# All major runtime extras
pip install "raglib-py[all]"
Quick import check:
python -c "from raglib import RAG; print('OK')"
Quickstart (Zero API Keys)
from raglib import RAG
rag = RAG("RAG improves grounded generation using retrieved context.")
result = rag.query("What does RAG improve?")
print(result.answer)
print([doc.id for doc in result.sources])
In this mode, raglib uses offline defaults automatically:
- chat model: MockLLMClient
- embeddings: MockEmbedding
Real Local Stack (Ollama)
If you have local models in Ollama, this is a strong default setup:
from raglib import RAG
rag = RAG(
    source=r"C:\path\to\your\document.docx",
    chat_llm="ollama",
    chat_model="gemma3:4b",
    embedding_llm="ollama",
    embedding_model="nomic-embed-text:latest",
    vector_db="chroma",
    rag_type="corrective",
    top_k=5,
)
result = rag.query("What is the core concept of this paper?")
print(result.answer)
Cloud Chat + Local Embeddings Example
from raglib import RAG
rag = RAG(
    source="Service test content.",
    chat_llm="groq",
    chat_api_key="YOUR_GROQ_API_KEY",
    chat_model="qwen/qwen3-32b",
    embedding_llm="ollama",
    embedding_model="nomic-embed-text:latest",
    vector_db="memory",
    rag_type="naive",
)
print(rag.query("Reply in one line that service is available.").answer)
Supported Input Sources
source accepts:
- File path: .txt, .md, .docx, .pptx, .pdf
- Folder path: recursive load
- URL: web page text extraction
- Raw text string
- List of any mix of the above
You can ingest more data later:
rag.add("new_notes.md")
rag.add("https://example.com/post")
RAG API (Main Constructor)
RAG(
    source=None,
    chat_llm=None,
    embedding_llm=None,
    vision_llm=None,
    llm_key=None,
    chat_api_key=None,
    embedding_api_key=None,
    vision_api_key=None,
    rag_type="corrective",
    top_k=5,
    chunk_size=400,
    chunk_overlap=50,
    output_dir=None,
    chat_model=None,
    chat_base_url=None,
    embedding_model=None,
    embedding_base_url=None,
    vision_model=None,
    vision_base_url=None,
    vector_db=None,
    vector_db_kwargs=None,
)
API Keys And Endpoints (Important)
raglib never provides API keys. You must use your own provider credentials.
Use these fields in RAG(...):
- chat_api_key: key for chat_llm provider
- embedding_api_key: key for embedding_llm provider
- vision_api_key: key for vision_llm provider
- llm_key: one shared fallback key when you do not want to pass separate keys
Endpoint fields:
- chat_base_url: custom OpenAI-compatible chat endpoint
- embedding_base_url: custom Ollama embedding endpoint
- vision_base_url: custom OpenAI-compatible vision endpoint
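For example, you can point the OpenAI-compatible chat path at a self-hosted server. This is a sketch: the model name and URL are placeholders, and many local servers accept any non-empty key:

from raglib import RAG

rag = RAG(
    source="docs/",
    chat_llm="openai",
    chat_model="my-local-model",                # placeholder model name
    chat_base_url="http://localhost:8000/v1",   # e.g. a vLLM or LM Studio server
    chat_api_key="EMPTY",                       # placeholder; the server decides whether keys matter
)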
Provider key mapping:
chat_llm="openai" | "anthropic" | "groq" | "google"needs a chat keyembedding_llm="openai" | "google"needs an embedding keyvision_llm="openai" | "anthropic" | "google"needs a vision keyollama,mock, and localhuggingfacemodes do not require cloud API keys
Example with separate keys:
from raglib import RAG
rag = RAG(
    source="docs/",
    chat_llm="openai",
    chat_api_key="YOUR_OPENAI_CHAT_KEY",
    embedding_llm="google",
    embedding_api_key="YOUR_GOOGLE_KEY",
    vision_llm="anthropic",
    vision_api_key="YOUR_ANTHROPIC_KEY",
)
Example with one shared key:
rag = RAG(
    source="docs/",
    chat_llm="openai",
    embedding_llm="openai",
    vision_llm="openai",
    llm_key="YOUR_OPENAI_KEY",
)
Key methods:
- query(question): ask one question and get GenerationResult
- add(source): add more documents to existing index
- chat(): start terminal interactive Q/A session
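A compact tour of all three methods; the result fields match the Quickstart example, and the file paths are placeholders:

from raglib import RAG

rag = RAG("notes.txt")                 # build an index from one file
rag.add("more_notes.md")               # grow the same index later
result = rag.query("Summarize both.")  # one-shot question
print(result.answer)
print([doc.id for doc in result.sources])
# rag.chat()                           # or drop into the interactive session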
How Many RAG Strategies Are Included?
raglib currently provides 12 built-in RAG strategies:
- naive
- advanced
- corrective
- self
- agentic
- hybrid
- multi_query
- multi_hop
- routing
- memory
- web
- tool
Use them by setting rag_type:
rag = RAG(source="docs/", rag_type="multi_hop")
When to use what:
- naive: fastest baseline
- advanced: better quality via rerank/reduce/dedup
- corrective: retries when context quality is weak
- self: decision + reflection-driven behavior
- agentic: planner-based sub-query execution
- hybrid: local + web retrieval blending
- multi_query: query variant expansion
- multi_hop: multi-step reasoning retrieval
- routing: automatic retrieval route selection
- memory: conversation-memory-aware answering
- web: web-first retrieval
- tool: retrieval plus tool output injection
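To compare strategies side by side with the zero-key mock defaults, a minimal sketch; strategies like web, hybrid, and tool may need extra setup or network access to do real work:

from raglib import RAG

STRATEGIES = [
    "naive", "advanced", "corrective", "self", "agentic", "hybrid",
    "multi_query", "multi_hop", "routing", "memory", "web", "tool",
]

for name in STRATEGIES:
    rag = RAG("RAG improves grounded generation using retrieved context.",
              rag_type=name)
    print(f"{name}: {rag.query('What does RAG improve?').answer}")

The bundled examples/check_all_rag_types.py does this more thoroughly.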
Provider Support
Chat providers (5 built-in + custom adapter support):
- openai
- anthropic
- groq
- google
- ollama
- custom BaseLLMClient or LangChain-style invoke() model
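A minimal custom-model sketch. It assumes raglib calls invoke() with the prompt text when a LangChain-style object is passed through chat_llm; check BaseLLMClient in the raglib source for the exact contract:

from raglib import RAG

class EchoModel:
    """LangChain-style stand-in: all it exposes is invoke(prompt)."""
    def invoke(self, prompt):
        # A real adapter would forward the prompt to your model here.
        return f"(echo) {prompt[:60]}"

rag = RAG(source="Some text to index.", chat_llm=EchoModel())
print(rag.query("What is indexed?").answer)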
Embedding providers:
- openai
- ollama
- huggingface (aliases: free, local)
- mock
Vision providers (for scanned PDF fallback):
- openai
- anthropic
- mock
Vector backends:
- chroma (default selection)
- memory
- custom BaseVectorStore instance
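vector_db_kwargs is forwarded to the chosen backend, so the accepted keys depend on that backend. The Chroma persistence setting below is an assumption for illustration, not a guaranteed option:

from raglib import RAG

rag = RAG(
    source="docs/",
    vector_db="chroma",
    # Assumption: these kwargs pass straight through to the Chroma backend;
    # "persist_directory" mirrors Chroma's persistence option and may differ here.
    vector_db_kwargs={"persist_directory": ".chroma"},
)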
Interactive Terminal Chat
from raglib import RAG
rag = RAG("my_docs/")
rag.chat()
Commands inside session:
- help
- history
- clear
- exit / quit / q
Output Saving
Set output_dir to save each query result as JSON:
rag = RAG(source="docs/", output_dir="outputs")
rag.query("Summarize this")
Troubleshooting
Import confusion:
- Install: pip install raglib-py
- Import: from raglib import RAG
Chroma issues:
- If Chroma is unavailable or unstable, use vector_db="memory"
Missing provider package:
- Install needed extra (for example: pip install "raglib-py[groq]")
Complete Local Test Script
A ready user script is included at:
- examples/local_ollama_chroma_test.py
An all-strategy comparison script is included at:
- examples/check_all_rag_types.py
License
MIT