Universal LLM Training & RAG Agent for HuggingFace
license: apache-2.0
KerdosAI: Universal LLM Training & RAG Agent
Enterprise-grade LLM Training + Retrieval-Augmented Generation (RAG) toolkit.
Fine-tune any HuggingFace model on your data, then deploy it with a full document Q&A chat interface, either locally or in the cloud.
What's New in v0.2.0
- kerdosai.rag submodule: full RAG pipeline (document loading → FAISS indexing → LLM answering)
- KnowledgeBase: high-level API to index PDF/DOCX/TXT/MD/CSV files
- RAGAgent: streaming and blocking chat with conversation history
- deployment_type="gradio-rag": launch the HuggingFace Space UI locally with one line
- kerdosai rag-chat CLI command: on-premise RAG UI in one command
- Updated dependencies: faiss-cpu, sentence-transformers, PyMuPDF, python-docx, gradio, huggingface-hub, tenacity
Installation
# Standard install
pip install kerdosai
# With all optional extras
pip install "kerdosai[all]"
Requirements: Python 3.8+ · PyTorch 2.0+ · CUDA-compatible GPU (recommended for training)
Quick Start
1. RAG: Document Q&A (no GPU needed)
from kerdosai.rag import KnowledgeBase, RAGAgent
# Index your documents (PDF, DOCX, TXT, MD, CSV)
kb = KnowledgeBase()
kb.index_documents(["handbook.pdf", "faq.docx", "policy.txt"])
print(f"Indexed {kb.chunk_count} chunks from {kb.indexed_sources}")
# Chat with your documents (uses the HF Inference API, so no GPU is required)
agent = RAGAgent(hf_token="hf_...", knowledge_base=kb)
# Blocking answer
print(agent.chat("What is the leave policy?"))
# Streaming answer
for partial in agent.chat_stream("Summarise the refund section."):
    print(partial, end="\r")
2. Launch the Enterprise Chat UI Locally
from kerdosai.deployer import Deployer
from kerdosai.rag import KnowledgeBase
kb = KnowledgeBase()
kb.index_documents(["report.pdf"])
deployer = Deployer(model=None, tokenizer=None)
deployer.deploy(
    deployment_type="gradio-rag",
    host="0.0.0.0",
    port=7860,
    hf_token="hf_...",  # or set the HF_TOKEN env var
    knowledge_base=kb,
)
# → Open http://localhost:7860
3. CLI: One-Command RAG Chat Server
# Pre-index files and open the Gradio UI
kerdosai rag-chat \
    --data handbook.pdf faq.docx policy.txt \
    --hf-token hf_... \
    --model meta-llama/Llama-3.1-8B-Instruct \
    --port 7860
# Use HF_TOKEN env var instead of --hf-token
export HF_TOKEN=hf_...
kerdosai rag-chat --data ./company_docs/ --port 7860
4. Fine-Tune a Model (Training Pipeline)
from kerdosai import KerdosAgent
agent = KerdosAgent(
    base_model="meta-llama/Llama-3.1-8B",
    training_data="data/training.csv",
)
metrics = agent.train(epochs=3, batch_size=4, learning_rate=2e-5)
agent.save("./my-finetuned-model")
print(metrics)
5. Deploy Fine-Tuned Model as REST API
from kerdosai import KerdosAgent
agent = KerdosAgent.load("./my-finetuned-model")
agent.deploy(deployment_type="rest", host="0.0.0.0", port=8000)
# POST /generate with JSON body {"text": "...", "max_length": 200}
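A client call to the endpoint above might look like the following minimal sketch. It uses only the Python standard library. The route and the "text" / "max_length" fields come from the comment above; the helper names build_generate_request and post_generate, the localhost URL, and the assumption that the server replies with JSON are illustrative and not confirmed by the package.

```python
import json
import urllib.request


def build_generate_request(prompt, max_length=200, base_url="http://localhost:8000"):
    """Build (but do not send) a POST /generate request.

    Hypothetical helper: only the route and the JSON field names are taken
    from the README; everything else is an assumption.
    """
    body = json.dumps({"text": prompt, "max_length": max_length}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/generate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_generate(request, timeout=60):
    """Send the request; requires the REST server from step 5 to be running."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # Assumption: the server answers with a JSON document.
        return json.loads(response.read().decode("utf-8"))


request = build_generate_request("Summarise our refund policy.")
print(request.get_method(), request.full_url)
```

Building the request separately from sending it keeps the payload easy to inspect before any network traffic happens.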
RAG Module Reference
KnowledgeBase
from kerdosai.rag import KnowledgeBase
kb = KnowledgeBase(
    embedding_model="BAAI/bge-small-en-v1.5",  # SentenceTransformer model
    chunk_size=512,                            # Max chars per chunk
    chunk_overlap=64,                          # Overlap between chunks
    min_score=0.30,                            # Min cosine similarity threshold
)
kb.index_documents(["doc1.pdf", "doc2.docx"]) # Add documents (duplicate-safe)
kb.index_documents(["doc3.txt"]) # Incrementally add more
results = kb.search("What is the refund policy?", top_k=5)
# → [{"source": "doc1.pdf", "text": "...", "score": 0.87}, ...]
print(kb.chunk_count) # Total indexed chunks
print(kb.indexed_sources) # Set of indexed filenames
kb.clear() # Reset index
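To make the chunk_size and chunk_overlap parameters concrete, here is a minimal sketch of character-based chunking with overlap. It is not the package's implementation: chunk_text is a hypothetical helper, and the real splitter may respect sentence or paragraph boundaries.

```python
def chunk_text(text, chunk_size=512, chunk_overlap=64):
    """Split text into character windows that overlap by chunk_overlap.

    Hypothetical helper for illustration; defaults mirror the values
    shown in the KnowledgeBase example.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks


pieces = chunk_text("x" * 1000)
print([len(piece) for piece in pieces])  # [512, 512, 104]
```

With these defaults each window starts 448 characters after the previous one, so the last 64 characters of one chunk reappear at the start of the next.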
RAGAgent
from kerdosai.rag import RAGAgent
agent = RAGAgent(
    hf_token="hf_...",                         # HF API token
    model="meta-llama/Llama-3.1-8B-Instruct",  # LLM model ID
    top_k=5,                                   # Chunks retrieved per query
    embedding_model="BAAI/bge-small-en-v1.5",  # Embedding model
)
agent.index(["report.pdf", "handbook.docx"]) # Index documents
# Blocking
reply = agent.chat("What are the payment terms?")
# Streaming
for partial in agent.chat_stream("Summarise section 3."):
    print(partial, end="\r")
agent.reset_history() # Clear conversation
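The end="\r" in the streaming loop suggests that each yielded value is the full answer so far rather than a delta. Under that assumption (not confirmed by the source), a caller that wants the final text can keep the last value. In the sketch below, fake_chat_stream is a stand-in generator and consume_stream is a hypothetical helper; neither is part of kerdosai.

```python
def fake_chat_stream(answer):
    """Stand-in for a streaming call: yields the answer so far, word by word."""
    words = answer.split()
    for end in range(1, len(words) + 1):
        yield " ".join(words[:end])


def consume_stream(stream, on_update=None):
    """Drain a cumulative stream and return the final, complete text."""
    final = ""
    for partial in stream:
        final = partial  # each partial replaces the previous one
        if on_update is not None:
            on_update(partial)
    return final


updates = []
reply = consume_stream(fake_chat_stream("Payment is due in 30 days."), on_update=updates.append)
print(reply)
```

If the stream turned out to yield deltas instead, the loop would need to concatenate the pieces rather than overwrite them.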
Low-Level RAG API
from kerdosai.rag import (
    load_documents,  # Parse files → [{"source", "text"}]
    build_index,     # List[dict] → VectorIndex
    add_to_index,    # Incrementally extend a VectorIndex
    retrieve,        # VectorIndex + query → top-K chunks
    answer_stream,   # Chunks + HF token → streaming generator
    answer,          # Chunks + HF token → full string
    VectorIndex,     # Dataclass: chunks + faiss index + embedder
)
docs = load_documents(["report.pdf", "policy.docx"])
index = build_index(docs, embedding_model="BAAI/bge-small-en-v1.5")
chunks = retrieve("What is the refund policy?", index, top_k=5)
response = answer("What is the refund policy?", chunks, hf_token="hf_...")
Supported file types: .pdf (PyMuPDF) · .docx (python-docx, incl. tables) · .txt · .md · .csv
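The retrieval step (top-K cosine similarity with a minimum score) can be illustrated without FAISS or an embedding model. The sketch below is a brute-force stand-in using toy 2-D vectors. The "vector" key and the function names are hypothetical; the package itself uses a FAISS index with SentenceTransformer embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat zero vectors as dissimilar to everything
    return dot / (norm_a * norm_b)


def top_k_chunks(query_vec, chunks, top_k=5, min_score=0.30):
    """Score every chunk, drop those below min_score, return the best top_k."""
    scored = []
    for chunk in chunks:
        score = cosine_similarity(query_vec, chunk["vector"])
        if score >= min_score:
            scored.append({"source": chunk["source"], "text": chunk["text"], "score": score})
    scored.sort(key=lambda item: item["score"], reverse=True)
    return scored[:top_k]


# Toy data: real embeddings have hundreds of dimensions.
toy_chunks = [
    {"source": "policy.txt", "text": "Refunds are issued within 14 days.", "vector": [1.0, 0.0]},
    {"source": "faq.docx", "text": "Contact support for billing issues.", "vector": [1.0, 1.0]},
    {"source": "handbook.pdf", "text": "Office hours are 9 to 5.", "vector": [0.0, 1.0]},
]
hits = top_k_chunks([1.0, 0.0], toy_chunks, top_k=2)
print([hit["source"] for hit in hits])  # ['policy.txt', 'faq.docx']
```

The third chunk scores 0.0 against the query and is removed by the min_score threshold before the top-K cut is applied.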
Training API Reference
KerdosAgent
| Method | Description |
|---|---|
| KerdosAgent(base_model, training_data, device=None) | Initialize with a HuggingFace model ID and path to training data |
| .train(epochs, batch_size, learning_rate, **kwargs) | Fine-tune the model; returns metrics dict |
| .deploy(deployment_type, host, port, **kwargs) | Deploy as REST / Docker / Kubernetes / Gradio-RAG |
| .save(output_dir) | Save model + tokenizer to disk |
| .load(model_dir) | Class method: load a saved model |
Trainer
| Method | Description |
|---|---|
| Trainer(model, tokenizer, device, use_wandb=True) | Initialize trainer |
| .train(dataset, epochs, batch_size, learning_rate, ...) | Run HuggingFace training loop |
| .evaluate(dataset, batch_size) | Evaluate and return eval_loss |
DataProcessor
| Method | Description |
|---|---|
| DataProcessor(data_path, max_length=512, text_column="text") | Initialize |
| .prepare_dataset(tokenizer=None) | Load, clean, tokenize → HuggingFace Dataset |
| .validate_data() | Check data quality and print warnings |
Supported training data formats: .csv (with text column) · .json (list of objects with text key)
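As a rough illustration of the two formats, the following sketch writes the same two examples as a CSV file with a text column and as a JSON list of objects with a text key. The file names, the example sentences, and the write_training_files helper are hypothetical; only the column and key name "text" come from the description above.

```python
import csv
import json
import tempfile
from pathlib import Path

# Made-up training examples for illustration only.
examples = [
    {"text": "Question: What is the notice period? Answer: 30 days."},
    {"text": "Question: Who approves leave? Answer: The line manager."},
]


def write_training_files(rows, out_dir):
    """Write rows as training.csv and training.json inside out_dir."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    csv_path = out_dir / "training.csv"
    json_path = out_dir / "training.json"
    with csv_path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=["text"])
        writer.writeheader()  # first line is the "text" column header
        writer.writerows(rows)
    with json_path.open("w", encoding="utf-8") as handle:
        json.dump(rows, handle, ensure_ascii=False, indent=2)
    return csv_path, json_path


csv_path, json_path = write_training_files(examples, tempfile.mkdtemp())
print(csv_path.name, json_path.name)
```

Either file could then be passed as training_data, assuming the default text_column="text" is kept.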
CLI Reference
kerdosai <command> [options]
Commands:
  train      Fine-tune a model on custom data
  deploy     Deploy a trained model
  rag-chat   Launch a RAG document Q&A UI (no local GPU needed)
kerdosai train
kerdosai train \
    --model meta-llama/Llama-3.1-8B \
    --data ./data/training.csv \
    --output ./my-model \
    --epochs 3 \
    --batch-size 4 \
    --learning-rate 2e-5
kerdosai deploy
# REST API
kerdosai deploy --model-dir ./my-model --type rest --port 8000
# Gradio RAG UI (with a fine-tuned local model)
kerdosai deploy --model-dir ./my-model --type gradio-rag --hf-token hf_... --port 7860
# Docker container
kerdosai deploy --model-dir ./my-model --type docker
kerdosai rag-chat
kerdosai rag-chat \
    --data company.pdf handbook.docx faq.txt \
    --hf-token hf_... \
    --model meta-llama/Llama-3.1-8B-Instruct \
    --embedding-model BAAI/bge-small-en-v1.5 \
    --host 0.0.0.0 \
    --port 7860
Deployment Options
| Type | Command | Description |
|---|---|---|
| rest | deploy(deployment_type="rest") | FastAPI REST API on POST /generate |
| gradio-rag | deploy(deployment_type="gradio-rag") | Full Kerdos RAG Chat UI (Gradio) |
| docker | deploy(deployment_type="docker") | Build & run Docker container |
| kubernetes | deploy(deployment_type="kubernetes") | Generate K8s YAML manifests |
Architecture
kerdosai/
├── __init__.py          # KerdosAgent, Trainer, Deployer, DataProcessor, KnowledgeBase, RAGAgent
├── agent.py             # KerdosAgent: orchestrates training + deployment
├── trainer.py           # Trainer: HuggingFace training loop + W&B logging
├── deployer.py          # Deployer: REST / Gradio-RAG / Docker / Kubernetes
├── data_processor.py    # DataProcessor: CSV/JSON loading, tokenization
├── cli.py               # CLI: train / deploy / rag-chat commands
└── rag/
    ├── __init__.py          # Public RAG API surface
    ├── document_loader.py   # PDF / DOCX / TXT / MD / CSV parser
    ├── embedder.py          # FAISS + SentenceTransformer index builder
    ├── retriever.py         # Top-K cosine-similarity retrieval
    ├── chain.py             # HF Inference API streaming + blocking answer
    ├── knowledge_base.py    # KnowledgeBase high-level class
    └── rag_agent.py         # RAGAgent: chat + history management
HuggingFace Space Demo
Try the live RAG demo at huggingface.co/spaces/kerdosdotio/Custom-LLM-Chat
Upload any PDF, DOCX, or TXT file and ask questions. The AI answers only from your documents, never from outside knowledge.
The full kerdosai package lets you run this same UI privately on your own servers with kerdosai rag-chat.
Environment Variables
| Variable | Description | Default |
|---|---|---|
| HF_TOKEN | HuggingFace API token for inference | (none) |
| LLM_MODEL | LLM model ID for generation | meta-llama/Llama-3.1-8B-Instruct |
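One plausible way these variables interact with explicit arguments is sketched below. The precedence (explicit argument first, then environment variable, then default) is an assumption, and resolve_settings is a hypothetical helper rather than part of the package.

```python
import os

# Default taken from the table above.
DEFAULT_LLM_MODEL = "meta-llama/Llama-3.1-8B-Instruct"


def resolve_settings(hf_token=None, model=None, environ=None):
    """Pick the token and model: explicit argument, then env var, then default.

    Assumed precedence for illustration; the package may resolve these
    differently.
    """
    env = os.environ if environ is None else environ
    token = hf_token or env.get("HF_TOKEN")
    if not token:
        raise RuntimeError("Set HF_TOKEN or pass hf_token explicitly")
    return {
        "hf_token": token,
        "model": model or env.get("LLM_MODEL") or DEFAULT_LLM_MODEL,
    }


# Demo with an explicit mapping so the example does not depend on the real environment.
settings = resolve_settings(environ={"HF_TOKEN": "hf_example"})
print(settings["model"])  # meta-llama/Llama-3.1-8B-Instruct
```

Passing a mapping through the environ parameter keeps the lookup easy to test without touching the process environment.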
Real-World Applications
Healthcare
- Clinical documentation automation
- Patient Q&A from medical records
- HIPAA-compliant private deployment
Financial Services
- Policy & compliance Q&A
- Risk report summarisation
- Private on-premise data security
Legal
- Contract review and clause extraction
- Case research from uploaded precedents
- Confidential document handling
Enterprise Internal Tools
- HR handbook chatbot
- IT knowledge base
- Onboarding document Q&A
Requirements
Python >= 3.8
torch >= 2.0.0
transformers >= 4.30.0
faiss-cpu >= 1.7.4
sentence-transformers >= 2.2.2
PyMuPDF >= 1.22.5
python-docx >= 0.8.11
gradio >= 4.0.0
huggingface-hub >= 0.28.0
tenacity >= 8.2.0
Full list in requirements.txt.
Contributing
We welcome contributions! See CONTRIBUTING.md for guidelines.
License
MIT License; see LICENSE for details.
About Kerdos Infrasoft
Kerdos Infrasoft Private Limited · CIN: U62099KA2023PTC182869
kerdos.in · partnership@kerdos.in
We are actively seeking investment & partnerships to build the fully customisable enterprise edition, including private LLM hosting, custom model fine-tuning, data privacy guarantees, and white-label deployments.
Citation
@software{kerdosai2024,
  title  = {KerdosAI: Universal LLM Training \& RAG Agent},
  author = {Kerdos Infrasoft Private Limited},
  year   = {2024},
  url    = {https://github.com/bhaskarvilles/kerdosai},
  note   = {v0.2.1}
}