Universal LLM Training & RAG Agent for HuggingFace

license: apache-2.0

KerdosAI — Universal LLM Training & RAG Agent

Enterprise-grade LLM Training + Retrieval-Augmented Generation (RAG) toolkit.
Fine-tune any HuggingFace model on your data, then deploy it with a full document Q&A chat interface — locally or on cloud.

What's New in v0.2.0

🆕 kerdosai.rag submodule — full RAG pipeline (document loading → FAISS indexing → LLM answering)
🆕 KnowledgeBase — high-level API to index PDF/DOCX/TXT/MD/CSV files
🆕 RAGAgent — streaming and blocking chat with conversation history
🆕 deployment_type="gradio-rag" — launch the HuggingFace Space UI locally with one line
🆕 kerdosai rag-chat CLI command — on-premise RAG UI in one command
Updated dependencies: faiss-cpu, sentence-transformers, PyMuPDF, python-docx, gradio, huggingface-hub, tenacity

Installation

# Standard install
pip install kerdosai

# With all optional extras
pip install "kerdosai[all]"

Requirements: Python 3.8+ · PyTorch 2.0+ · CUDA-compatible GPU (recommended for training)

Quick Start

1. RAG — Document Q&A (no GPU needed)

from kerdosai.rag import KnowledgeBase, RAGAgent

# Index your documents (PDF, DOCX, TXT, MD, CSV)
kb = KnowledgeBase()
kb.index_documents(["handbook.pdf", "faq.docx", "policy.txt"])
print(f"Indexed {kb.chunk_count} chunks from {kb.indexed_sources}")

# Chat with your documents (HF Inference API — no GPU required)
agent = RAGAgent(hf_token="hf_...", knowledge_base=kb)

# Blocking answer
print(agent.chat("What is the leave policy?"))

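# Streaming answer ("\r" redraws the line, which suits partials that each hold the full text so far)
for partial in agent.chat_stream("Summarise the refund section."):
    print(partial, end="\r", flush=True)
print()  # move to a new line once the stream ends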

2. Launch the Enterprise Chat UI Locally

from kerdosai.deployer import Deployer
from kerdosai.rag import KnowledgeBase

kb = KnowledgeBase()
kb.index_documents(["report.pdf"])

deployer = Deployer(model=None, tokenizer=None)
deployer.deploy(
    deployment_type="gradio-rag",
    host="0.0.0.0",
    port=7860,
    hf_token="hf_...",       # or set HF_TOKEN env var
    knowledge_base=kb,
)
# → Open http://localhost:7860

3. CLI — One-Command RAG Chat Server

# Pre-index files and open the Gradio UI
kerdosai rag-chat \
  --data handbook.pdf faq.docx policy.txt \
  --hf-token hf_... \
  --model meta-llama/Llama-3.1-8B-Instruct \
  --port 7860

# Use HF_TOKEN env var instead of --hf-token
export HF_TOKEN=hf_...
kerdosai rag-chat --data ./company_docs/ --port 7860
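
The second command passes a folder instead of individual files. How `kerdosai rag-chat` expands a folder is not documented here, so the following is only a sketch of one plausible approach: a recursive scan filtered by the file types this README lists as supported. `collect_documents` is a hypothetical helper, not part of the package.

```python
from pathlib import Path
from typing import Iterable, List

# Extensions this README lists as supported by the RAG loader.
SUPPORTED_SUFFIXES = {".pdf", ".docx", ".txt", ".md", ".csv"}


def collect_documents(paths: Iterable[str]) -> List[str]:
    """Expand files and folders into a sorted list of supported files.

    Hypothetical helper: the real CLI may walk folders differently.
    """
    found = set()
    for raw in paths:
        path = Path(raw)
        if path.is_dir():
            # Assumption: folders are scanned recursively.
            candidates = [p for p in path.rglob("*") if p.is_file()]
        elif path.is_file():
            candidates = [path]
        else:
            continue  # skip paths that do not exist
        for candidate in candidates:
            if candidate.suffix.lower() in SUPPORTED_SUFFIXES:
                found.add(str(candidate))
    return sorted(found)
```

The resulting list could then be handed to `KnowledgeBase.index_documents`, which accepts a list of file paths in the examples above.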

4. Fine-Tune a Model (Training Pipeline)

from kerdosai import KerdosAgent

agent = KerdosAgent(
    base_model="meta-llama/Llama-3.1-8B",
    training_data="data/training.csv",
)

metrics = agent.train(epochs=3, batch_size=4, learning_rate=2e-5)
agent.save("./my-finetuned-model")
print(metrics)

5. Deploy Fine-Tuned Model as REST API

from kerdosai import KerdosAgent

agent = KerdosAgent.load("./my-finetuned-model")
agent.deploy(deployment_type="rest", host="0.0.0.0", port=8000)
# POST /generate → {"text": "...", "max_length": 200}
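
A client for this endpoint might look like the sketch below. It assumes the JSON in the comment above is the request body and that the server is reachable at the host and port passed to `deploy`. The response schema is not documented here, so the sketch returns the decoded JSON unchanged. `build_generate_request` and `generate` are illustrative names, not part of the package.

```python
import json
from urllib import request


def build_generate_request(base_url: str, text: str, max_length: int = 200) -> request.Request:
    """Build a POST request for the /generate endpoint.

    Assumption: {"text": ..., "max_length": ...} is the request body.
    """
    payload = json.dumps({"text": text, "max_length": max_length}).encode("utf-8")
    return request.Request(
        url=base_url.rstrip("/") + "/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(base_url: str, text: str, max_length: int = 200, timeout: float = 60.0):
    """Send the request and return the decoded JSON response as-is."""
    req = build_generate_request(base_url, text, max_length)
    with request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the server from the snippet above running, `generate("http://localhost:8000", "Hello")` would send the request.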

RAG Module Reference

`KnowledgeBase`

from kerdosai.rag import KnowledgeBase

kb = KnowledgeBase(
    embedding_model="BAAI/bge-small-en-v1.5",  # SentenceTransformer model
    chunk_size=512,                              # Max chars per chunk
    chunk_overlap=64,                            # Overlap between chunks
    min_score=0.30,                              # Min cosine similarity threshold
)

kb.index_documents(["doc1.pdf", "doc2.docx"])   # Add documents (duplicate-safe)
kb.index_documents(["doc3.txt"])                 # Incrementally add more

results = kb.search("What is the refund policy?", top_k=5)
# → [{"source": "doc1.pdf", "text": "...", "score": 0.87}, ...]

print(kb.chunk_count)       # Total indexed chunks
print(kb.indexed_sources)   # Set of indexed filenames
kb.clear()                  # Reset index
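
To make the interaction between `chunk_size` and `chunk_overlap` concrete, here is a minimal sketch of character-window chunking. It assumes chunks are plain character windows; the package may split on sentence or paragraph boundaries instead. `chunk_text` is a hypothetical helper.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 512, chunk_overlap: int = 64) -> List[str]:
    """Split text into character windows that overlap by chunk_overlap chars."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks
```

With the values shown above, each window advances by 448 characters, so a 1,000-character document yields three chunks starting at offsets 0, 448, and 896.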

`RAGAgent`

from kerdosai.rag import RAGAgent

agent = RAGAgent(
    hf_token="hf_...",                              # HF API token
    model="meta-llama/Llama-3.1-8B-Instruct",       # LLM model ID
    top_k=5,                                         # Chunks retrieved per query
    embedding_model="BAAI/bge-small-en-v1.5",        # Embedding model
)

agent.index(["report.pdf", "handbook.docx"])         # Index documents

# Blocking
reply = agent.chat("What are the payment terms?")

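# Streaming ("\r" redraws the line, which suits partials that each hold the full text so far)
for partial in agent.chat_stream("Summarise section 3."):
    print(partial, end="\r", flush=True)
print()  # move to a new line once the stream ends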

agent.reset_history()                                # Clear conversation

Low-Level RAG API

from kerdosai.rag import (
    load_documents,    # Parse files → [{"source", "text"}]
    build_index,       # List[dict] → VectorIndex
    add_to_index,      # Incrementally extend a VectorIndex
    retrieve,          # VectorIndex + query → top-K chunks
    answer_stream,     # Chunks + HF token → streaming generator
    answer,            # Chunks + HF token → full string
    VectorIndex,       # Dataclass: chunks + faiss index + embedder
)

docs     = load_documents(["report.pdf", "policy.docx"])
index    = build_index(docs, embedding_model="BAAI/bge-small-en-v1.5")
chunks   = retrieve("What is the refund policy?", index, top_k=5)
response = answer("What is the refund policy?", chunks, hf_token="hf_...")

Supported file types: .pdf (PyMuPDF) · .docx (python-docx, incl. tables) · .txt · .md · .csv
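
The retrieval step ranks chunks by cosine similarity and keeps the best `top_k`, with `min_score` acting as a cut-off. The package uses FAISS for this; the sketch below swaps in a brute-force pure-Python scan purely to show the ranking logic. `cosine` and `top_k_by_cosine` are hypothetical helpers, and the `"vector"` key on each chunk is an assumption made for the example.

```python
import math
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k_by_cosine(
    query_vec: Sequence[float],
    chunks: List[Dict],
    top_k: int = 5,
    min_score: float = 0.30,
) -> List[Dict]:
    """Rank chunks by cosine similarity, drop those below min_score, keep top_k.

    Each chunk is assumed to carry "source", "text", and a precomputed
    "vector"; results use the {"source", "text", "score"} shape of kb.search.
    """
    scored = [
        {"source": c["source"], "text": c["text"], "score": cosine(query_vec, c["vector"])}
        for c in chunks
    ]
    kept = [s for s in scored if s["score"] >= min_score]
    kept.sort(key=lambda s: s["score"], reverse=True)
    return kept[:top_k]
```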

Training API Reference

`KerdosAgent`

Method	Description
`KerdosAgent(base_model, training_data, device=None)`	Initialize with a HuggingFace model ID and path to training data
`.train(epochs, batch_size, learning_rate, **kwargs)`	Fine-tune the model; returns metrics dict
`.deploy(deployment_type, host, port, **kwargs)`	Deploy as REST / Docker / Kubernetes / Gradio-RAG
`.save(output_dir)`	Save model + tokenizer to disk
`.load(model_dir)`	Class method — load a saved model

`Trainer`

Method	Description
`Trainer(model, tokenizer, device, use_wandb=True)`	Initialize trainer
`.train(dataset, epochs, batch_size, learning_rate, ...)`	Run HuggingFace training loop
`.evaluate(dataset, batch_size)`	Evaluate and return `eval_loss`

`DataProcessor`

Method	Description
`DataProcessor(data_path, max_length=512, text_column="text")`	Initialize
`.prepare_dataset(tokenizer=None)`	Load, clean, tokenize → HuggingFace `Dataset`
`.validate_data()`	Check data quality and print warnings

Supported training data formats: .csv (with a `text` column) · .json (a list of objects, each with a `text` key)
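
As an illustration of the two input shapes, the sketch below reads the text field from either format using only the standard library. It is not the `DataProcessor` implementation: `load_texts` is a hypothetical helper, and skipping empty rows is an assumption about what cleaning might involve.

```python
import csv
import json
from pathlib import Path
from typing import List


def load_texts(data_path: str, text_column: str = "text") -> List[str]:
    """Read non-empty text rows from a .csv or .json training file."""
    path = Path(data_path)
    suffix = path.suffix.lower()
    if suffix == ".csv":
        with path.open(newline="", encoding="utf-8") as handle:
            rows = list(csv.DictReader(handle))
    elif suffix == ".json":
        with path.open(encoding="utf-8") as handle:
            rows = json.load(handle)
        if not isinstance(rows, list):
            raise ValueError("expected a JSON list of objects")
    else:
        raise ValueError(f"unsupported training data format: {suffix}")
    texts = []
    for row in rows:
        value = row.get(text_column) if isinstance(row, dict) else None
        if isinstance(value, str) and value.strip():
            texts.append(value.strip())  # assumption: blank rows are dropped
    return texts
```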

CLI Reference

kerdosai <command> [options]

Commands:
  train        Fine-tune a model on custom data
  deploy       Deploy a trained model
  rag-chat     Launch a RAG document Q&A UI (no local GPU needed)

`kerdosai train`

kerdosai train \
  --model meta-llama/Llama-3.1-8B \
  --data ./data/training.csv \
  --output ./my-model \
  --epochs 3 \
  --batch-size 4 \
  --learning-rate 2e-5

`kerdosai deploy`

# REST API
kerdosai deploy --model-dir ./my-model --type rest --port 8000

# Gradio RAG UI (with a fine-tuned local model)
kerdosai deploy --model-dir ./my-model --type gradio-rag --hf-token hf_... --port 7860

# Docker container
kerdosai deploy --model-dir ./my-model --type docker

`kerdosai rag-chat`

kerdosai rag-chat \
  --data company.pdf handbook.docx faq.txt \
  --hf-token hf_... \
  --model meta-llama/Llama-3.1-8B-Instruct \
  --embedding-model BAAI/bge-small-en-v1.5 \
  --host 0.0.0.0 \
  --port 7860
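
For readers who want to see how the flags above map to parsed values, here is an `argparse` sketch of a `rag-chat` subcommand. It only mirrors the options shown in this README; the defaults are assumptions taken from the examples, and the package's actual `cli.py` may be structured differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare a rag-chat subcommand with the flags shown above.

    Illustrative only: defaults here are assumptions, not the package's.
    """
    parser = argparse.ArgumentParser(prog="kerdosai")
    commands = parser.add_subparsers(dest="command", required=True)

    rag = commands.add_parser("rag-chat", help="Launch a RAG document Q&A UI")
    rag.add_argument("--data", nargs="+", default=[], help="files or folders to pre-index")
    rag.add_argument("--hf-token", default=None, help="falls back to HF_TOKEN if omitted")
    rag.add_argument("--model", default="meta-llama/Llama-3.1-8B-Instruct")
    rag.add_argument("--embedding-model", default="BAAI/bge-small-en-v1.5")
    rag.add_argument("--host", default="0.0.0.0")
    rag.add_argument("--port", type=int, default=7860)
    return parser
```

Because `--data` uses `nargs="+"`, several paths can follow a single flag, matching the invocation above.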

Deployment Options

Type	Command	Description
`rest`	`deploy(deployment_type="rest")`	FastAPI REST API on `POST /generate`
`gradio-rag`	`deploy(deployment_type="gradio-rag")`	Full Kerdos RAG Chat UI (Gradio)
`docker`	`deploy(deployment_type="docker")`	Build & run Docker container
`kubernetes`	`deploy(deployment_type="kubernetes")`	Generate K8s YAML manifests

Architecture

kerdosai/
├── __init__.py          # KerdosAgent, Trainer, Deployer, DataProcessor, KnowledgeBase, RAGAgent
├── agent.py             # KerdosAgent — orchestrates training + deployment
├── trainer.py           # Trainer — HuggingFace training loop + W&B logging
├── deployer.py          # Deployer — REST / Gradio-RAG / Docker / Kubernetes
├── data_processor.py    # DataProcessor — CSV/JSON loading, tokenization
├── cli.py               # CLI — train / deploy / rag-chat commands
└── rag/
    ├── __init__.py          # Public RAG API surface
    ├── document_loader.py   # PDF / DOCX / TXT / MD / CSV parser
    ├── embedder.py          # FAISS + SentenceTransformer index builder
    ├── retriever.py         # Top-K cosine-similarity retrieval
    ├── chain.py             # HF Inference API streaming + blocking answer
    ├── knowledge_base.py    # KnowledgeBase high-level class
    └── rag_agent.py         # RAGAgent — chat + history management

HuggingFace Space Demo

Try the live RAG demo at 👉 huggingface.co/spaces/kerdosdotio/Custom-LLM-Chat

Upload any PDF, DOCX, or TXT file and ask questions. The AI answers only from your documents — never from outside knowledge.

The full kerdosai package lets you run this same UI privately on your own servers with kerdosai rag-chat.
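
Restricting answers to the uploaded documents is typically done by placing the retrieved chunks in the prompt and instructing the model to use nothing else. The sketch below shows that pattern; the wording of the instruction and the `build_grounded_prompt` helper are assumptions for illustration, not the prompt the package actually sends.

```python
from typing import Dict, List


def build_grounded_prompt(question: str, chunks: List[Dict[str, str]]) -> str:
    """Compose a prompt that restricts the model to the retrieved context."""
    if not chunks:
        context = "(no relevant passages were retrieved)"
    else:
        context = "\n\n".join(
            f"[{i}] ({c['source']}) {c['text']}" for i, c in enumerate(chunks, start=1)
        )
    return (
        "Answer using only the context below. "
        "If the context does not contain the answer, say you do not know.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
```

An instruction like this steers the model but cannot strictly guarantee that no outside knowledge leaks into an answer, which is why numbering the passages by source helps when checking a reply.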

Environment Variables

Variable	Description	Default
`HF_TOKEN`	HuggingFace API token for inference	—
`LLM_MODEL`	LLM model ID for generation	`meta-llama/Llama-3.1-8B-Instruct`
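
One way these variables could be resolved alongside explicit arguments is sketched below. The precedence (explicit argument, then environment variable, then the default from the table) is an assumption, and `resolve_settings` is a hypothetical helper.

```python
import os
from typing import Mapping, Optional, Tuple

DEFAULT_LLM_MODEL = "meta-llama/Llama-3.1-8B-Instruct"  # default listed in the table


def resolve_settings(
    hf_token: Optional[str] = None,
    model: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> Tuple[str, str]:
    """Return (token, model), preferring explicit arguments over the environment."""
    env = os.environ if env is None else env
    token = hf_token or env.get("HF_TOKEN")
    if not token:
        raise RuntimeError("No HuggingFace token: pass hf_token or set HF_TOKEN")
    return token, model or env.get("LLM_MODEL") or DEFAULT_LLM_MODEL
```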

Real-World Applications

Healthcare

Clinical documentation automation
Patient Q&A from medical records
HIPAA-compliant private deployment

Financial Services

Policy & compliance Q&A
Risk report summarisation
Private on-premise data security

Legal

Contract review and clause extraction
Case research from uploaded precedents
Confidential document handling

Enterprise Internal Tools

HR handbook chatbot
IT knowledge base
Onboarding document Q&A

Requirements

Python >= 3.8
torch >= 2.0.0
transformers >= 4.30.0
faiss-cpu >= 1.7.4
sentence-transformers >= 2.2.2
PyMuPDF >= 1.22.5
python-docx >= 0.8.11
gradio >= 4.0.0
huggingface-hub >= 0.28.0
tenacity >= 8.2.0

Full list in requirements.txt.

Contributing

We welcome contributions! See CONTRIBUTING.md for guidelines.

License

MIT License — see LICENSE for details.

About Kerdos Infrasoft

Kerdos Infrasoft Private Limited · CIN: U62099KA2023PTC182869
🌐 kerdos.in · 📬 partnership@kerdos.in

We are actively seeking investment & partnerships to build the fully customisable enterprise edition — including private LLM hosting, custom model fine-tuning, data privacy guarantees, and white-label deployments.

Citation

@software{kerdosai2024,
  title  = {KerdosAI: Universal LLM Training \& RAG Agent},
  author = {Kerdos Infrasoft Private Limited},
  year   = {2024},
  url    = {https://github.com/bhaskarvilles/kerdosai},
  note   = {v0.2.1}
}

Release history

0.2.1 (Mar 3, 2026)
0.2.0 (Mar 3, 2026)
0.1.1 (Apr 2, 2025)
0.1.0 (Apr 2, 2025)

