Production-ready RAG framework built on LangChain — multi-tenant chatbots with streaming, tool calling, agent mode, FAISS vector search, and persistent MongoDB memory
LongTrainer 1.3.0 — Production-Ready RAG Framework
Multi-tenant bots, streaming, tools, and persistent memory — all batteries included.
What is LongTrainer?
LongTrainer is a production-ready RAG framework that turns your documents into intelligent, multi-tenant chatbots — with 5 lines of code.
Built on top of LangChain, LongTrainer handles the hard parts that every production RAG system needs: multi-bot isolation, persistent MongoDB memory, FAISS vector search, streaming responses, custom tool calling, chat encryption, and vision support — so you don't have to wire them together yourself.
Why LongTrainer over raw LangChain / LlamaIndex?
| Problem | LangChain / LlamaIndex | LongTrainer |
|---|---|---|
| Multi-bot management | DIY — manage state per bot | Built-in: initialize_bot_id() → isolated bots |
| Persistent chat memory | Wire MongoDB/Redis yourself | Built-in: MongoDB-backed, encrypted, restorable |
| Document ingestion | Assemble loaders + splitters | One-liner: add_document_from_path(path, bot_id) |
| Streaming responses | Implement astream yourself | get_response(stream=True) yields chunks |
| Custom tool calling | Define tools, build agent | add_tool(my_tool) — plug and play |
| Web search augmentation | Find and integrate search | Built-in toggle: web_search=True |
| Vision chat | Complex multi-modal setup | get_vision_response() — pass images |
| Self-improving from chats | Not a concept | train_chats() feeds Q&A back into KB |
| Encryption at rest | DIY | encrypt_chats=True — Fernet out of the box |
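To make the multi-bot row concrete, here is a minimal sketch of a per-tenant registry that hands each tenant its own bot ID. The `TenantBots` class and `fake_initialize_bot_id` are hypothetical helpers, not part of LongTrainer; in real use you would pass `trainer.initialize_bot_id` as the factory, assuming it returns a hashable ID.

```python
import itertools
from typing import Callable, Dict


class TenantBots:
    """Map tenant names to bot IDs, creating each bot ID once on first use."""

    def __init__(self, new_bot_id: Callable[[], str]):
        self._new_bot_id = new_bot_id  # e.g. trainer.initialize_bot_id
        self._bots: Dict[str, str] = {}

    def bot_for(self, tenant: str) -> str:
        """Return the tenant's bot ID, allocating one if none exists yet."""
        if tenant not in self._bots:
            self._bots[tenant] = self._new_bot_id()
        return self._bots[tenant]


# Hypothetical stand-in for trainer.initialize_bot_id so the sketch runs offline.
_counter = itertools.count(1)


def fake_initialize_bot_id() -> str:
    return f"bot-{next(_counter)}"


registry = TenantBots(fake_initialize_bot_id)
acme_bot = registry.bot_for("acme")
globex_bot = registry.bot_for("globex")
```

Repeated calls for the same tenant return the same ID, so documents and chats added under that ID stay separate from other tenants.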
Installation
pip install longtrainer
With agent/tool-calling support (optional):
pip install longtrainer[agent]
With observability & hallucination detection (optional):
pip install longtrainer[tracer]
System Dependencies
Linux (Ubuntu/Debian)
sudo apt install libmagic-dev poppler-utils tesseract-ocr qpdf libreoffice pandoc
macOS
brew install libmagic poppler tesseract qpdf libreoffice pandoc
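Before ingesting documents, it can help to confirm that these tools are actually on your PATH. The sketch below uses only the standard library; the executable names (`pdftotext` for poppler, `soffice` for LibreOffice) are assumptions about what each package installs and may differ on your system. libmagic is a shared library rather than a command, so it is not checked.

```python
import shutil

# Executable names assumed to be provided by the packages listed above.
REQUIRED_COMMANDS = {
    "poppler": "pdftotext",
    "tesseract": "tesseract",
    "qpdf": "qpdf",
    "libreoffice": "soffice",
    "pandoc": "pandoc",
}


def find_missing_commands(commands=REQUIRED_COMMANDS):
    """Return the sorted package labels whose executable is not on PATH."""
    return sorted(
        label for label, exe in commands.items() if shutil.which(exe) is None
    )


if __name__ == "__main__":
    missing = find_missing_commands()
    print("Missing:", ", ".join(missing) if missing else "none")
```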
Quick Start 🚀
1. Zero-Code CLI & API Server (New in 1.2.3!)
Manage bots, chat, and run a production API directly from your terminal—no Python required.
A. Interactive Terminal Chat
# 1. Initialize a new project and generate longtrainer.yaml
longtrainer init
# 2. Create a new bot
longtrainer bot create --prompt "You are a helpful assistant."
# 3. Add a document (PDF, link, etc.)
longtrainer add-doc <bot_id> /path/to/document.pdf
# 4. Start chatting!
longtrainer chat <bot_id>
B. FastAPI REST Server
Start a production-ready API server backed by your LongTrainer bots:
longtrainer serve
This starts a FastAPI server running on http://localhost:8000 with 18 REST endpoints, including:
- /health
- /bots (CRUD)
- /bots/{id}/documents/path (Ingest files)
- /bots/{id}/chats (Create sessions)
- /bots/{id}/chats/{chat_id} (Chat and Streaming)
Visit http://localhost:8000/docs to see the auto-generated Swagger UI and test the API directly!
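As a rough illustration of calling the server from Python, the following standard-library sketch builds endpoint URLs and fetches one as JSON. The helper names `endpoint` and `get_json` are hypothetical, and the response is assumed to be JSON; check the Swagger UI for the actual request and response schemas.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # address used by `longtrainer serve`


def endpoint(path: str, base_url: str = BASE_URL) -> str:
    """Join the server base URL and an endpoint path with exactly one slash."""
    return base_url.rstrip("/") + "/" + path.lstrip("/")


def get_json(path: str, base_url: str = BASE_URL, timeout: float = 5.0):
    """GET an endpoint and decode the body as JSON (assumes a JSON response)."""
    with urllib.request.urlopen(endpoint(path, base_url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the server running, `get_json("/health")` should return whatever the health endpoint reports.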
2. Python SDK — Default RAG Mode
from longtrainer.trainer import LongTrainer
import os
os.environ["OPENAI_API_KEY"] = "sk-..."
# Initialize
trainer = LongTrainer(mongo_endpoint="mongodb://localhost:27017/")
bot_id = trainer.initialize_bot_id()
# Add documents (PDF, DOCX, CSV, HTML, MD, TXT, URLs, YouTube, Wikipedia)
trainer.add_document_from_path("path/to/your/data.pdf", bot_id)
# Create bot and start chatting
trainer.create_bot(bot_id)
chat_id = trainer.new_chat(bot_id)
# Get response
answer, sources = trainer.get_response("What is this document about?", bot_id, chat_id)
print(answer)
Streaming Responses
# Stream tokens in real-time
for chunk in trainer.get_response("Summarize the key points", bot_id, chat_id, stream=True):
    print(chunk, end="", flush=True)
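If you also need the complete answer after streaming (for logging, say), one option is to record chunks as they pass through. This is a sketch that assumes the chunks are plain strings; `tee_stream` is a hypothetical helper and `fake_stream` stands in for `trainer.get_response(..., stream=True)` so the example runs without a bot.

```python
from typing import Iterable, Iterator, List


def tee_stream(chunks: Iterable[str], sink: List[str]) -> Iterator[str]:
    """Yield each chunk unchanged while also appending it to `sink`."""
    for chunk in chunks:
        sink.append(chunk)
        yield chunk


def fake_stream() -> Iterator[str]:
    """Stand-in for trainer.get_response(..., stream=True)."""
    yield from ["Key ", "points: ", "A, B."]


parts: List[str] = []
for chunk in tee_stream(fake_stream(), parts):
    print(chunk, end="", flush=True)
print()

full_answer = "".join(parts)  # complete text, available once the stream ends
```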
Async Streaming
# Must run inside an async function (for example, one started with asyncio.run)
async for chunk in trainer.aget_response("Explain the methodology", bot_id, chat_id):
    print(chunk, end="", flush=True)
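Because `async for` is only valid inside a coroutine, a script needs an entry point such as `asyncio.run`. The sketch below shows that wiring with a hypothetical `fake_astream` generator standing in for `trainer.aget_response(...)`, again assuming string chunks.

```python
import asyncio
from typing import AsyncIterator


async def fake_astream() -> AsyncIterator[str]:
    """Stand-in for trainer.aget_response(query, bot_id, chat_id)."""
    for chunk in ["The ", "method ", "is X."]:
        await asyncio.sleep(0)  # yield control, as a real network stream would
        yield chunk


async def collect(stream: AsyncIterator[str]) -> str:
    """Consume an async stream of text chunks and return the joined text."""
    parts = []
    async for chunk in stream:
        parts.append(chunk)
    return "".join(parts)


answer = asyncio.run(collect(fake_astream()))
print(answer)
```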
AgentBot automatically routes questions to tools like web search when necessary.
🌟 NEW: Dynamic ZERO CODE Tools
LongTrainer V2 now integrates LangChain's massive dynamic tool ecosystem natively:
trainer.create_bot(
    "agent-id",
    agent_mode=True,
    tools=["tavily_search_results_json", "wikipedia", "arxiv", "PythonREPLTool", "yahoo_finance_news"]
)
LongTrainer dynamically imports and initializes any string-named tool available through langchain.agents.load_tools on the backend.
You may still register custom tools globally or per-bot explicitly:
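from langchain.tools import tool
@tool
def get_weather(location: str) -> str:
    """Return a weather summary for a location."""
    # Placeholder body: the original example is truncated here, so plug in a real lookup.
    return f"Weather lookup for {location} is not implemented."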
Agent Mode — With Custom Tools
from longtrainer.tools import web_search
from langchain_core.tools import tool
# Add built-in web search tool
trainer.add_tool(web_search, bot_id)
# Add your own custom tool
@tool
def calculate(expression: str) -> str:
    """Evaluate a math expression."""
    # eval() runs arbitrary Python; acceptable for a demo, unsafe on untrusted input
    return str(eval(expression))
trainer.add_tool(calculate, bot_id)
# Create bot in agent mode
trainer.create_bot(bot_id, agent_mode=True)
chat_id = trainer.new_chat(bot_id)
response, _ = trainer.get_response("What is 42 * 17?", bot_id, chat_id)
print(response)
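The `calculate` tool above relies on `eval`, which executes arbitrary Python and is risky when the expression comes from an LLM or a user. One possible alternative, sketched here with the standard `ast` module, only accepts numeric literals and basic arithmetic. `safe_calculate` is a hypothetical replacement, not something LongTrainer ships; it could be wrapped with `@tool` the same way.

```python
import ast
import operator

# Whitelisted operators; anything else is rejected.
_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.FloorDiv: operator.floordiv,
    ast.Mod: operator.mod,
}
_UNARY = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def safe_calculate(expression: str) -> str:
    """Evaluate +, -, *, /, //, % over numeric literals; raise ValueError otherwise."""

    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            return _BINARY[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
            return _UNARY[type(node.op)](_eval(node.operand))
        raise ValueError(f"Unsupported expression: {expression!r}")

    try:
        tree = ast.parse(expression, mode="eval")
    except SyntaxError as exc:
        raise ValueError(f"Invalid expression: {expression!r}") from exc
    return str(_eval(tree))
```

Exponentiation is left out on purpose, since very large powers can stall the process; function calls and attribute access are rejected with `ValueError`.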
Vision Chat
vision_id = trainer.new_vision_chat(bot_id)
response, sources = trainer.get_vision_response(
    "Describe what you see in this image",
    image_paths=["photo.jpg"],
    bot_id=bot_id,
    vision_chat_id=vision_id,
)
print(response)
Per-Bot Customization
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from longtrainer.tools import web_search
# Each bot can have its own LLM, embeddings, and retrieval config
trainer.create_bot(
    bot_id,
    llm=ChatOpenAI(model="gpt-4o-mini", temperature=0.2),
    embedding_model=OpenAIEmbeddings(model="text-embedding-3-small"),
    num_k=5,  # retrieve 5 docs per query
    prompt_template="You are a helpful legal assistant. {context}",
    agent_mode=True,  # enable tool calling
    tools=[web_search],
)
Features ✨
Core
- ✅ Dual Mode: RAG (LCEL chain) for simple Q&A, Agent (LangGraph) for tool calling
- ✅ Streaming Responses: Sync and async streaming out of the box
- ✅ Custom Tool Calling: Add any LangChain @tool (web search, document reader, or your own)
- ✅ Multi-Bot Management: Isolated bots with independent sessions, data, and configs
- ✅ Persistent Memory: MongoDB-backed chat history, fully restorable
- ✅ Chat Encryption: Fernet encryption for stored conversations
- ✅ Observability & Tracing: Native integration with LongTracer for logging spans and hallucination detection (pip install longtrainer[tracer])
Document Ingestion
- ✅ Standard Formats: PDF, DOCX, CSV, HTML, Markdown, TXT
- ✅ Web & Crawling: add_document_from_link(), add_document_from_query(), add_document_from_crawl()
- ✅ Cloud & Enterprise: S3 (add_document_from_aws_s3), Google Drive (add_document_from_google_drive), Confluence (add_document_from_confluence)
- ✅ Structured Data: Local Directory (add_document_from_directory), JSON & JQ (add_document_from_json), GitHub Repo (add_document_from_github)
- ✅ Dynamic Integrations: Inject any LangChain document loader class dynamically via add_document_from_dynamic_loader()
RAG Pipeline & Vector DBs
- ✅ Vector Databases: FAISS, Pinecone, Chroma, Qdrant, PGVector, MongoDB Atlas, Milvus, Elasticsearch, Weaviate
- ✅ Multi-Query Ensemble Retrieval: Generates alternative queries for better recall
- ✅ Self-Improving Memory: train_chats() feeds past Q&A back into the knowledge base
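To give a feel for how results from several alternative queries can be merged into one ranking, here is a sketch of reciprocal rank fusion, a common fusion technique. Whether LongTrainer's ensemble mode uses exactly this method is an assumption; the function name, the constant `c = 60`, and the document IDs are illustrative only.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], c: int = 60) -> List[str]:
    """Fuse ranked lists of document IDs; score(d) = sum over lists of 1 / (c + rank)."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (c + rank)
    # Sort by descending score; ties are broken by document ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


fused = reciprocal_rank_fusion([
    ["doc_a", "doc_b", "doc_c"],  # results for the original query
    ["doc_b", "doc_d"],           # results for a rephrased query
    ["doc_b", "doc_a"],           # results for another rephrasing
])
print(fused)
```

Documents that appear near the top of several lists rise to the front, which is why generating alternative phrasings can improve recall without flooding the context with duplicates.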
Customization
- ✅ Per-bot LLM — use different models for different bots
- ✅ Per-bot Embeddings — custom embedding models per bot
- ✅ Per-bot Retrieval Config: custom num_k, chunk_size, chunk_overlap
- ✅ Custom Prompt Templates: full control over system prompts
- ✅ Vision Chat — GPT-4 Vision support with image understanding
Works with All LangChain-Compatible LLMs
- ✅ OpenAI (default)
- ✅ Anthropic
- ✅ Google VertexAI / Gemini
- ✅ AWS Bedrock
- ✅ HuggingFace
- ✅ Groq
- ✅ Together AI
- ✅ Ollama (local models)
- ✅ Any BaseChatModel implementation
API Reference
LongTrainer — Main Class
trainer = LongTrainer(
    mongo_endpoint="mongodb://localhost:27017/",
    llm=None,                 # default: ChatOpenAI(model="gpt-4o-2024-08-06")
    embedding_model=None,     # default: OpenAIEmbeddings()
    prompt_template=None,     # custom system prompt
    max_token_limit=32000,    # conversation memory limit
    num_k=3,                  # docs to retrieve per query
    chunk_size=2048,          # text splitter chunk size
    chunk_overlap=200,        # text splitter overlap
    ensemble=False,           # enable multi-query ensemble retrieval
    encrypt_chats=False,      # enable Fernet encryption
    encryption_key=None,      # custom encryption key (auto-generated if None)
    enable_tracer=False,      # enable LongTracer observability
    tracer_backend="mongo",   # tracer backend ('mongo', 'sqlite', 'memory')
    tracer_verify=True,       # run CitationVerifier (hallucination detection)
    tracer_verbose=False,     # print tracer spans to console
    tracer_threshold=0.5,     # confidence threshold for tracer (0-1)
)
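To show how `chunk_size` and `chunk_overlap` interact, the sketch below splits text into fixed-size windows that share an overlap. It is a simplification: it counts characters and ignores separators, and the splitter LongTrainer actually uses may behave differently. `split_with_overlap` is a hypothetical helper.

```python
from typing import List


def split_with_overlap(text: str, chunk_size: int = 2048, chunk_overlap: int = 200) -> List[str]:
    """Split text into character windows of chunk_size that share chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


chunks = split_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=1)
print(chunks)  # ['abcd', 'defg', 'ghij']
```

A larger overlap keeps more context across chunk boundaries at the cost of storing and embedding more text.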
Key Methods
| Method | Description |
|---|---|
| initialize_bot_id() | Create a new bot, returns bot_id |
| create_bot(bot_id, ...) | Build the bot from loaded documents |
| load_bot(bot_id) | Restore an existing bot from MongoDB + FAISS |
| new_chat(bot_id) | Start a new chat session, returns chat_id |
| get_response(query, bot_id, chat_id, stream=False) | Get response (or stream) |
| aget_response(query, bot_id, chat_id) | Async streaming response |
| add_document_from_path(path, bot_id) | Ingest a file |
| add_document_from_link(links, bot_id) | Ingest URLs / YouTube links |
| add_tool(tool, bot_id) | Register a tool for a bot |
| remove_tool(tool_name, bot_id) | Remove a tool |
| list_tools(bot_id) | List registered tools |
| train_chats(bot_id) | Self-improve from chat history |
| new_vision_chat(bot_id) | Start a vision chat session |
| get_vision_response(query, images, bot_id, vision_id) | Vision response |
Migration from 0.3.4
LongTrainer 1.0.0 is a major upgrade with breaking changes:
| 0.3.4 | 1.0.0 |
|---|---|
| ConversationalRetrievalChain | LCEL chain (RAGBot) or LangGraph agent (AgentBot) |
| requirements.txt + setup.py | pyproject.toml (UV/pip compatible) |
| No streaming | stream=True or aget_response() |
| No tool calling | add_tool() + agent_mode=True |
| langchain.memory | langchain_core.chat_history |
| Fixed LLM for all bots | Per-bot LLM, embeddings, and config |
Upgrade path:
pip install --upgrade longtrainer
The core API (initialize_bot_id, create_bot, new_chat, get_response) remains the same — existing code should work with minimal changes. The main difference is get_response() now returns (answer, sources) instead of (answer, sources, web_sources).
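If the same calling code has to run against both versions during a migration, a small shim can absorb the change in return shape. This is a sketch; `unpack_response` is a hypothetical helper, and it simply discards the old `web_sources` element.

```python
def unpack_response(result):
    """Return (answer, sources) from either the new 2-tuple or the old 3-tuple."""
    if len(result) == 2:
        answer, sources = result
    elif len(result) == 3:
        answer, sources, _web_sources = result  # old shape; web_sources is dropped
    else:
        raise ValueError(f"Unexpected response shape with {len(result)} items")
    return answer, sources
```

Usage would look like `answer, sources = unpack_response(trainer.get_response(query, bot_id, chat_id))`.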
Citation
@misc{longtrainer,
author = {Endevsols},
title = {LongTrainer: Production-Ready RAG Framework},
year = {2024},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/ENDEVSOLS/Long-Trainer}},
}
License
Contributing
We welcome contributions! See CONTRIBUTING.md for guidelines.