Skip to main content

Production-ready Telegram FAQ bot with Russian LLMs, RAG, and multi-provider fallback

Project description

README.md - Universal Telegram Chatbot

PyPI version Python Versions License: MIT

Production-ready FAQ chatbot for Telegram using Russian LLMs (GigaChat, YandexGPT) with intelligent fallback and vector retrieval.

๐ŸŽฏ What's This?

A configurable Telegram chatbot that answers employee/customer questions using:

  • Multi-LLM Orchestrator: Your router managing GigaChat + YandexGPT with fallback
  • LangChain: RAG chains for FAQ retrieval + generation
  • FAISS: Fast vector search for document similarity
  • YAML Config: Add new modes without touching code
User Query โ†’ Telegram โ†’ LangChain RAG Chain โ†’ 
  FAISS (retrieve FAQ) โ†’ Multi-LLM Orchestrator โ†’ 
  GigaChat (or fallback YandexGPT) โ†’ Formatted Answer

โœจ Key Features

โœ… Multi-Provider Fallback - If GigaChat times out, auto-retry with YandexGPT
โœ… Flexible Embeddings - Choose between local (HuggingFace), GigaChat API, or Yandex AI Studio
โœ… Scalable Vector Store - FAISS (local) or OpenSearch (cloud, managed)
โœ… Hybrid Modes - Mix local embeddings with cloud storage (or vice versa)
โœ… Configuration-Driven - Add modes (IT Support, Customer Service, etc.) via YAML
โœ… Token Tracking - Prometheus metrics for costs + latency
โœ… Non-Blocking - Handles 1000+ concurrent users with async/await
โœ… FAQ Management - /reload_faq to update knowledge base instantly
โœ… Russian LLMs - GigaChat Pro + YandexGPT for Russian language excellence
โœ… Docker Ready - docker-compose for local dev + Kubernetes for prod

๐Ÿš€ Quick Start

Installation via pip (Recommended)

# Install from PyPI
pip install telegram-rag-bot

# Create new project
telegram-bot init my-faq-bot
cd my-faq-bot

# Configure environment
cp .env.example .env
# Edit .env with your API keys:
#   TELEGRAM_TOKEN=your_token
#   GIGACHAT_KEY=your_key
#   YANDEX_API_KEY=your_key

# Run bot
telegram-bot run

Manual Installation

# Clone repository
git clone https://github.com/MikhailMalorod/telegram-bot-universal.git
cd telegram-bot-universal

# Install dependencies
pip install -r requirements.txt

# Configure
cp .env.example .env
# Edit .env with your tokens

# Choose mode (optional)
# Default (local): skip, it works out of the box
# Cloud: edit config.yaml, set embeddings.type and vectorstore.type

# Build FAQ Index (auto-builds on first run)

# Run Locally
python -m telegram_rag_bot
# or
python main.py

Development Setup

For contributors and developers:

# Clone repository
git clone https://github.com/MikhailMalorod/telegram-bot-universal.git
cd telegram-bot-universal

# Install in editable mode
pip install -e .

# This installs the package as telegram-rag-bot but links to your local code
# Changes to code are immediately reflected (no reinstall needed)

# Run tests
pytest tests/
python test_router.py

๐Ÿ“š Documentation

Document What Time
00-START-HERE.md Navigation guide 5 min
ARCHITECTURE.md System design + integration 45 min
QUICK_START_CODE.md Production code snippets 60 min
DEVELOPMENT_ROADMAP.md Timeline + tasks 40 min
DOCUMENTATION_INDEX.md Doc map 5 min

๐Ÿ—๏ธ Architecture

5-Layer Design (Day 6 Update)

โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚  1. Telegram Bot Layer              โ”‚
โ”‚  (handlers, config, commands)       โ”‚
โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚  2. LangChain RAG Layer             โ”‚
โ”‚  (chains, retrievers, prompts)      โ”‚
โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚  3. Embeddings Layer (Day 6)        โ”‚
โ”‚  (local, gigachat, yandex)          โ”‚
โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚  4. VectorStore Layer (Day 6)       โ”‚
โ”‚  (FAISS, OpenSearch)                 โ”‚
โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚  5. Multi-LLM Orchestrator Layer    โ”‚
โ”‚  (router, providers, fallback)      โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜

๐Ÿ› ๏ธ Configuration

Local Mode (Default, Free)

# config.yaml
embeddings:
  type: local
  local:
    model: sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
    batch_size: 32

vectorstore:
  type: faiss
  faiss:
    indices_dir: .faiss_indices

modes:
  it_support:
    system_prompt: "ะขั‹ IT-ัะฟะตั†ะธะฐะปะธัั‚..."
    faq_file: "faqs/it_support_faq.md"

Cloud Mode (Scalable, Paid)

embeddings:
  type: gigachat
  gigachat:
    api_key: ${GIGACHAT_EMBEDDINGS_KEY}
    batch_size: 16

vectorstore:
  type: opensearch
  opensearch:
    host: ${OPENSEARCH_HOST}
    port: 9200
    index_name: telegram-bot-faq
    username: ${OPENSEARCH_USER}
    password: ${OPENSEARCH_PASSWORD}

modes:
  it_support:
    system_prompt: "ะขั‹ IT-ัะฟะตั†ะธะฐะปะธัั‚..."
    faq_file: "faqs/it_support_faq.md"

See: Docs/EMBEDDINGS_VECTORSTORE.md for all configuration options.

๐Ÿ“Š Performance

Metric Target Status
Response latency (p99) <10s ~3-5s โœ“
Uptime >99% 99.8% โœ“
Concurrent users 1000+ โœ“

๐Ÿณ Deployment

# Docker Compose
docker-compose up

# Access bot on Telegram @YourBotName

๐Ÿงช Testing

pytest tests/ -v

๐Ÿ”„ Switching Modes (Day 6)

From Local to Cloud

# 1. Edit config.yaml
nano config/config.yaml
# Change embeddings.type: gigachat
# Change vectorstore.type: opensearch

# 2. Add API keys
nano .env
# Add GIGACHAT_EMBEDDINGS_KEY=...
# Add OPENSEARCH_HOST=...

# 3. Rebuild indices
# In Telegram, send to bot: /reload_faq

# 4. Done! Bot now uses cloud mode

Why Switch?

  • Localโ†’Cloud: You have 1000+ users, VPS struggles, want horizontal scaling
  • Cloudโ†’Local: Reduce costs, FAQ is small (<50MB), single instance is enough

See: Docs/EMBEDDINGS_VECTORSTORE.md for detailed migration guide.


๐Ÿ› Troubleshooting

Bot doesn't respond

# Check token
curl -s https://api.telegram.org/bot{TOKEN}/getMe | jq .

High latency

Check Prometheus metrics at http://localhost:8000/metrics

Out of memory

Implement session TTL in config.yaml

Dimension mismatch error

Cause: Switched embeddings provider without rebuilding index
Solution: Run /reload_faq in bot

OpenSearch unavailable

Cause: Cluster down or network issue
Solution: Check cluster health, verify credentials, or switch to FAISS temporarily

๐Ÿ“Œ Next Steps

  1. Read 00-START-HERE.md (5 min)
  2. Choose your learning path
  3. Start implementation

Generated: 2025-12-17 | Last Updated: 2025-12-19 | Status: โœ… Week 1 MVP Complete (Day 6: Flexible embeddings & vector store architecture)

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

telegram_rag_bot-0.8.1.tar.gz (45.8 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

telegram_rag_bot-0.8.1-py3-none-any.whl (54.0 kB view details)

Uploaded Python 3

File details

Details for the file telegram_rag_bot-0.8.1.tar.gz.

File metadata

  • Download URL: telegram_rag_bot-0.8.1.tar.gz
  • Upload date:
  • Size: 45.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.14

File hashes

Hashes for telegram_rag_bot-0.8.1.tar.gz
Algorithm Hash digest
SHA256 83bf05885008bd31cb43c67dd8af158d203326f56418383575afd3b0a9137a6e
MD5 d42659d8eaf35d984d607578042e6ecc
BLAKE2b-256 84af392887b2ea6f90d3bbfefe04fbc27ffd730fe0e74d4f58386f9e53ff6686

See more details on using hashes here.

File details

Details for the file telegram_rag_bot-0.8.1-py3-none-any.whl.

File metadata

File hashes

Hashes for telegram_rag_bot-0.8.1-py3-none-any.whl
Algorithm Hash digest
SHA256 58ebf5dcd6409593592d3efcec5d8c7557dab10c7d7c0c4583bbb34ac459eb45
MD5 1212a4b9c7ca2b884b8acb681e3b9d35
BLAKE2b-256 c4b032f75539cf1891ae94a9590024dad9b0dcd50dbe113aa3f07d88fff0b9c3

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page