BERT-based domain classifier for Denes chatbot routing (CPU-optimized)
Denes Router Classifier
BERT-based domain classifier for intelligent chatbot routing (CPU-optimized)
A lightweight Python package that uses a fine-tuned BERT multilingual model to classify user queries into three domains for the Denes chatbot system:
- Web_Search: Explicit search requests (requires Tavily API)
- TI: Technical support queries (internal routing)
- Generic: General knowledge questions (LLM-answerable, no web search needed)
Features
- Fast CPU Inference: 10-50ms latency on modern CPUs (MacBook Air M3: ~20ms)
- Singleton Pattern: Model loaded once and cached for all subsequent calls
- Multi-language Support: Based on google-bert/bert-base-multilingual-uncased
- High Accuracy: 100% accuracy on held-out test set (1500+ training examples)
- Zero Network Overhead: Direct library integration (not a microservice)
- Pydantic Types: Fully typed API with Pydantic schemas
Installation
From Local Directory (Development)
cd ../denes-backend-python # Navigate to your backend project
uv add ../denes-router-trainning/denes-router-classifier
From Git (Future)
uv add git+https://github.com/yourusername/denes-router-classifier.git
Quick Start
CLI Testing (Without Installation)
Test the package directly from the command line:
# Interactive mode (recommended for testing)
python classify_cli.py interactive
# Classify a single query
python classify_cli.py classify "Busca el clima en Asunción"
# With custom threshold
python classify_cli.py classify "No tengo internet" --threshold 0.5
# Multi-label classification
python classify_cli.py classify "Busca ayuda para mi PC" --multi-label
# Batch classification from file
printf 'Busca el clima\nNo tengo internet\n¿Qué es Python?\n' > queries.txt
python classify_cli.py batch queries.txt
# Show version and model info
python classify_cli.py version
Interactive Mode Example:
$ python classify_cli.py interactive
Denes Router Classifier v0.1.0
Threshold: 0.7 | Type 'exit' or 'quit' to stop
Loading model... ✓ (662ms)
Query: Busca el clima en Asunción
→ Web_Search (97.8%, 36.2ms)
Query: No tengo internet
→ TI (99.8%, 35.1ms)
Query: exit
Goodbye!
Single Classification
from denes_router_classifier import classify_domain
# Explicit search query
result = classify_domain("Busca el clima en Asunción")
print(result.primary) # "Web_Search"
print(result.confidence) # 0.9984
print(result.includes_generic) # False
# Technical support query
result = classify_domain("No tengo internet")
print(result.primary) # "TI"
print(result.confidence) # 0.9964
# General knowledge question
result = classify_domain("¿Cuál es la capital de Francia?")
print(result.primary) # "Generic"
print(result.confidence) # 0.92
Multi-label Classification
# Return all domains above threshold
result = classify_domain(
"Busca ayuda para mi PC",
threshold=0.5,
multi_label=True
)
print(result.primary) # "Web_Search"
print(result.all_predictions)
# [
# {"domain": "Web_Search", "confidence": 0.78},
# {"domain": "TI", "confidence": 0.65}
# ]
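To make the multi-label behaviour concrete, here is a small sketch of how per-domain scores could be filtered against the threshold. It is an illustration under assumptions, not the package's code: `select_domains` is a hypothetical helper, the comparison is assumed to be inclusive (`>=`), and the `Generic` score is invented for the example.

```python
def select_domains(scores: dict[str, float], threshold: float) -> list[dict]:
    """Return every domain whose score meets the threshold, best first."""
    kept = [
        {"domain": domain, "confidence": score}
        for domain, score in scores.items()
        if score >= threshold  # assumption: inclusive comparison
    ]
    return sorted(kept, key=lambda p: p["confidence"], reverse=True)


# Illustrative scores only; the "Generic" value is invented.
scores = {"TI": 0.65, "Web_Search": 0.78, "Generic": 0.12}
predictions = select_domains(scores, threshold=0.5)
print(predictions)
# [{'domain': 'Web_Search', 'confidence': 0.78}, {'domain': 'TI', 'confidence': 0.65}]
```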
Batch Classification
from denes_router_classifier import classify_batch
texts = [
"Busca el clima",
"No tengo internet",
"¿Cuál es la capital de Francia?"
]
results = classify_batch(texts, batch_size=32)
for text, result in zip(texts, results):
print(f"{text} → {result.primary} ({result.confidence:.2%})")
# Output:
# Busca el clima → Web_Search (99.84%)
# No tengo internet → TI (99.64%)
# ¿Cuál es la capital de Francia? → Generic (92.00%)
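The `batch_size` argument suggests the inputs are processed in fixed-size chunks. The sketch below shows one common way to do that split. `chunked` is a hypothetical helper written for this example and may differ from what `classify_batch` does internally.

```python
from collections.abc import Iterator


def chunked(texts: list[str], batch_size: int) -> Iterator[list[str]]:
    """Yield consecutive slices of at most batch_size texts."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(texts), batch_size):
        yield texts[start:start + batch_size]


queries = [f"query {i}" for i in range(70)]
batch_lengths = [len(batch) for batch in chunked(queries, batch_size=32)]
print(batch_lengths)  # [32, 32, 6]
```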
Integration with Backend
Replace LLM-based Classification
In your denes-backend-python/src/services/orchestrator.py:
from denes_router_classifier import classify_domain
# BEFORE (expensive LLM call):
# classified_domain = await domain_classifier.classify(
# message=request.message,
# current_domain=current_domain,
# history=history,
# model_name=resolved_model.name,
# )
# AFTER (fast BERT classification):
result = classify_domain(
text=request.message,
threshold=0.7,
multi_label=False
)
classified_domain = result.primary
logger.info(
"🎯 Domain classified (BERT)",
domain=classified_domain,
confidence=result.confidence,
includes_generic=result.includes_generic
)
# Optional: Fallback to LLM if confidence is too low
if result.includes_generic:
logger.warning("Low confidence, consider Generic fallback")
API Reference
classify_domain()
def classify_domain(
text: str,
threshold: float = 0.7,
multi_label: bool = False,
log_latency: bool = False
) -> ClassificationResult:
"""Classify text into domain(s) for chatbot routing.
Args:
text: Input text to classify (user query)
threshold: Confidence threshold (default: 0.7)
multi_label: Return all domains above threshold (default: False)
log_latency: Log inference time (default: False)
Returns:
ClassificationResult with primary domain, confidence, and metadata
"""
classify_batch()
def classify_batch(
texts: list[str],
threshold: float = 0.7,
multi_label: bool = False,
batch_size: int = 32
) -> list[ClassificationResult]:
"""Classify multiple texts in batch for better throughput.
Args:
texts: List of input texts
threshold: Confidence threshold (default: 0.7)
multi_label: Return all domains above threshold (default: False)
batch_size: Batch size for inference (default: 32)
Returns:
List of ClassificationResult objects
"""
ClassificationResult
class ClassificationResult(BaseModel):
primary: str # Primary domain (highest confidence)
confidence: float # Confidence score (0-1)
includes_generic: bool # True if confidence < threshold
all_predictions: Optional[list[PredictionDetail]] # Multi-label mode
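The relationship between `confidence`, `threshold`, and `includes_generic` can be sketched as follows. This is a hedged illustration only: it assumes the scores come from a softmax over three logits and that the label order is `Web_Search`, `TI`, `Generic`, neither of which is confirmed here. `summarize` and `LABELS` are hypothetical names.

```python
import math

# Assumption: label order is illustrative, not taken from the model config.
LABELS = ["Web_Search", "TI", "Generic"]


def softmax(logits: list[float]) -> list[float]:
    """Convert raw logits into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def summarize(logits: list[float], threshold: float = 0.7) -> dict:
    """Pick the top domain and flag it when confidence is below threshold."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    confidence = probs[best]
    return {
        "primary": LABELS[best],
        "confidence": confidence,
        "includes_generic": confidence < threshold,
    }


confident = summarize([4.0, 0.5, 0.2])
uncertain = summarize([1.0, 0.9, 0.8])
print(confident["primary"], confident["includes_generic"])  # Web_Search False
print(uncertain["primary"], uncertain["includes_generic"])  # Web_Search True
```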
Domain Definitions
Web_Search
Explicit search requests with verbs like "busca", "encuentra", "investiga", "search", "find".
Examples:
- "Busca el clima en Asunción"
- "Encuentra información sobre Python"
- "Search for the latest news"
Action: Route to Tavily API for web search (paid)
TI (Technical Support)
Technical support queries about hardware, software, network, or system issues.
Examples:
- "No tengo internet"
- "Mi computadora no enciende"
- "Error al instalar Python"
Action: Route to internal TI support system (free)
Generic
General knowledge questions answerable by the LLM without web search.
Examples:
- "¿Cuál es la capital de Francia?"
- "Explica qué es Python"
- "¿Cómo se dice 'hello' en español?"
Action: Route to LLM (OSS 120B) for direct answer (free)
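The three actions above amount to a small dispatch table. The sketch below shows one way a caller might express it. The route names and `StubResult` are hypothetical, and sending low-confidence results to the Generic route is an assumption based on the `includes_generic` flag, not documented behaviour.

```python
from dataclasses import dataclass


@dataclass
class StubResult:
    """Hypothetical stand-in mirroring two ClassificationResult fields."""

    primary: str
    includes_generic: bool


# Route names are placeholders paraphrasing the actions described above.
ROUTES = {
    "Web_Search": "tavily_web_search",
    "TI": "internal_ti_support",
    "Generic": "llm_direct_answer",
}


def choose_route(result: StubResult) -> str:
    """Map a classification result to a route name."""
    # Assumption: a low-confidence result falls back to the Generic route.
    if result.includes_generic:
        return ROUTES["Generic"]
    # Unknown domains also fall back to Generic rather than raising.
    return ROUTES.get(result.primary, ROUTES["Generic"])


print(choose_route(StubResult("Web_Search", False)))  # tavily_web_search
print(choose_route(StubResult("Web_Search", True)))   # llm_direct_answer
```

Falling back to Generic on low confidence avoids paying for a web search when the classifier is unsure. Whether that trade-off is right depends on the deployment.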
Performance
Latency
- MacBook Air M3 (CPU): ~20ms per query
- Target: < 100ms on modern CPUs
- Throughput: 10+ req/s single-threaded
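To check these latency figures on your own hardware, a small timing harness is usually enough. The sketch below uses only the standard library. `measure_latency_ms` and `fake_classifier` are hypothetical names, and the stub would be replaced with `classify_domain` to time the real model.

```python
import statistics
import time
from collections.abc import Callable


def measure_latency_ms(
    fn: Callable[[str], object], text: str, runs: int = 20
) -> dict:
    """Time repeated calls to fn(text) and report latency in milliseconds."""
    fn(text)  # warm-up call (e.g. model load), excluded from the timings
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn(text)
        samples.append((time.perf_counter() - start) * 1000.0)
    return {
        "median_ms": statistics.median(samples),
        "max_ms": max(samples),
        "runs": runs,
    }


def fake_classifier(text: str) -> str:
    """Hypothetical stand-in; swap in classify_domain to time the model."""
    return "Generic"


stats = measure_latency_ms(fake_classifier, "No tengo internet")
print(stats["runs"])  # 20
```

The warm-up call matters here because the first invocation includes the one-time model load and would otherwise skew the numbers.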
Accuracy (Test Set - 230 examples)
- Accuracy: 100%
- Macro F1: 1.00
- Precision: 1.00
- Recall: 1.00
Note: Test set is synthetic. Real-world performance expected: 90-98% accuracy.
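For readers who want to recompute the metrics on their own labelled queries, macro F1 is the unweighted mean of the per-class F1 scores. The following is a plain-Python sketch of that definition. `macro_f1` is a helper written for this example, not part of the package, and the sample labels are invented.

```python
def macro_f1(y_true: list[str], y_pred: list[str]) -> float:
    """Unweighted mean of per-class F1 over all labels seen."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    f1_scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1_scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Invented labels: one TI query is misclassified as Generic.
y_true = ["TI", "TI", "Generic", "Web_Search"]
y_pred = ["TI", "Generic", "Generic", "Web_Search"]
score = macro_f1(y_true, y_pred)
print(round(score, 4))  # 0.7778
```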
Model Details
- Base Model: google-bert/bert-base-multilingual-uncased
- Parameters: 167M
- Training: 1577 examples (500+ per domain)
- Validation: 15% held-out test set
- Training Time: ~20s on MacBook Air M3 (CPU)
Deployment
HuggingFace Hub (Recommended)
The model is too large for GitHub (638MB). Use HuggingFace Hub for automatic download:
# 1. Upload model (one time)
python upload_to_hf.py # Edit USERNAME first!
# 2. Production deployment
pip install huggingface_hub
huggingface-cli login # One time only
# Model downloads automatically on first use
from denes_router_classifier import classify_domain
result = classify_domain("test") # Downloads model from HF Hub
See HUGGINGFACE_SETUP.md for the complete guide.
Alternative: Git LFS
See DEPLOYMENT.md for Git LFS and other deployment options.
Requirements
- Python >= 3.10
- torch >= 2.0
- transformers >= 4.40
- pydantic >= 2.0
Development
Testing
cd denes-router-classifier
uv sync --dev
uv run pytest tests/
Linting
uv run ruff check src/
uv run ruff format src/
Cost Savings
By using BERT classification instead of LLM for routing:
- 80%+ of queries are classified as Generic and incur no Tavily API cost
- Fast inference: No waiting for LLM response for routing
- Single service: No microservice deployment complexity
License
MIT License
Contributing
This package is part of the Denes chatbot training repository: denes-router-trainning
For questions or issues, please open an issue in the main repository.
Changelog
0.1.0 (2025-11-13)
- Initial release
- BERT multilingual classifier with 3 domains
- CPU-optimized inference with singleton pattern
- Batch classification support
- Full type hints with Pydantic
File details
Details for the file denes_router_classifier-0.1.0.tar.gz.
File metadata
- Download URL: denes_router_classifier-0.1.0.tar.gz
- Upload date:
- Size: 24.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | caec36ca1c9119058856192f43d800ebac0344db800f654cc56173461cbfe4e6 |
| MD5 | acb347880480f0a34aa4991dc7df7a41 |
| BLAKE2b-256 | fa5efeb44823e207a59d278988a4299d109676c6d8ad1d92ac51ff5e09d025fa |
File details
Details for the file denes_router_classifier-0.1.0-py3-none-any.whl.
File metadata
- Download URL: denes_router_classifier-0.1.0-py3-none-any.whl
- Upload date:
- Size: 11.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 2a5877dba9487c3a492644bb996a436320611f6d870e0a7e36ff32a3cfd92a2a |
| MD5 | 130b44d470de6dae433e71e69ca19847 |
| BLAKE2b-256 | b903ed765a3866048a9e9fc6ed605c7af5dc5e89465e32405aa7cd27f3eba103 |