BERT-based domain classifier for Denes chatbot routing (CPU-optimized)

Denes Router Classifier

BERT-based domain classifier for intelligent chatbot routing (CPU-optimized)

A lightweight Python package that uses a fine-tuned multilingual BERT model to classify user queries into three domains for the Denes chatbot system:

  • Web_Search: Explicit search requests (requires Tavily API)
  • TI: Technical support queries (internal routing)
  • Generic: General knowledge questions (LLM-answerable, no web search needed)

Features

  • Fast CPU Inference: 10-50ms latency on modern CPUs (MacBook Air M3: ~20ms)
  • Singleton Pattern: Model loaded once and cached for all subsequent calls
  • Multi-language Support: Based on google-bert/bert-base-multilingual-uncased
  • High Accuracy: 100% accuracy on a synthetic held-out test set of 230 examples (1500+ training examples)
  • Zero Network Overhead: Direct library integration (not a microservice)
  • Pydantic Types: Fully typed API with Pydantic schemas
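
The singleton behaviour listed above can be illustrated with a minimal sketch. The package's actual loader is not shown in this README, so `get_model` and the stub it returns are hypothetical stand-ins; the only point is that the expensive load runs once and later calls reuse the cached object.

```python
from functools import lru_cache

LOAD_COUNT = 0  # counts how many times the stand-in loader actually runs


@lru_cache(maxsize=1)
def get_model() -> dict:
    """Hypothetical loader: runs once, later calls return the cached object."""
    global LOAD_COUNT
    LOAD_COUNT += 1
    # Stand-in for an expensive step such as reading model weights from disk.
    return {"name": "stub-model"}


first = get_model()
second = get_model()
print(first is second)  # True
print(LOAD_COUNT)       # 1
```

`functools.lru_cache` is one common way to get this effect; a module-level variable guarded by a lock would work as well.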

Installation

From Local Directory (Development)

cd ../denes-backend-python  # Navigate to your backend project
uv add ../denes-router-trainning/denes-router-classifier

From Git (Future)

uv add git+https://github.com/yourusername/denes-router-classifier.git

Quick Start

CLI Testing (Without Installation)

Test the package directly from the command line:

# Interactive mode (recommended for testing)
python classify_cli.py interactive

# Classify a single query
python classify_cli.py classify "Busca el clima en Asunción"

# With custom threshold
python classify_cli.py classify "No tengo internet" --threshold 0.5

# Multi-label classification
python classify_cli.py classify "Busca ayuda para mi PC" --multi-label

# Batch classification from file
printf '%s\n' "Busca el clima" "No tengo internet" "¿Qué es Python?" > queries.txt
python classify_cli.py batch queries.txt

# Show version and model info
python classify_cli.py version

Interactive Mode Example:

$ python classify_cli.py interactive

Denes Router Classifier v0.1.0
Threshold: 0.7 | Type 'exit' or 'quit' to stop

Loading model... ✓ (662ms)

Query: Busca el clima en Asunción
  → Web_Search (97.8%, 36.2ms)

Query: No tengo internet
  → TI (99.8%, 35.1ms)

Query: exit

Goodbye!
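
For readers who want to build a similar command-line wrapper, the subcommands and flags shown above could be parsed roughly as follows. This is a sketch with `argparse`, not the contents of classify_cli.py (which this README does not show); `build_parser` is a hypothetical name and the 0.7 default is taken from the threshold printed in the interactive banner.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the commands shown in this README."""
    parser = argparse.ArgumentParser(prog="classify_cli.py")
    sub = parser.add_subparsers(dest="command", required=True)

    sub.add_parser("interactive", help="prompt for queries until exit/quit")

    classify = sub.add_parser("classify", help="classify a single query")
    classify.add_argument("query")
    classify.add_argument("--threshold", type=float, default=0.7)
    classify.add_argument("--multi-label", action="store_true")

    batch = sub.add_parser("batch", help="classify one query per line of a file")
    batch.add_argument("path")

    sub.add_parser("version", help="show version and model info")
    return parser


args = build_parser().parse_args(
    ["classify", "No tengo internet", "--threshold", "0.5"]
)
print(args.command, args.threshold, args.multi_label)  # classify 0.5 False
```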

Single Classification

from denes_router_classifier import classify_domain

# Explicit search query
result = classify_domain("Busca el clima en Asunción")
print(result.primary)       # "Web_Search"
print(result.confidence)    # 0.9984
print(result.includes_generic)  # False

# Technical support query
result = classify_domain("No tengo internet")
print(result.primary)       # "TI"
print(result.confidence)    # 0.9964

# General knowledge question
result = classify_domain("¿Cuál es la capital de Francia?")
print(result.primary)       # "Generic"
print(result.confidence)    # 0.92

Multi-label Classification

# Return all domains above threshold
result = classify_domain(
    "Busca ayuda para mi PC",
    threshold=0.5,
    multi_label=True
)

print(result.primary)  # "Web_Search"
print(result.all_predictions)
# [
#   {"domain": "Web_Search", "confidence": 0.78},
#   {"domain": "TI", "confidence": 0.65}
# ]
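
The selection step behind this output can be sketched in plain Python. This is an illustration under assumptions, not the package's code: `select_domains` is a hypothetical helper, the Generic score of 0.12 is made up for the example, and the scores are treated as independent per-label values because the two confidences shown above sum to more than 1.

```python
def select_domains(scores: dict[str, float], threshold: float = 0.7) -> dict:
    """Pick the top domain and keep every domain at or above the threshold."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    primary, confidence = ranked[0]
    return {
        "primary": primary,
        "confidence": confidence,
        # Mirrors the documented meaning: True if confidence < threshold.
        "includes_generic": confidence < threshold,
        "all_predictions": [
            {"domain": domain, "confidence": score}
            for domain, score in ranked
            if score >= threshold
        ],
    }


# Web_Search and TI scores come from the example above; Generic is assumed.
scores = {"Web_Search": 0.78, "TI": 0.65, "Generic": 0.12}
selected = select_domains(scores, threshold=0.5)
print(selected["primary"])                                  # Web_Search
print([p["domain"] for p in selected["all_predictions"]])   # ['Web_Search', 'TI']
```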

Batch Classification

from denes_router_classifier import classify_batch

texts = [
    "Busca el clima",
    "No tengo internet",
    "¿Cuál es la capital de Francia?"
]

results = classify_batch(texts, batch_size=32)

for text, result in zip(texts, results):
    print(f"{text} → {result.primary} ({result.confidence:.2%})")

# Output:
# Busca el clima → Web_Search (99.84%)
# No tengo internet → TI (99.64%)
# ¿Cuál es la capital de Francia? → Generic (92.00%)
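
How `batch_size` might split the input is sketched below. The README does not show how classify_batch is implemented, so `iter_batches` is a hypothetical helper that only demonstrates the chunking idea: each chunk would be tokenized and run through the model in one forward pass.

```python
from typing import Iterator


def iter_batches(texts: list[str], batch_size: int = 32) -> Iterator[list[str]]:
    """Yield consecutive slices of at most batch_size texts."""
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    for start in range(0, len(texts), batch_size):
        yield texts[start:start + batch_size]


queries = [f"query {i}" for i in range(70)]
batches = list(iter_batches(queries, batch_size=32))
print([len(batch) for batch in batches])  # [32, 32, 6]
```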

Integration with Backend

Replace LLM-based Classification

In your denes-backend-python/src/services/orchestrator.py:

from denes_router_classifier import classify_domain

# BEFORE (expensive LLM call):
# classified_domain = await domain_classifier.classify(
#     message=request.message,
#     current_domain=current_domain,
#     history=history,
#     model_name=resolved_model.name,
# )

# AFTER (fast BERT classification):
result = classify_domain(
    text=request.message,
    threshold=0.7,
    multi_label=False
)
classified_domain = result.primary

logger.info(
    "🎯 Domain classified (BERT)",
    domain=classified_domain,
    confidence=result.confidence,
    includes_generic=result.includes_generic
)

# Optional: Fallback to LLM if confidence is too low
if result.includes_generic:
    logger.warning("Low confidence, consider Generic fallback")
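
If the low-confidence case should trigger the original LLM classifier instead of only logging, the decision could look like the sketch below. `Result` is a stand-in dataclass carrying the three documented fields, and `choose_domain` and the `llm_fallback` callable are hypothetical names; the real orchestrator code may be structured differently.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Result:
    """Stand-in for ClassificationResult with the fields used here."""
    primary: str
    confidence: float
    includes_generic: bool


def choose_domain(result: Result, llm_fallback: Callable[[], str]) -> str:
    """Trust BERT when it is confident; otherwise ask the fallback."""
    if result.includes_generic:
        return llm_fallback()
    return result.primary


confident = Result(primary="TI", confidence=0.99, includes_generic=False)
uncertain = Result(primary="TI", confidence=0.55, includes_generic=True)
print(choose_domain(confident, lambda: "Generic"))  # TI
print(choose_domain(uncertain, lambda: "Generic"))  # Generic
```

Passing the fallback as a callable means the expensive LLM call only happens when it is needed.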

API Reference

classify_domain()

def classify_domain(
    text: str,
    threshold: float = 0.7,
    multi_label: bool = False,
    log_latency: bool = False
) -> ClassificationResult:
    """Classify text into domain(s) for chatbot routing.

    Args:
        text: Input text to classify (user query)
        threshold: Confidence threshold (default: 0.7)
        multi_label: Return all domains above threshold (default: False)
        log_latency: Log inference time (default: False)

    Returns:
        ClassificationResult with primary domain, confidence, and metadata
    """

classify_batch()

def classify_batch(
    texts: list[str],
    threshold: float = 0.7,
    multi_label: bool = False,
    batch_size: int = 32
) -> list[ClassificationResult]:
    """Classify multiple texts in batch for better throughput.

    Args:
        texts: List of input texts
        threshold: Confidence threshold (default: 0.7)
        multi_label: Return all domains above threshold (default: False)
        batch_size: Batch size for inference (default: 32)

    Returns:
        List of ClassificationResult objects
    """

ClassificationResult

class ClassificationResult(BaseModel):
    primary: str               # Primary domain (highest confidence)
    confidence: float          # Confidence score (0-1)
    includes_generic: bool     # True if confidence < threshold
    all_predictions: Optional[list[PredictionDetail]]  # Multi-label mode

Domain Definitions

Web_Search

Explicit search requests with verbs like "busca", "encuentra", "investiga", "search", "find".

Examples:

  • "Busca el clima en Asunción"
  • "Encuentra información sobre Python"
  • "Search for the latest news"

Action: Route to Tavily API for web search (paid)

TI (Technical Support)

Technical support queries about hardware, software, network, or system issues.

Examples:

  • "No tengo internet"
  • "Mi computadora no enciende"
  • "Error al instalar Python"

Action: Route to internal TI support system (free)

Generic

General knowledge questions answerable by the LLM without web search.

Examples:

  • "¿Cuál es la capital de Francia?"
  • "Explica qué es Python"
  • "¿Cómo se dice 'hello' en español?"

Action: Route to LLM (OSS 120B) for direct answer (free)
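
The three actions above amount to a small routing table. The sketch below encodes it as data; `Route`, `ROUTES`, and `route_for` are hypothetical names, and sending an unrecognized label to Generic is an assumption rather than documented behaviour.

```python
from typing import NamedTuple


class Route(NamedTuple):
    target: str
    paid: bool


# Targets and cost flags are taken from the domain definitions above.
ROUTES = {
    "Web_Search": Route(target="Tavily API web search", paid=True),
    "TI": Route(target="internal TI support system", paid=False),
    "Generic": Route(target="LLM direct answer", paid=False),
}


def route_for(domain: str) -> Route:
    """Look up the route; unknown labels fall back to Generic (assumption)."""
    return ROUTES.get(domain, ROUTES["Generic"])


print(route_for("Web_Search").paid)  # True
print(route_for("TI").target)        # internal TI support system
```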

Performance

Latency

  • MacBook Air M3 (CPU): ~20ms per query
  • Target: < 100ms on modern CPUs
  • Throughput: 10+ req/s single-threaded
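
To check these figures on your own hardware, a timing harness along the following lines may help. It is a sketch: `median_latency_ms` is a hypothetical helper and `fake_classifier` is a trivial stand-in, so swap in classify_domain to measure the real model. Warm-up calls are excluded so that the one-time model load does not skew the result.

```python
import statistics
import time
from typing import Callable


def median_latency_ms(
    fn: Callable[[str], object], query: str, runs: int = 20, warmup: int = 3
) -> float:
    """Return the median wall-clock time of fn(query) in milliseconds."""
    for _ in range(warmup):
        fn(query)  # not timed: absorbs one-time costs such as model loading
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn(query)
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


def fake_classifier(text: str) -> str:
    """Trivial stand-in so the sketch runs without the model."""
    return text.lower()


latency = median_latency_ms(fake_classifier, "No tengo internet")
print(f"median latency: {latency:.3f} ms")
```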

Accuracy (Test Set - 230 examples)

  • Accuracy: 100%
  • Macro F1: 1.00
  • Precision: 1.00
  • Recall: 1.00

Note: Test set is synthetic. Real-world performance expected: 90-98% accuracy.
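
Since real-world accuracy is expected to differ from the synthetic test set, it can be useful to recompute macro F1 on your own labelled queries. The sketch below implements the standard definition (per-class F1, averaged without weighting) in plain Python; the four labels are toy data, not project results, and `macro_f1` is a hypothetical helper.

```python
def macro_f1(y_true: list[str], y_pred: list[str]) -> float:
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    f1_scores = []
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Toy labels for illustration only.
y_true = ["TI", "TI", "Generic", "Web_Search"]
y_pred = ["TI", "Generic", "Generic", "Web_Search"]
score = macro_f1(y_true, y_pred)
print(f"{score:.4f}")  # 0.7778
```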

Model Details

  • Base Model: google-bert/bert-base-multilingual-uncased
  • Parameters: 167M
  • Training: 1577 examples (500+ per domain)
  • Validation: 15% held-out test set
  • Training Time: ~20s on MacBook Air M3 (CPU)

Deployment

HuggingFace Hub (Recommended)

The model is too large for GitHub (638MB). Use HuggingFace Hub for automatic download:

# 1. Upload model (one time)
python upload_to_hf.py  # Edit USERNAME first!

# 2. Production deployment
pip install huggingface_hub
huggingface-cli login  # One time only

# 3. In Python: the model downloads automatically on first use
from denes_router_classifier import classify_domain
result = classify_domain("test")  # Downloads model from HF Hub

See: HUGGINGFACE_SETUP.md for complete guide
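
On servers where an interactive `huggingface-cli login` is impractical, the access token is usually supplied through the `HF_TOKEN` environment variable, which huggingface_hub reads, rather than written into code or documentation. The check below is a sketch: `resolve_hf_token` is a hypothetical helper, not part of this package, and it only fails early with a clear message when the variable is missing.

```python
import os
from typing import Mapping, Optional


def resolve_hf_token(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the token from HF_TOKEN, or raise if it is missing or blank."""
    source = os.environ if env is None else env
    token = source.get("HF_TOKEN", "").strip()
    if not token:
        raise RuntimeError(
            "HF_TOKEN is not set; run `huggingface-cli login` or export HF_TOKEN"
        )
    return token


# Explicit mapping with a dummy value so the example does not touch os.environ.
print(resolve_hf_token({"HF_TOKEN": "hf_dummy_value"}))  # hf_dummy_value
```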

Alternative: Git LFS

See DEPLOYMENT.md for Git LFS and other deployment options.

Requirements

  • Python >= 3.10
  • torch >= 2.0
  • transformers >= 4.40
  • pydantic >= 2.0

Development

Testing

cd denes-router-classifier
uv sync --dev
uv run pytest tests/

Linting

uv run ruff check src/
uv run ruff format src/

Cost Savings

By using BERT classification instead of LLM for routing:

  • Fewer paid searches: 80%+ of queries are classified as Generic, which incurs no Tavily API cost
  • Fast inference: No waiting for LLM response for routing
  • Single service: No microservice deployment complexity

License

MIT License

Contributing

This package is part of the Denes chatbot training repository: denes-router-trainning

For questions or issues, please open an issue in the main repository.

Changelog

0.1.0 (2025-11-13)

  • Initial release
  • BERT multilingual classifier with 3 domains
  • CPU-optimized inference with singleton pattern
  • Batch classification support
  • Full type hints with Pydantic
