End of utterance detection for LiveKit Agents

Namo Turn Detector Plugin for LiveKit Agents

Turn detection plugin for LiveKit Agents using Namo Turn Detector models.

Installation

pip install livekit-plugins-namo-turn-detector

Features

  • Single-Language Models: Memory-efficient models for Vietnamese, English, Chinese (NEW ✨)
  • Multilingual Support: 23 languages with a unified multilingual model
  • High Accuracy: Language-specific models score higher than the baseline LiveKit models on most of the benchmark samples in this README
  • Fast & Efficient: Single-language models use about 66% less memory (~200MB vs ~600MB for the three-language model)
  • Async API: Built on LiveKit's inference runner for optimal performance
  • Easy Integration: Drop-in replacement for existing turn detectors

Quick Start

🎯 Single-Language Models (Recommended for Production)

Most memory-efficient option - loads only one language model (~200MB):

Vietnamese Only

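from livekit import agents
from livekit.plugins import namo_turn_detector

async def entrypoint(ctx: agents.JobContext):
    # Load only the Vietnamese model (~200MB)
    model = namo_turn_detector.vi_model.VietnameseModel(threshold=0.7)
    # chat_ctx is the conversation's chat context, built elsewhere in your agent
    prob = await model.predict_end_of_turn(chat_ctx)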

English Only

from livekit import agents
from livekit.plugins import namo_turn_detector

async def entrypoint(ctx: agents.JobContext):
    model = namo_turn_detector.en_model.EnglishModel(threshold=0.7)
    prob = await model.predict_end_of_turn(chat_ctx)

Chinese Only

from livekit import agents
from livekit.plugins import namo_turn_detector

async def entrypoint(ctx: agents.JobContext):
    model = namo_turn_detector.zh_model.ChineseModel(threshold=0.7)
    prob = await model.predict_end_of_turn(chat_ctx)

Benefits:

  • 66% less memory (~200MB vs ~600MB)
  • 3x faster initialization
  • Highest accuracy for the language
  • ✅ Best for single-language production apps
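The 66% figure follows directly from the approximate sizes quoted in this README. As a quick check (the dictionary keys and the helper name `memory_saving` are illustrative, not part of the plugin):

```python
# Approximate memory footprints quoted in this README, in megabytes.
MEMORY_MB = {
    "single_language": 200,    # VietnameseModel / EnglishModel / ChineseModel
    "language_specific": 600,  # LanguageSpecificModel (en + vi + zh)
    "multilingual": 400,       # MultilingualModel
}


def memory_saving(smaller_mb: float, larger_mb: float) -> float:
    """Return the fraction of memory saved by using the smaller model."""
    return 1.0 - smaller_mb / larger_mb


vs_language_specific = memory_saving(
    MEMORY_MB["single_language"], MEMORY_MB["language_specific"]
)
vs_multilingual = memory_saving(
    MEMORY_MB["single_language"], MEMORY_MB["multilingual"]
)

print(f"vs LanguageSpecificModel: {vs_language_specific:.1%}")
print(f"vs MultilingualModel:     {vs_multilingual:.1%}")
```

The saving is about two thirds relative to LanguageSpecificModel and one half relative to MultilingualModel, so the 66% claim applies to the first comparison only.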

Multi-Language Model (EN/VI/ZH Switching)

Use when you need to switch between English, Vietnamese, or Chinese:

from livekit import agents
from livekit.plugins.namo_turn_detector.language_specific import LanguageSpecificModel

# Loads all 3 models (en, vi, zh) - ~600MB
async def entrypoint(ctx: agents.JobContext):
    model = LanguageSpecificModel(language="vi", threshold=0.7)
    prob = await model.predict_end_of_turn(chat_ctx)

Multilingual Model (23 Languages)

Use when you need support for many languages:

from livekit import agents
from livekit.plugins.namo_turn_detector.multilingual import MultilingualModel

async def entrypoint(ctx: agents.JobContext):
    model = MultilingualModel(threshold=0.7)
    prob = await model.predict_end_of_turn(chat_ctx)
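All of the snippets above end with a probability in `prob`. A minimal sketch of turning that probability into a yes/no decision follows. It runs without the plugin installed: `StubTurnModel` is a hypothetical stand-in that only mimics the `predict_end_of_turn` method and `threshold` property described in this README, and comparing with `>=` is an assumption about how the threshold is applied.

```python
import asyncio


class StubTurnModel:
    """Hypothetical stand-in for a Namo model; returns a fixed probability."""

    def __init__(self, fixed_prob: float, threshold: float = 0.7) -> None:
        self.fixed_prob = fixed_prob
        self.threshold = threshold

    async def predict_end_of_turn(self, chat_ctx, timeout: float = 10.0) -> float:
        await asyncio.sleep(0)  # yield to the event loop like a real async call
        return self.fixed_prob


async def user_finished_speaking(model, chat_ctx) -> bool:
    """Await the model's probability and compare it to the model's threshold."""
    prob = await model.predict_end_of_turn(chat_ctx)
    return prob >= model.threshold  # assumption: inclusive comparison


if __name__ == "__main__":
    # chat_ctx is unused by the stub, so None is passed for illustration.
    done = asyncio.run(user_finished_speaking(StubTurnModel(0.9857), None))
    print("end of turn:", done)
```

With a real model, the same helper would be awaited inside `entrypoint` in place of the bare `predict_end_of_turn` call.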

Benchmark Results

Comparison across English, Vietnamese, and Chinese:

English Performance

Sample: "Hello, how are you?"
  • Namo Multilingual:     0.8757 (16ms) - EOT: True
  • Namo English-Specific: 0.0002 (13ms) - EOT: False
  • LiveKit Multilingual:  0.2838 (33ms) - EOT: True
  • LiveKit English:       0.4596 (4ms)  - EOT: True

Sample: "What's the weather like today?"
  • Namo Multilingual:     0.8032 (15ms) - EOT: True
  • Namo English-Specific: 0.9999 (9ms)  - EOT: True ⭐
  • LiveKit Multilingual:  0.7799 (27ms) - EOT: True
  • LiveKit English:       0.9409 (3ms)  - EOT: True

Vietnamese Performance

Sample: "Xin chào, bạn khỏe không?" (Hello, how are you?)
  • Namo Multilingual:        0.8651 (25ms) - EOT: True
  • Namo Vietnamese-Specific: 0.9857 (36ms) - EOT: True ⭐
  • LiveKit Multilingual:     0.0322 (20ms) - EOT: False

Sample: "Thời tiết hôm nay thế nào?" (What's the weather today?)
  • Namo Multilingual:        0.5168 (27ms) - EOT: False
  • Namo Vietnamese-Specific: 0.9952 (4ms)  - EOT: True ⭐
  • LiveKit Multilingual:     0.2988 (22ms) - EOT: False

Sample: "Vay ở đâu" (Where to borrow) - Incomplete phrase
  • Namo Multilingual:        0.6599 (20ms) - EOT: False
  • Namo Vietnamese-Specific: 0.9875 (10ms) - EOT: True ⭐
  • LiveKit Multilingual:     0.5106 (25ms) - EOT: False

Chinese Performance

Sample: "你好,你好吗?" (Hello, how are you?)
  • Namo Multilingual:     0.6525 (30ms) - EOT: False
  • Namo Chinese-Specific: 0.8777 (16ms) - EOT: True ⭐
  • LiveKit Multilingual:  0.8520 (20ms) - EOT: True

Sample: "今天天气怎么样?" (What's the weather today?)
  • Namo Multilingual:     0.6818 (18ms) - EOT: False
  • Namo Chinese-Specific: 0.9090 (34ms) - EOT: True ⭐
  • LiveKit Multilingual:  0.9707 (20ms) - EOT: True

Key Insights:

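  • Language-specific models score 0.8777 or higher on 6 of 7 samples; the exception is "Hello, how are you?", which the English-specific model scores at 0.0002
  • Namo Multilingual scores between 0.52 and 0.88 on every sample, but is marked as end of turn on only 3 of 7
  • Inference speed is competitive: Namo predictions take 4-36ms in these samples, most of them within 10-30ms
  • On Vietnamese, the language-specific model scores above 0.98 on all three samples, while LiveKit Multilingual scores 0.03-0.51 and marks none of them as end of turn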
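The EOT column for the two Namo models can be reproduced from the scores above. The sketch below assumes the benchmark used the default threshold of 0.7 with an inclusive comparison; the README does not state this, but every Namo row is consistent with it. The LiveKit rows are left out because they do not follow a single 0.7 cutoff.

```python
THRESHOLD = 0.7  # assumption: the default threshold from the Quick Start examples

# (language, sample, Namo Multilingual score, Namo language-specific score)
NAMO_SCORES = [
    ("en", "Hello, how are you?", 0.8757, 0.0002),
    ("en", "What's the weather like today?", 0.8032, 0.9999),
    ("vi", "Xin chào, bạn khỏe không?", 0.8651, 0.9857),
    ("vi", "Thời tiết hôm nay thế nào?", 0.5168, 0.9952),
    ("vi", "Vay ở đâu", 0.6599, 0.9875),
    ("zh", "你好,你好吗?", 0.6525, 0.8777),
    ("zh", "今天天气怎么样?", 0.6818, 0.9090),
]


def is_end_of_turn(score: float, threshold: float = THRESHOLD) -> bool:
    """Apply the threshold to a single end-of-turn probability."""
    return score >= threshold


multilingual_flags = [is_end_of_turn(multi) for _, _, multi, _ in NAMO_SCORES]
specific_flags = [is_end_of_turn(spec) for _, _, _, spec in NAMO_SCORES]

for (lang, sample, multi, spec), m_flag, s_flag in zip(
    NAMO_SCORES, multilingual_flags, specific_flags
):
    print(f"[{lang}] {sample}: multilingual={m_flag} specific={s_flag}")

print("Multilingual EOT count:", sum(multilingual_flags))
print("Language-specific EOT count:", sum(specific_flags))
```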

API Reference

Single-Language Models (NEW ✨)

VietnameseModel

from livekit.plugins import namo_turn_detector

model = namo_turn_detector.vi_model.VietnameseModel(threshold=0.7)  # threshold: float, default 0.7

EnglishModel

from livekit.plugins import namo_turn_detector

model = namo_turn_detector.en_model.EnglishModel(threshold=0.7)  # threshold: float, default 0.7

ChineseModel

from livekit.plugins import namo_turn_detector

model = namo_turn_detector.zh_model.ChineseModel(threshold=0.7)  # threshold: float, default 0.7

Parameters:

  • threshold: Detection threshold (0.0-1.0), default 0.7

Properties:

  • language - Language code ("vi", "en", or "zh")
  • model - Model name (e.g., "namo-vi")
  • threshold - Current detection threshold

Methods:

  • predict_end_of_turn(chat_ctx, timeout=10.0) -> float - Returns probability (0.0-1.0)
  • unlikely_threshold(language) -> float - Get model's threshold for language

Memory Usage: ~200MB per model (loads only one language)
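`predict_end_of_turn` accepts a `timeout`, but this README does not say what happens when it expires. One defensive option is to enforce a deadline in the caller and fall back to a fixed probability. The sketch below does that with `asyncio.wait_for`; `SlowStubModel` and `predict_or_fallback` are hypothetical names used for illustration, and the stub only imitates the method signature listed above.

```python
import asyncio


class SlowStubModel:
    """Hypothetical stand-in whose inference takes `delay` seconds."""

    def __init__(self, delay: float, prob: float) -> None:
        self.delay = delay
        self.prob = prob

    async def predict_end_of_turn(self, chat_ctx, timeout: float = 10.0) -> float:
        await asyncio.sleep(self.delay)  # simulate inference latency
        return self.prob


async def predict_or_fallback(
    model, chat_ctx, timeout: float = 10.0, fallback: float = 0.0
) -> float:
    """Return the model's probability, or `fallback` if the deadline passes."""
    try:
        return await asyncio.wait_for(
            model.predict_end_of_turn(chat_ctx, timeout=timeout), timeout=timeout
        )
    except asyncio.TimeoutError:
        # A fallback of 0.0 means "assume the user is still speaking".
        return fallback


if __name__ == "__main__":
    fast = asyncio.run(predict_or_fallback(SlowStubModel(0.0, 0.9), None, timeout=1.0))
    slow = asyncio.run(predict_or_fallback(SlowStubModel(0.5, 0.9), None, timeout=0.01))
    print("fast model:", fast)
    print("slow model:", slow)
```

Falling back to 0.0 keeps the agent waiting rather than interrupting; a different fallback may suit applications that prefer to respond quickly.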


LanguageSpecificModel

LanguageSpecificModel(language: str, threshold: float = 0.7)

Parameters:

  • language: Language code ("en", "vi", "zh")
  • threshold: Detection threshold (0.0-1.0)

Methods:

  • predict_end_of_turn(chat_ctx, timeout=10.0) -> float - Returns probability (0.0-1.0)
  • unlikely_threshold(language) -> float - Get model's threshold for language

Memory Usage: ~600MB (loads all 3 models: en, vi, zh)


MultilingualModel

MultilingualModel(threshold: float = 0.7)

Methods:

  • predict_end_of_turn(chat_ctx, timeout=10.0) -> float - Returns probability (0.0-1.0)
  • unlikely_threshold(language) -> float - Get model's threshold for language

Memory Usage: ~400MB (single multilingual model for 23 languages)

Pre-download Models

Run your agent script (main.py here) with the download-files command before starting the agent:

python main.py download-files

Model Comparison

Choose the right model for your use case:

| Model | Languages | Memory | Init Speed | Accuracy | Best For |
|---|---|---|---|---|---|
| VietnameseModel | Vietnamese | ~200MB | ⚡⚡⚡ Fast | ⭐⭐⭐ Highest | Vietnamese-only apps |
| EnglishModel | English | ~200MB | ⚡⚡⚡ Fast | ⭐⭐⭐ Highest | English-only apps |
| ChineseModel | Chinese | ~200MB | ⚡⚡⚡ Fast | ⭐⭐⭐ Highest | Chinese-only apps |
| LanguageSpecificModel | EN, VI, ZH | ~600MB | ⚡ Slow | ⭐⭐⭐ High | Multi-lang apps (3 langs) |
| MultilingualModel | 23 languages | ~400MB | ⚡⚡ Medium | ⭐⭐ Good | Global apps (many langs) |

Recommendation: Use single-language models (VietnameseModel, EnglishModel, ChineseModel) for production apps serving one language. They provide 66% memory savings and 3x faster initialization.
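The comparison table can be read as a simple decision rule. The sketch below encodes it as a helper that returns the class name to use; `recommend_model` is a hypothetical function, not part of the plugin, and it does not check whether a language outside en/vi/zh is among the 23 that MultilingualModel supports.

```python
from typing import Iterable

# Class names taken from the comparison table above.
SINGLE_LANGUAGE_MODELS = {
    "vi": "VietnameseModel",
    "en": "EnglishModel",
    "zh": "ChineseModel",
}


def recommend_model(languages: Iterable[str]) -> str:
    """Pick a model class name for the set of language codes an app serves."""
    codes = set(languages)
    if not codes:
        raise ValueError("at least one language code is required")
    if len(codes) == 1:
        (code,) = codes
        # One of en/vi/zh: use the dedicated ~200MB model.
        return SINGLE_LANGUAGE_MODELS.get(code, "MultilingualModel")
    if codes <= set(SINGLE_LANGUAGE_MODELS):
        # Only en/vi/zh, but more than one: the ~600MB switching model.
        return "LanguageSpecificModel"
    # Anything else falls back to the ~400MB multilingual model.
    return "MultilingualModel"


if __name__ == "__main__":
    for langs in (["vi"], ["en", "zh"], ["en", "fr"]):
        print(langs, "->", recommend_model(langs))
```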


Supported Languages

  • Single-Language Models: Vietnamese (vi), English (en), Chinese (zh)

  • Multi-Language Model (LanguageSpecificModel): English (en), Vietnamese (vi), Chinese (zh)

  • Multilingual Model (23 languages): Arabic, Bengali, Chinese, Danish, Dutch, English, Finnish, French, German, Hindi, Indonesian, Italian, Japanese, Korean, Marathi, Norwegian, Polish, Portuguese, Russian, Spanish, Turkish, Ukrainian, Vietnamese

License

Apache-2.0

Credits

Citation

@software{namo2025,
  title = {Namo Turn Detector v1: Semantic Turn Detection for Conversational AI},
  author = {VideoSDK Team},
  year = {2025},
  publisher = {Hugging Face},
  url = {https://huggingface.co/collections/videosdk-live/namo-turn-detector-v1-68d52c0564d2164e9d17ca97}
}
