End of utterance detection for LiveKit Agents

Namo Turn Detector Plugin for LiveKit Agents

Turn detection plugin for LiveKit Agents using Namo Turn Detector models.

Installation

pip install livekit-plugins-namo-turn-detector

Features

  • Single-Language Models: Memory-efficient models for Vietnamese, English, Chinese (NEW ✨)
  • Multilingual Support: 23 languages with a unified multilingual model
  • High Accuracy: Language-specific models outperform baseline models
  • Fast & Efficient: Optimized inference; a single-language model uses about 66% less memory than loading all three (~200MB vs ~600MB)
  • Async API: Built on LiveKit's inference runner for optimal performance
  • Easy Integration: Drop-in replacement for existing turn detectors

Quick Start

🎯 Single-Language Models (Recommended for Production)

Most memory-efficient option - loads only one language model (~200MB):

Vietnamese Only

from livekit import agents
from livekit.plugins import namo_turn_detector

async def entrypoint(ctx: agents.JobContext):
    # Load only the Vietnamese model (~200MB).
    model = namo_turn_detector.vi_model.VietnameseModel(threshold=0.7)
    # chat_ctx is the session's chat context, built elsewhere in your agent.
    prob = await model.predict_end_of_turn(chat_ctx)

English Only

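from livekit import agents
from livekit.plugins import namo_turn_detector

async def entrypoint(ctx: agents.JobContext):
    # Load only the English model (~200MB).
    model = namo_turn_detector.en_model.EnglishModel(threshold=0.7)
    # chat_ctx is the session's chat context, built elsewhere in your agent.
    prob = await model.predict_end_of_turn(chat_ctx)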

Chinese Only

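from livekit import agents
from livekit.plugins import namo_turn_detector

async def entrypoint(ctx: agents.JobContext):
    # Load only the Chinese model (~200MB).
    model = namo_turn_detector.zh_model.ChineseModel(threshold=0.7)
    # chat_ctx is the session's chat context, built elsewhere in your agent.
    prob = await model.predict_end_of_turn(chat_ctx)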

Benefits:

  • 66% less memory (~200MB vs ~600MB)
  • 3x faster initialization
  • Highest accuracy for the language
  • ✅ Best for single-language production apps
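The 66% figure follows from the approximate sizes quoted above. A quick check, using only those rounded numbers (they are the README's estimates, not measurements taken here):

```python
# Approximate sizes quoted in this README (rounded estimates).
SINGLE_MODEL_MB = 200   # one single-language model
ALL_THREE_MB = 600      # all three models (en, vi, zh) loaded together


def memory_saving(single_mb: float, combined_mb: float) -> float:
    """Fraction of memory saved by loading one model instead of all of them."""
    return 1 - single_mb / combined_mb


saving = memory_saving(SINGLE_MODEL_MB, ALL_THREE_MB)
print(f"{saving:.1%}")  # 66.7%
```

Actual memory use will depend on the runtime and hardware, so treat the result as a rough guide.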

Multi-Language Model (EN/VI/ZH Switching)

Use when you need to switch between English, Vietnamese, or Chinese:

from livekit import agents
from livekit.plugins.namo_turn_detector.language_specific import LanguageSpecificModel

# Loads all 3 models (en, vi, zh) - ~600MB
async def entrypoint(ctx: agents.JobContext):
    model = LanguageSpecificModel(language="vi", threshold=0.7)
    # chat_ctx is the session's chat context, built elsewhere in your agent.
    prob = await model.predict_end_of_turn(chat_ctx)

Multilingual Model (23 Languages)

Use when you need support for many languages:

from livekit import agents
from livekit.plugins.namo_turn_detector.multilingual import MultilingualModel

async def entrypoint(ctx: agents.JobContext):
    model = MultilingualModel(threshold=0.7)
    # chat_ctx is the session's chat context, built elsewhere in your agent.
    prob = await model.predict_end_of_turn(chat_ctx)

Benchmark Results

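Comparison across English, Vietnamese, and Chinese. Each row shows the model's end-of-turn probability, the inference time in parentheses, and the resulting end-of-turn (EOT) decision: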

English Performance

Sample: "Hello, how are you?"
  • Namo Multilingual:     0.8757 (16ms) - EOT: True
  • Namo English-Specific: 0.0002 (13ms) - EOT: False
  • LiveKit Multilingual:  0.2838 (33ms) - EOT: True
  • LiveKit English:       0.4596 (4ms)  - EOT: True

Sample: "What's the weather like today?"
  • Namo Multilingual:     0.8032 (15ms) - EOT: True
  • Namo English-Specific: 0.9999 (9ms)  - EOT: True ⭐
  • LiveKit Multilingual:  0.7799 (27ms) - EOT: True
  • LiveKit English:       0.9409 (3ms)  - EOT: True

Vietnamese Performance

Sample: "Xin chào, bạn khỏe không?" (Hello, how are you?)
  • Namo Multilingual:        0.8651 (25ms) - EOT: True
  • Namo Vietnamese-Specific: 0.9857 (36ms) - EOT: True ⭐
  • LiveKit Multilingual:     0.0322 (20ms) - EOT: False

Sample: "Thời tiết hôm nay thế nào?" (What's the weather today?)
  • Namo Multilingual:        0.5168 (27ms) - EOT: False
  • Namo Vietnamese-Specific: 0.9952 (4ms)  - EOT: True ⭐
  • LiveKit Multilingual:     0.2988 (22ms) - EOT: False

Sample: "Vay ở đâu" (Where to borrow) - Incomplete phrase
  • Namo Multilingual:        0.6599 (20ms) - EOT: False
  • Namo Vietnamese-Specific: 0.9875 (10ms) - EOT: True ⭐
  • LiveKit Multilingual:     0.5106 (25ms) - EOT: False

Chinese Performance

Sample: "你好,你好吗?" (Hello, how are you?)
  • Namo Multilingual:     0.6525 (30ms) - EOT: False
  • Namo Chinese-Specific: 0.8777 (16ms) - EOT: True ⭐
  • LiveKit Multilingual:  0.8520 (20ms) - EOT: True

Sample: "今天天气怎么样?" (What's the weather today?)
  • Namo Multilingual:     0.6818 (18ms) - EOT: False
  • Namo Chinese-Specific: 0.9090 (34ms) - EOT: True ⭐
  • LiveKit Multilingual:  0.9707 (20ms) - EOT: True

Key Insights:

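  • Language-specific models give the highest score on 5 of the 7 samples above; the exceptions are the English-specific model on "Hello, how are you?" (0.0002) and the Chinese-specific model on "今天天气怎么样?" (0.9090 vs 0.9707 for LiveKit Multilingual)
  • Namo Multilingual scores between 0.52 and 0.88 on every sample, across all three languages
  • Inference speed is competitive: Namo predictions take 4-36ms in these samples, typically 10-30ms
  • On the Vietnamese samples, both Namo models assign higher end-of-turn probabilities to the complete sentences than the LiveKit multilingual baseline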
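The EOT flags on the Namo rows can be reproduced from the scores with a simple cutoff. The sketch below assumes the flag is `score >= threshold` with the default threshold of 0.7; the README does not state the exact comparison, so `is_end_of_turn` is illustrative rather than the plugin's implementation. The LiveKit rows are left out because their flags do not follow a single 0.7 cutoff (for example, 0.2838 is marked True), which suggests those models apply their own thresholds.

```python
THRESHOLD = 0.7  # default threshold used throughout this README

# (sample, model, score, EOT flag) copied from the Namo rows above.
NAMO_ROWS = [
    ("Hello, how are you?", "multilingual", 0.8757, True),
    ("Hello, how are you?", "english", 0.0002, False),
    ("What's the weather like today?", "multilingual", 0.8032, True),
    ("What's the weather like today?", "english", 0.9999, True),
    ("Xin chào, bạn khỏe không?", "multilingual", 0.8651, True),
    ("Xin chào, bạn khỏe không?", "vietnamese", 0.9857, True),
    ("Thời tiết hôm nay thế nào?", "multilingual", 0.5168, False),
    ("Thời tiết hôm nay thế nào?", "vietnamese", 0.9952, True),
    ("Vay ở đâu", "multilingual", 0.6599, False),
    ("Vay ở đâu", "vietnamese", 0.9875, True),
    ("你好,你好吗?", "multilingual", 0.6525, False),
    ("你好,你好吗?", "chinese", 0.8777, True),
    ("今天天气怎么样?", "multilingual", 0.6818, False),
    ("今天天气怎么样?", "chinese", 0.9090, True),
]


def is_end_of_turn(score: float, threshold: float = THRESHOLD) -> bool:
    # Assumption: the decision is a plain comparison against the threshold.
    return score >= threshold


# Rows where the assumed rule disagrees with the published flag.
mismatches = [row for row in NAMO_ROWS if is_end_of_turn(row[2]) != row[3]]
print(len(mismatches))  # 0
```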

API Reference

Single-Language Models (NEW ✨)

VietnameseModel

from livekit.plugins import namo_turn_detector

model = namo_turn_detector.vi_model.VietnameseModel(threshold=0.7)  # threshold: float, default 0.7

EnglishModel

from livekit.plugins import namo_turn_detector

model = namo_turn_detector.en_model.EnglishModel(threshold=0.7)  # threshold: float, default 0.7

ChineseModel

from livekit.plugins import namo_turn_detector

model = namo_turn_detector.zh_model.ChineseModel(threshold=0.7)  # threshold: float, default 0.7

Parameters:

  • threshold: Detection threshold (0.0-1.0), default 0.7

Properties:

  • language - Language code ("vi", "en", or "zh")
  • model - Model name (e.g., "namo-vi")
  • threshold - Current detection threshold

Methods:

  • predict_end_of_turn(chat_ctx, timeout=10.0) -> float - Returns probability (0.0-1.0)
  • unlikely_threshold(language) -> float - Get model's threshold for language

Memory Usage: ~200MB per model (loads only one language)
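A minimal sketch of how a caller might turn the returned probability into a decision. `StubModel` is a hypothetical stand-in so the snippet runs without the plugin or model weights; it mirrors only the `threshold` property and the `predict_end_of_turn(chat_ctx, timeout=10.0)` signature listed above. The helper `user_finished_speaking` and its `>=` comparison are assumptions for illustration.

```python
import asyncio


class StubModel:
    """Hypothetical stand-in for VietnameseModel / EnglishModel / ChineseModel."""

    def __init__(self, threshold: float = 0.7, fixed_prob: float = 0.9):
        self.threshold = threshold
        self._fixed_prob = fixed_prob

    async def predict_end_of_turn(self, chat_ctx, timeout: float = 10.0) -> float:
        # A real model would run inference here; the stub returns a fixed value.
        await asyncio.sleep(0)
        return self._fixed_prob


async def user_finished_speaking(model, chat_ctx) -> bool:
    """Compare the predicted probability with the model's own threshold."""
    prob = await model.predict_end_of_turn(chat_ctx, timeout=10.0)
    return prob >= model.threshold


result = asyncio.run(user_finished_speaking(StubModel(), chat_ctx=None))
print(result)  # True
```

With a real model, replace `StubModel()` with one of the classes above and pass the session's chat context instead of `None`.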


LanguageSpecificModel

LanguageSpecificModel(language: str, threshold: float = 0.7)

Parameters:

  • language: Language code ("en", "vi", "zh")
  • threshold: Detection threshold (0.0-1.0)

Methods:

  • predict_end_of_turn(chat_ctx, timeout=10.0) -> float - Returns probability (0.0-1.0)
  • unlikely_threshold(language) -> float - Get model's threshold for language

Memory Usage: ~600MB (loads all 3 models: en, vi, zh)


MultilingualModel

MultilingualModel(threshold: float = 0.7)

Methods:

  • predict_end_of_turn(chat_ctx, timeout=10.0) -> float - Returns probability (0.0-1.0)
  • unlikely_threshold(language) -> float - Get model's threshold for language

Memory Usage: ~400MB (single multilingual model for 23 languages)

Pre-download Models

Run your agent script (main.py in this example) with the download-files command:

python main.py download-files

Model Comparison

Choose the right model for your use case:

| Model                 | Languages    | Memory | Init Speed | Accuracy    | Best For                   |
|-----------------------|--------------|--------|------------|-------------|----------------------------|
| VietnameseModel       | Vietnamese   | ~200MB | ⚡⚡⚡ Fast  | ⭐⭐⭐ Highest | Vietnamese-only apps       |
| EnglishModel          | English      | ~200MB | ⚡⚡⚡ Fast  | ⭐⭐⭐ Highest | English-only apps          |
| ChineseModel          | Chinese      | ~200MB | ⚡⚡⚡ Fast  | ⭐⭐⭐ Highest | Chinese-only apps          |
| LanguageSpecificModel | EN, VI, ZH   | ~600MB | ⚡ Slow     | ⭐⭐⭐ High    | Multi-lang apps (3 langs)  |
| MultilingualModel     | 23 languages | ~400MB | ⚡⚡ Medium  | ⭐⭐ Good     | Global apps (many langs)   |

Recommendation: Use single-language models (VietnameseModel, EnglishModel, ChineseModel) for production apps serving one language. They provide 66% memory savings and 3x faster initialization.
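The selection logic in the comparison above can be sketched as a small helper. `recommend_model` is a hypothetical function, not part of the plugin; it returns class names as strings and assumes the caller passes language codes such as "en" or "vi". It does not check whether a code is among the 23 languages the multilingual model supports.

```python
from typing import Iterable

# Class names as listed in this README; the mapping itself is illustrative.
SINGLE_LANGUAGE_MODELS = {
    "vi": "VietnameseModel",
    "en": "EnglishModel",
    "zh": "ChineseModel",
}


def recommend_model(languages: Iterable[str]) -> str:
    """Return the name of the model class suggested by the comparison table."""
    codes = {code.lower() for code in languages}
    if not codes:
        raise ValueError("at least one language code is required")
    if len(codes) == 1:
        (only,) = codes
        if only in SINGLE_LANGUAGE_MODELS:
            return SINGLE_LANGUAGE_MODELS[only]
    # Several languages, all among en/vi/zh: the three-model switcher fits.
    if codes <= set(SINGLE_LANGUAGE_MODELS):
        return "LanguageSpecificModel"
    # Anything else falls back to the multilingual model.
    return "MultilingualModel"


print(recommend_model(["vi"]))        # VietnameseModel
print(recommend_model(["en", "zh"]))  # LanguageSpecificModel
print(recommend_model(["en", "fr"]))  # MultilingualModel
```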


Supported Languages

  • Single-Language Models: Vietnamese (vi), English (en), Chinese (zh)

  • Multi-Language Model (LanguageSpecificModel): English (en), Vietnamese (vi), Chinese (zh)

  • Multilingual Model (23 languages): Arabic, Bengali, Chinese, Danish, Dutch, English, Finnish, French, German, Hindi, Indonesian, Italian, Japanese, Korean, Marathi, Norwegian, Polish, Portuguese, Russian, Spanish, Turkish, Ukrainian, Vietnamese

License

Apache-2.0

Credits

Citation

@software{namo2025,
  title = {Namo Turn Detector v1: Semantic Turn Detection for Conversational AI},
  author = {VideoSDK Team},
  year = {2025},
  publisher = {Hugging Face},
  url = {https://huggingface.co/collections/videosdk-live/namo-turn-detector-v1-68d52c0564d2164e9d17ca97}
}
