End of utterance detection for LiveKit Agents

Namo Turn Detector Plugin for LiveKit Agents

Turn detection plugin for LiveKit Agents using Namo Turn Detector models.

Installation

pip install livekit-plugins-namo-turn-detector

Features

  • Single-Language Models: Memory-efficient models for Vietnamese, English, Chinese (NEW ✨)
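  • Multilingual Support: 23 languages with a unified multilingual model
  • High Accuracy: Language-specific models report the correct end of turn on most of the benchmark samples below, including Vietnamese samples that the baseline multilingual model misses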
  • Fast & Efficient: Optimized inference with 66% less memory for single-language apps
  • Async API: Built on LiveKit's inference runner for optimal performance
  • Easy Integration: Drop-in replacement for existing turn detectors

Quick Start

🎯 Single-Language Models (Recommended for Production)

The most memory-efficient option: each class loads a single language model (~200MB).

Vietnamese Only

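from livekit import agents
from livekit.plugins import namo_turn_detector

async def entrypoint(ctx: agents.JobContext):
    # Load only the Vietnamese model
    model = namo_turn_detector.vi_model.VietnameseModel(threshold=0.7)
    # chat_ctx: the conversation's chat context, supplied by your agent code
    prob = await model.predict_end_of_turn(chat_ctx)  # probability in 0.0-1.0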

English Only

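from livekit import agents
from livekit.plugins import namo_turn_detector

async def entrypoint(ctx: agents.JobContext):
    # Load only the English model
    model = namo_turn_detector.en_model.EnglishModel(threshold=0.7)
    # chat_ctx: the conversation's chat context, supplied by your agent code
    prob = await model.predict_end_of_turn(chat_ctx)  # probability in 0.0-1.0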

Chinese Only

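from livekit import agents
from livekit.plugins import namo_turn_detector

async def entrypoint(ctx: agents.JobContext):
    # Load only the Chinese model
    model = namo_turn_detector.zh_model.ChineseModel(threshold=0.7)
    # chat_ctx: the conversation's chat context, supplied by your agent code
    prob = await model.predict_end_of_turn(chat_ctx)  # probability in 0.0-1.0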

Benefits:

  • 66% less memory (~200MB vs ~600MB)
  • 3x faster initialization
  • Highest accuracy for the language
  • ✅ Best for single-language production apps
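The 66% figure follows from the approximate sizes quoted in this README. The short check below reproduces it. The constant and function names are illustrative only and are not part of the plugin, and the sizes are the rounded values from this page, not measurements.

```python
# Approximate model sizes quoted in this README, in megabytes.
SINGLE_LANGUAGE_MB = 200    # VietnameseModel, EnglishModel, or ChineseModel
LANGUAGE_SPECIFIC_MB = 600  # LanguageSpecificModel (loads en, vi, and zh)
MULTILINGUAL_MB = 400       # MultilingualModel


def memory_saving_percent(smaller_mb: float, larger_mb: float) -> float:
    """Return how much less memory the smaller model uses, as a percentage."""
    return (1 - smaller_mb / larger_mb) * 100


saving_vs_language_specific = memory_saving_percent(SINGLE_LANGUAGE_MB, LANGUAGE_SPECIFIC_MB)
saving_vs_multilingual = memory_saving_percent(SINGLE_LANGUAGE_MB, MULTILINGUAL_MB)

print(f"{saving_vs_language_specific:.1f}%")  # 66.7%
print(f"{saving_vs_multilingual:.1f}%")       # 50.0%
```

The headline saving is therefore relative to LanguageSpecificModel. Against the ~400MB multilingual model, the saving is about half.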

Multi-Language Model (EN/VI/ZH Switching)

Use when you need to switch between English, Vietnamese, or Chinese:

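from livekit import agents
from livekit.plugins.namo_turn_detector.language_specific import LanguageSpecificModel

# Loads all 3 models (en, vi, zh) - ~600MB
async def entrypoint(ctx: agents.JobContext):
    # language selects which of the loaded models scores the turn
    model = LanguageSpecificModel(language="vi", threshold=0.7)
    # chat_ctx: the conversation's chat context, supplied by your agent code
    prob = await model.predict_end_of_turn(chat_ctx)  # probability in 0.0-1.0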

Multilingual Model (23 Languages)

Use when you need support for many languages:

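from livekit import agents
from livekit.plugins.namo_turn_detector.multilingual import MultilingualModel

async def entrypoint(ctx: agents.JobContext):
    # One model covers all supported languages
    model = MultilingualModel(threshold=0.7)
    # chat_ctx: the conversation's chat context, supplied by your agent code
    prob = await model.predict_end_of_turn(chat_ctx)  # probability in 0.0-1.0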

Benchmark Results

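Comparison across English, Vietnamese, and Chinese. Each row gives the predicted end-of-turn probability, the inference latency in parentheses, and the resulting end-of-turn (EOT) decision: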

English Performance

Sample: "Hello, how are you?"
  • Namo Multilingual:     0.8757 (16ms) - EOT: True
  • Namo English-Specific: 0.0002 (13ms) - EOT: False
  • LiveKit Multilingual:  0.2838 (33ms) - EOT: True
  • LiveKit English:       0.4596 (4ms)  - EOT: True

Sample: "What's the weather like today?"
  • Namo Multilingual:     0.8032 (15ms) - EOT: True
  • Namo English-Specific: 0.9999 (9ms)  - EOT: True ⭐
  • LiveKit Multilingual:  0.7799 (27ms) - EOT: True
  • LiveKit English:       0.9409 (3ms)  - EOT: True

Vietnamese Performance

Sample: "Xin chào, bạn khỏe không?" (Hello, how are you?)
  • Namo Multilingual:        0.8651 (25ms) - EOT: True
  • Namo Vietnamese-Specific: 0.9857 (36ms) - EOT: True ⭐
  • LiveKit Multilingual:     0.0322 (20ms) - EOT: False

Sample: "Thời tiết hôm nay thế nào?" (What's the weather today?)
  • Namo Multilingual:        0.5168 (27ms) - EOT: False
  • Namo Vietnamese-Specific: 0.9952 (4ms)  - EOT: True ⭐
  • LiveKit Multilingual:     0.2988 (22ms) - EOT: False

Sample: "Vay ở đâu" (Where to borrow) - Incomplete phrase
  • Namo Multilingual:        0.6599 (20ms) - EOT: False
  • Namo Vietnamese-Specific: 0.9875 (10ms) - EOT: True ⭐
  • LiveKit Multilingual:     0.5106 (25ms) - EOT: False

Chinese Performance

Sample: "你好,你好吗?" (Hello, how are you?)
  • Namo Multilingual:     0.6525 (30ms) - EOT: False
  • Namo Chinese-Specific: 0.8777 (16ms) - EOT: True ⭐
  • LiveKit Multilingual:  0.8520 (20ms) - EOT: True

Sample: "今天天气怎么样?" (What's the weather today?)
  • Namo Multilingual:     0.6818 (18ms) - EOT: False
  • Namo Chinese-Specific: 0.9090 (34ms) - EOT: True ⭐
  • LiveKit Multilingual:  0.9707 (20ms) - EOT: True

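Key Insights:

  • Language-specific models report EOT: True on every sample in their target language except one: the English model on "Hello, how are you?" (0.0002)
  • Namo Multilingual scores between 0.52 and 0.88 in all three languages, but reports EOT: False on four of the seven samples
  • Namo inference takes 4-36ms per prediction on these samples, comparable to the LiveKit models (3-33ms)
  • On all three Vietnamese samples, both Namo models score higher than the LiveKit multilingual baseline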
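The EOT column for the Namo rows is consistent with a simple cut-off at the default threshold of 0.7. The sketch below checks that against a few rows copied from the results above. The rule `probability >= threshold` is an assumption inferred from those numbers, not something taken from the plugin source, and `is_end_of_turn` is a hypothetical helper.

```python
NAMO_THRESHOLD = 0.7  # default threshold used throughout this README


def is_end_of_turn(probability: float, threshold: float = NAMO_THRESHOLD) -> bool:
    """Assumed decision rule: end of turn once the probability reaches the threshold."""
    return probability >= threshold


# (model, sample, probability, EOT as reported in the benchmark above)
namo_rows = [
    ("Namo Multilingual", "Hello, how are you?", 0.8757, True),
    ("Namo English-Specific", "Hello, how are you?", 0.0002, False),
    ("Namo Multilingual", "Vay ở đâu", 0.6599, False),
    ("Namo Vietnamese-Specific", "Vay ở đâu", 0.9875, True),
    ("Namo Multilingual", "今天天气怎么样?", 0.6818, False),
    ("Namo Chinese-Specific", "今天天气怎么样?", 0.9090, True),
]

# Rows where the assumed rule disagrees with the reported EOT value.
mismatches = [row for row in namo_rows if is_end_of_turn(row[2]) != row[3]]
print(mismatches)  # []
```

The LiveKit rows do not fit a 0.7 cut-off (for example, 0.2838 is reported as EOT: True), so those models presumably apply their own thresholds.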

API Reference

Single-Language Models (NEW ✨)

VietnameseModel

from livekit.plugins import namo_turn_detector

model = namo_turn_detector.vi_model.VietnameseModel(threshold=0.7)  # threshold: float, default 0.7

EnglishModel

from livekit.plugins import namo_turn_detector

model = namo_turn_detector.en_model.EnglishModel(threshold=0.7)  # threshold: float, default 0.7

ChineseModel

from livekit.plugins import namo_turn_detector

model = namo_turn_detector.zh_model.ChineseModel(threshold=0.7)  # threshold: float, default 0.7

Parameters:

  • threshold: Detection threshold (0.0-1.0), default 0.7

Properties:

  • language - Language code ("vi", "en", or "zh")
  • model - Model name (e.g., "namo-vi")
  • threshold - Current detection threshold

Methods:

  • predict_end_of_turn(chat_ctx, timeout=10.0) -> float - Returns probability (0.0-1.0)
  • unlikely_threshold(language) -> float - Get model's threshold for language

Memory Usage: ~200MB per model (loads only one language)
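Because every model exposes the same `predict_end_of_turn` coroutine and `threshold` property, calling code can be exercised without loading a ~200MB model. The sketch below uses a hypothetical `StubTurnModel` that only mimics the interface listed above. The stub's scoring rule, the use of a plain list of strings as `chat_ctx`, and the `should_reply` helper are all assumptions made for illustration. The real models expect the agent's chat context.

```python
import asyncio


class StubTurnModel:
    """Hypothetical stand-in that mimics the interface documented above."""

    def __init__(self, language: str, threshold: float = 0.7) -> None:
        self.language = language
        self.threshold = threshold

    async def predict_end_of_turn(self, chat_ctx, timeout: float = 10.0) -> float:
        # A real model would run inference here. The stub returns a fixed
        # score based on whether the last message ends with punctuation.
        last_message = chat_ctx[-1].rstrip()
        return 0.95 if last_message.endswith(("?", ".", "!")) else 0.10


async def should_reply(model, chat_ctx) -> bool:
    """Await the probability and compare it with the model's threshold."""
    prob = await model.predict_end_of_turn(chat_ctx, timeout=10.0)
    return prob >= model.threshold


model = StubTurnModel(language="en")
print(asyncio.run(should_reply(model, ["What's the weather like today?"])))  # True
print(asyncio.run(should_reply(model, ["What's the weather like"])))         # False
```

Swapping the stub for one of the real model classes should leave `should_reply` unchanged, provided `chat_ctx` is then the conversation's actual chat context.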


LanguageSpecificModel

LanguageSpecificModel(language: str, threshold: float = 0.7)

Parameters:

  • language: Language code ("en", "vi", "zh")
  • threshold: Detection threshold (0.0-1.0)

Methods:

  • predict_end_of_turn(chat_ctx, timeout=10.0) -> float - Returns probability (0.0-1.0)
  • unlikely_threshold(language) -> float - Get model's threshold for language

Memory Usage: ~600MB (loads all 3 models: en, vi, zh)


MultilingualModel

MultilingualModel(threshold: float = 0.7)

Methods:

  • predict_end_of_turn(chat_ctx, timeout=10.0) -> float - Returns probability (0.0-1.0)
  • unlikely_threshold(language) -> float - Get model's threshold for language

Memory Usage: ~400MB (single multilingual model for 23 languages)

Pre-download Models

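Download the model files ahead of time (for example, during a container build) by running the download-files command of your agent script, here main.py:

python main.py download-files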

Model Comparison

Choose the right model for your use case:

| Model | Languages | Memory | Init Speed | Accuracy | Best For |
|---|---|---|---|---|---|
| VietnameseModel | Vietnamese | ~200MB | ⚡⚡⚡ Fast | ⭐⭐⭐ Highest | Vietnamese-only apps |
| EnglishModel | English | ~200MB | ⚡⚡⚡ Fast | ⭐⭐⭐ Highest | English-only apps |
| ChineseModel | Chinese | ~200MB | ⚡⚡⚡ Fast | ⭐⭐⭐ Highest | Chinese-only apps |
| LanguageSpecificModel | EN, VI, ZH | ~600MB | ⚡ Slow | ⭐⭐⭐ High | Multi-lang apps (3 langs) |
| MultilingualModel | 23 languages | ~400MB | ⚡⚡ Medium | ⭐⭐ Good | Global apps (many langs) |

Recommendation: Use single-language models (VietnameseModel, EnglishModel, ChineseModel) for production apps serving one language. Compared with LanguageSpecificModel, they use about 66% less memory (~200MB vs ~600MB) and initialize about 3x faster.
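The selection logic in this comparison can be written down as a small helper. This is a sketch of the recommendation, not part of the plugin: `recommend_model` is a hypothetical function that returns class names as strings, and any language code other than "vi", "en", and "zh" (such as "fr" below) is an assumed code, since this README lists the multilingual languages by name only.

```python
# Language codes served by the single-language classes, per this README.
SINGLE_LANGUAGE_MODELS = {
    "vi": "VietnameseModel",
    "en": "EnglishModel",
    "zh": "ChineseModel",
}


def recommend_model(language_codes) -> str:
    """Return the name of the model class suggested by the comparison above."""
    codes = set(language_codes)
    if not codes:
        raise ValueError("at least one language code is required")
    if len(codes) == 1:
        (code,) = codes
        if code in SINGLE_LANGUAGE_MODELS:
            # One supported language: smallest memory footprint.
            return SINGLE_LANGUAGE_MODELS[code]
    if codes.issubset(SINGLE_LANGUAGE_MODELS):
        # Several languages, all within en/vi/zh.
        return "LanguageSpecificModel"
    # Anything else falls back to the multilingual model.
    return "MultilingualModel"


print(recommend_model({"vi"}))        # VietnameseModel
print(recommend_model({"en", "zh"}))  # LanguageSpecificModel
print(recommend_model({"en", "fr"}))  # MultilingualModel
```

The helper does not check that a language is actually among the 23 covered by the multilingual model, so the fallback branch should be verified against the supported-language list.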


Supported Languages

  • Single-Language Models: Vietnamese (vi), English (en), Chinese (zh)

  • Multi-Language Model (LanguageSpecificModel): English (en), Vietnamese (vi), Chinese (zh)

  • Multilingual Model (23 languages): Arabic, Bengali, Chinese, Danish, Dutch, English, Finnish, French, German, Hindi, Indonesian, Italian, Japanese, Korean, Marathi, Norwegian, Polish, Portuguese, Russian, Spanish, Turkish, Ukrainian, Vietnamese

License

Apache-2.0

Credits

Citation

@software{namo2025,
  title = {Namo Turn Detector v1: Semantic Turn Detection for Conversational AI},
  author = {VideoSDK Team},
  year = {2025},
  publisher = {Hugging Face},
  url = {https://huggingface.co/collections/videosdk-live/namo-turn-detector-v1-68d52c0564d2164e9d17ca97}
}
