
FunASR Python Client


A high-performance, enterprise-grade Python client for FunASR WebSocket speech recognition service. Built for production use with comprehensive error handling, automatic reconnection, and extensive customization options.

Features

🚀 High Performance

  • Asynchronous I/O: Built on asyncio for maximum concurrency
  • Connection Pooling: Efficient WebSocket connection management
  • Streaming Recognition: Real-time speech recognition with minimal latency
  • Memory Efficient: Optimized audio processing with configurable buffering

🔧 Production Ready

  • Robust Error Handling: Comprehensive exception handling and recovery
  • Automatic Reconnection: Smart reconnection with exponential backoff
  • Health Monitoring: Built-in connection health checks
  • Resource Management: Automatic cleanup and resource deallocation
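The "smart reconnection with exponential backoff" above boils down to a doubling, capped delay schedule. A minimal sketch (the base delay and cap here are illustrative assumptions, not the client's actual internal values):

```python
# Illustrative exponential-backoff schedule for reconnection attempts.
# `base` and `cap` are assumed values, not funasr-python's real defaults.
def backoff_delays(max_retries: int, base: float = 0.5, cap: float = 30.0) -> list[float]:
    """Delay (seconds) before each retry: doubles each attempt, capped at `cap`."""
    return [min(cap, base * (2 ** attempt)) for attempt in range(max_retries)]
```

A production client would typically also add random jitter so that many clients do not reconnect in lockstep.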

📊 Recognition Modes for Different Scenarios

  • Offline Mode: Best for complete audio files, highest accuracy
  • Online Mode: Ultra-low latency streaming, suitable for interactive applications
  • Two-Pass Mode ⭐: Recommended for real-time scenarios - combines streaming speed with offline accuracy

🎯 Enterprise Features

  • Configuration Management: Flexible configuration with .env support
  • Comprehensive Logging: Structured logging with configurable levels
  • Metrics & Monitoring: Built-in performance metrics
  • Type Safety: Full type hints for better IDE support

🎵 Audio Processing

  • Multiple Formats: Support for WAV, FLAC, MP3, and more
  • Automatic Resampling: Smart audio format conversion
  • Voice Activity Detection: Optional VAD for improved efficiency
  • Microphone Integration: Real-time microphone recording support
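To make the "automatic resampling" feature concrete, here is a naive linear-interpolation resampler. It is only a sketch of the idea; the client almost certainly uses a proper DSP resampler internally:

```python
# Toy linear-interpolation resampler (illustrative only; real resamplers
# apply anti-aliasing filters, which this sketch omits).
def resample(samples, src_rate, dst_rate):
    """Resample a sequence of PCM samples from src_rate to dst_rate."""
    if src_rate == dst_rate:
        return list(samples)
    n_out = int(len(samples) * dst_rate / src_rate)
    out = []
    for i in range(n_out):
        pos = i * src_rate / dst_rate     # fractional position in the source
        lo = int(pos)
        hi = min(lo + 1, len(samples) - 1)
        frac = pos - lo
        out.append(samples[lo] * (1 - frac) + samples[hi] * frac)
    return out
```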

Installation

Basic Installation

pip install funasr-python

With Optional Dependencies

# Audio processing capabilities
pip install funasr-python[audio]

# Performance optimizations
pip install funasr-python[performance]

# Development tools
pip install funasr-python[dev]

# Everything
pip install funasr-python[all]

From Source

git clone https://github.com/alibaba-damo-academy/FunASR.git
cd FunASR/clients/funasr-python
pip install -e .

Quick Start

Basic Usage

import asyncio
from funasr_client import AsyncFunASRClient

async def main():
    client = AsyncFunASRClient()

    # Recognize an audio file
    result = await client.recognize_file("examples/audio/asr_example.wav")
    print(f"Recognition result: {result.text}")

    await client.close()

if __name__ == "__main__":
    asyncio.run(main())

Real-time Recognition (Recommended)

For real-time applications, we recommend Two-Pass Mode, which provides the best balance of speed and accuracy:

import asyncio
from funasr_client import AsyncFunASRClient
from funasr_client.models import RecognitionMode, ClientConfig

async def realtime_recognition():
    # Two-Pass Mode: Optimal for real-time scenarios
    config = ClientConfig(
        server_url="ws://localhost:10095",
        mode=RecognitionMode.TWO_PASS,  # Recommended for real-time
        enable_vad=True,  # Voice activity detection
        chunk_interval=10  # Balanced latency/accuracy
    )

    client = AsyncFunASRClient(config=config)

    def on_partial_result(result):
        print(f"Partial: {result.text}")

    def on_final_result(result):
        print(f"Final: {result.text} (confidence: {result.confidence:.2f})")

    from funasr_client.callbacks import SimpleCallback
    callback = SimpleCallback(
        on_partial=on_partial_result,
        on_final=on_final_result
    )

    await client.start()

    # Start real-time session
    session = await client.start_realtime(callback)

    # Your audio streaming logic here
    # In practice, you would stream from microphone or audio source

    await client.close()

if __name__ == "__main__":
    asyncio.run(realtime_recognition())

Ultra-Low Latency (Interactive Applications)

For scenarios requiring minimal latency (e.g., voice assistants):

async def ultra_low_latency():
    config = ClientConfig(
        mode=RecognitionMode.ONLINE,  # Ultra-low latency
        chunk_interval=5,  # Faster processing
        enable_vad=True
    )

    client = AsyncFunASRClient(config=config)
    # Implementation similar to above

Configuration with Environment Variables

Create a .env file:

FUNASR_WS_URL=ws://localhost:10095
FUNASR_MODE=2pass  # Recommended: Two-Pass Mode for optimal real-time performance
FUNASR_SAMPLE_RATE=16000
FUNASR_ENABLE_ITN=true
FUNASR_ENABLE_VAD=true  # Recommended for real-time scenarios
Then load the configuration in your code (`await` needs an async context, so the call is wrapped in a coroutine):

import asyncio
from funasr_client import create_async_client

async def main():
    # Configuration is loaded automatically from .env
    client = await create_async_client()
    result = await client.recognize_file("examples/audio/asr_example.wav")
    print(result.text)

asyncio.run(main())

Advanced Usage

Custom Configuration

from funasr_client import AsyncFunASRClient, ClientConfig, AudioConfig
from funasr_client.models import RecognitionMode, AudioFormat

config = ClientConfig(
    server_url="ws://your-server:10095",
    mode=RecognitionMode.TWO_PASS,
    timeout=30.0,
    max_retries=3,
    audio=AudioConfig(
        sample_rate=16000,
        format=AudioFormat.PCM,
        channels=1
    )
)

client = AsyncFunASRClient(config=config)

Callback Handlers

from funasr_client.callbacks import SimpleCallback

def on_result(result):
    print(f"Received: {result.text}")

def on_error(error):
    print(f"Error: {error}")

callback = SimpleCallback(
    on_result=on_result,
    on_error=on_error
)

client = AsyncFunASRClient(callback=callback)

Multiple Recognition Sessions

import asyncio
from funasr_client import AsyncFunASRClient
from funasr_client.models import RecognitionMode

async def recognize_multiple():
    # Use Two-Pass Mode for optimal performance
    client = AsyncFunASRClient(
        mode=RecognitionMode.TWO_PASS  # ⭐ Recommended
    )

    # Process multiple files concurrently
    tasks = [
        client.recognize_file("examples/audio/asr_example.wav"),
        client.recognize_file("examples/audio/61-70970-0001.wav"),
        client.recognize_file("examples/audio/61-70970-0016.wav")
    ]

    results = await asyncio.gather(*tasks)
    for i, result in enumerate(results, 1):
        print(f"File {i}: {result.text}")

Real-time Application Examples

Live Streaming Transcription

async def live_transcription():
    """Real-time transcription for live streams."""
    config = ClientConfig(
        mode=RecognitionMode.TWO_PASS,  # ⭐ Optimal for live streaming
        enable_vad=True,                # Filter silence
        chunk_interval=8,               # Balanced performance
        auto_reconnect=True             # Handle network issues
    )

    client = AsyncFunASRClient(config=config)

    def on_result(result):
        if result.is_final:
            # Send to subtitle system
            send_subtitle(result.text, result.confidence)
        else:
            # Show live preview
            show_live_text(result.text)

    from funasr_client.callbacks import SimpleCallback
    callback = SimpleCallback(on_final=on_result, on_partial=on_result)

    await client.start()
    session = await client.start_realtime(callback)

    # Your audio streaming implementation here
    await stream_audio_to_session(session)

Voice Assistant Integration

async def voice_assistant():
    """Voice assistant with Two-Pass optimization."""
    config = ClientConfig(
        mode=RecognitionMode.TWO_PASS,  # ⭐ Best for voice assistants
        enable_vad=True,                # Automatic speech detection
        chunk_interval=10               # Good responsiveness
    )

    client = AsyncFunASRClient(config=config)

    async def process_command(result):
        if result.is_final and result.confidence > 0.8:
            # Process voice command
            response = await process_voice_command(result.text)
            await speak_response(response)

    from funasr_client.callbacks import AsyncSimpleCallback
    callback = AsyncSimpleCallback(on_final=process_command)

    await client.start()
    session = await client.start_realtime(callback)

    print("🎤 Voice assistant ready. Speak now...")
    # Your microphone streaming logic here

Command Line Interface

The package includes a full-featured CLI:

# Basic recognition
funasr-client recognize examples/audio/asr_example.wav

# Real-time recognition from microphone
funasr-client stream --source microphone

# Batch processing
funasr-client batch examples/audio/*.wav --output results.jsonl

# Server configuration
funasr-client configure --server-url ws://localhost:10095

# Test connection
funasr-client test-connection

Recognition Mode Selection Guide

Choose the optimal recognition mode for your use case:

| Mode | Latency | Accuracy | Best For | Use Cases |
|------|---------|----------|----------|-----------|
| Two-Pass ⭐ | Medium | High | Real-time applications | Live streaming, real-time subtitles, voice assistants |
| Online | Low | Medium | Interactive apps | Voice commands, quick responses |
| Offline | High | Highest | File processing | Transcription services, post-processing |
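The decision logic in the guide can be condensed into a tiny selector. The mode strings follow the `FUNASR_MODE` values used elsewhere in this README; the helper itself is purely illustrative:

```python
# Illustrative mode selector based on the guide above; the strings match
# the documented FUNASR_MODE values ("offline", "online", "2pass").
def pick_mode(need_realtime: bool, need_top_accuracy: bool) -> str:
    if not need_realtime:
        return "offline"   # complete files, highest accuracy
    # Streaming: 2pass adds an accuracy-refining second pass; online is fastest.
    return "2pass" if need_top_accuracy else "online"
```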

Two-Pass Mode Advantages ⭐

Two-Pass Mode is recommended for real-time scenarios because it provides:

  • Fast partial results for immediate user feedback
  • High-accuracy final results using two-pass optimization
  • Balanced resource usage with smart buffering
  • Production-grade robustness with comprehensive error handling

# Recommended configuration for real-time applications
config = ClientConfig(
    mode=RecognitionMode.TWO_PASS,  # Best balance
    enable_vad=True,                # Improves efficiency
    chunk_interval=10,              # Optimal for most cases
    auto_reconnect=True             # Production reliability
)

Configuration

Environment Variables

| Variable | Description | Default |
|----------|-------------|---------|
| FUNASR_WS_URL | WebSocket server URL | ws://localhost:10095 |
| FUNASR_MODE | Recognition mode (offline, online, 2pass) | 2pass |
| FUNASR_TIMEOUT | Connection timeout (seconds) | 30.0 |
| FUNASR_MAX_RETRIES | Max retry attempts | 3 |
| FUNASR_SAMPLE_RATE | Audio sample rate (Hz) | 16000 |
| FUNASR_ENABLE_ITN | Enable inverse text normalization | true |
| FUNASR_ENABLE_VAD | Enable voice activity detection | true |
| FUNASR_DEBUG | Enable debug logging | false |

💡 Tip: Two-Pass Mode (2pass) is recommended for most real-time applications as it provides the best balance between latency and accuracy.
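A sketch of how the table's variables might map to settings. The variable names and defaults come from the table above; the parsing logic is illustrative, not the client's actual implementation:

```python
# Illustrative env-to-settings mapping; names/defaults from the table above,
# parsing logic is a sketch (the real ConfigManager may differ).
import os

def load_settings(env=os.environ):
    return {
        "server_url": env.get("FUNASR_WS_URL", "ws://localhost:10095"),
        "mode": env.get("FUNASR_MODE", "2pass"),
        "timeout": float(env.get("FUNASR_TIMEOUT", "30.0")),
        "enable_vad": env.get("FUNASR_ENABLE_VAD", "true").lower() == "true",
    }
```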

Configuration File

from funasr_client import ConfigManager

# Load from custom config file
config = ConfigManager.from_file("my_config.json")
client = AsyncFunASRClient(config=config.client_config)

Error Handling

from funasr_client.errors import (
    FunASRError,
    ConnectionError,
    AudioError,
    TimeoutError
)

try:
    result = await client.recognize_file("examples/audio/asr_example.wav")
except ConnectionError:
    print("Failed to connect to server")
except AudioError:
    print("Audio processing failed")
except TimeoutError:
    print("Request timed out")
except FunASRError as e:
    print(f"Recognition error: {e}")
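The same pattern generalizes to a retry wrapper around any transient failure. This is a generic sketch of the recovery idea; the client's built-in `auto_reconnect` and `max_retries` settings likely make hand-rolled retries unnecessary in practice:

```python
# Generic async retry helper illustrating the recovery pattern above.
import asyncio

async def with_retries(make_call, retryable_errors, max_retries=3, base_delay=0.5):
    """Await make_call(), retrying transient errors with exponential backoff."""
    for attempt in range(max_retries + 1):
        try:
            return await make_call()
        except retryable_errors:
            if attempt == max_retries:
                raise  # out of retries: propagate the last error
            await asyncio.sleep(base_delay * (2 ** attempt))
```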

Performance Optimization

Real-time Performance Best Practices

For optimal real-time performance, follow these recommendations:

from funasr_client import AsyncFunASRClient, ClientConfig
from funasr_client.models import RecognitionMode, AudioConfig

# Optimized configuration for real-time scenarios
config = ClientConfig(
    # Core settings
    mode=RecognitionMode.TWO_PASS,  # ⭐ Best balance for real-time
    enable_vad=True,                # Reduces processing load
    chunk_interval=10,              # Optimal latency/accuracy trade-off

    # Performance settings
    auto_reconnect=True,            # Production reliability
    connection_pool_size=5,         # Connection reuse
    buffer_size=8192,               # Optimal buffer size

    # Audio optimization
    audio=AudioConfig(
        sample_rate=16000,          # Standard ASR rate
        channels=1,                 # Mono for efficiency
        sample_width=2              # 16-bit PCM
    )
)

client = AsyncFunASRClient(config=config)

Performance Tuning Guidelines

| Parameter | Recommended Value | Impact |
|-----------|-------------------|--------|
| mode | TWO_PASS | Best accuracy/latency balance |
| chunk_interval | 10 | Standard real-time performance |
| chunk_interval | 5 | Lower latency, higher CPU usage |
| chunk_interval | 20 | Higher latency, lower CPU usage |
| enable_vad | True | Reduces unnecessary processing |
| sample_rate | 16000 | Optimal for most ASR models |
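As a sanity check on the settings above, the buffer size translates directly into audio duration. At 16 kHz mono 16-bit PCM, an 8192-byte buffer holds about a quarter second of audio:

```python
# Back-of-envelope arithmetic for the buffer_size / AudioConfig values above.
def buffer_duration_ms(buffer_bytes=8192, sample_rate=16000, channels=1, sample_width=2):
    """How many milliseconds of PCM audio a buffer of this size holds."""
    bytes_per_second = sample_rate * channels * sample_width  # 32000 B/s here
    return 1000 * buffer_bytes / bytes_per_second
```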

Connection Pooling

from funasr_client import ConnectionManager

# Use connection manager for multiple clients
manager = ConnectionManager(max_connections=10)
client1 = AsyncFunASRClient(connection_manager=manager)
client2 = AsyncFunASRClient(connection_manager=manager)

Audio Processing

from funasr_client import AudioProcessor

# Pre-process audio for better performance
processor = AudioProcessor(
    target_sample_rate=16000,
    enable_vad=True,
    chunk_size=1024
)

processed_audio = processor.process_file("examples/audio/asr_example.wav")
result = await client.recognize_audio(processed_audio)

Testing

Run the test suite:

# Install test dependencies
pip install funasr-python[test]

# Run all tests
pytest

# Run with coverage
pytest --cov=funasr_client

# Run specific test categories
pytest -m unit
pytest -m integration

Development

Setup Development Environment

git clone https://github.com/alibaba-damo-academy/FunASR.git
cd FunASR/clients/funasr-python

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install in development mode
pip install -e .[dev]

# Install pre-commit hooks
pre-commit install

Code Quality

# Format code
ruff format src/ tests/

# Lint code
ruff check src/ tests/

# Type check
mypy src/

# Run all quality checks
pre-commit run --all-files

API Reference

Core Classes

  • AsyncFunASRClient: Main asynchronous client
  • FunASRClient: Synchronous client wrapper
  • ClientConfig: Client configuration
  • AudioConfig: Audio processing configuration
  • RecognitionResult: Recognition result container

Callback System

  • RecognitionCallback: Abstract callback interface
  • SimpleCallback: Basic callback implementation
  • LoggingCallback: Logging-based callback
  • MultiCallback: Combines multiple callbacks
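The fan-out idea behind MultiCallback can be sketched in a few lines. This toy version only shows the pattern; the real class in `funasr_client.callbacks` may expose a different interface:

```python
# Toy version of the MultiCallback fan-out pattern (interface is assumed).
class FanOutCallback:
    def __init__(self, handlers):
        self.handlers = list(handlers)

    def on_result(self, result):
        for handler in self.handlers:  # every handler sees every result
            handler(result)
```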

Audio Processing

  • AudioProcessor: Audio processing utilities
  • AudioRecorder: Microphone recording
  • AudioFileStreamer: File-based audio streaming

Utilities

  • ConfigManager: Configuration management
  • ConnectionManager: Connection pooling
  • Timer: Performance timing utilities
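For illustration, a context-manager timer like the Timer utility listed above might look as follows. The interface here is assumed, not the package's actual API:

```python
# Illustrative stand-in for the Timer utility (actual API may differ).
import time

class Timer:
    def __enter__(self):
        self._start = time.perf_counter()
        return self

    def __exit__(self, *exc):
        self.elapsed = time.perf_counter() - self._start
        return False  # never suppress exceptions
```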

Contributing

We welcome contributions! Please see CONTRIBUTING.md for details.

Development Process

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests for new functionality
  5. Run the test suite
  6. Submit a pull request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Changelog

See CHANGELOG.md for version history.


Acknowledgments

  • Built on the excellent FunASR speech recognition toolkit
  • Inspired by best practices from the Python asyncio ecosystem
  • Thanks to all contributors and users for feedback and improvements
