Microphone Stream Utility

A Python utility for managing microphone streams with support for both manual reading and callback-based processing, plus optional Voice Activity Detection (VAD).

Features

Multi-process audio capture: Audio is captured in a separate process to avoid blocking the main thread
Shared memory buffer: Efficient data transfer between processes using shared memory
Flexible audio configuration: Configurable sample rate, channels, data type, and buffer settings
Callback support: Process audio data automatically in a separate thread
Manual reading: Traditional read-based approach for custom processing
Device management: Automatic device detection and selection
Context manager support: Easy stream lifecycle management
Voice Activity Detection (VAD): Optional speech detection using Silero VAD (requires additional dependencies)
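
To make the shared memory idea concrete, here is a minimal sketch using only Python's standard `multiprocessing.shared_memory` module. It is not this package's implementation: the function name `roundtrip_through_shared_memory` and the float32 layout are assumptions chosen for illustration, and both handles live in one process so the example stays self-contained.

```python
import struct
from multiprocessing import shared_memory


def roundtrip_through_shared_memory(values):
    """Write float32 samples into a shared block and read them back via a second handle."""
    fmt = f"{len(values)}f"  # float32 layout is an assumption for this sketch
    writer = shared_memory.SharedMemory(create=True, size=struct.calcsize(fmt))
    try:
        struct.pack_into(fmt, writer.buf, 0, *values)
        # A consumer (normally another process) attaches to the block by name.
        reader = shared_memory.SharedMemory(name=writer.name)
        try:
            return struct.unpack_from(fmt, reader.buf, 0)
        finally:
            reader.close()
    finally:
        writer.close()
        writer.unlink()  # release the block once nobody needs it


print(roundtrip_through_shared_memory([0.0, 0.25, -0.5, 1.0]))
```

In a real capture pipeline the writer and reader would be different processes, and they would also need a way to agree on read and write positions.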

Installation

Basic Installation (Core Features Only)

# Clone the repository
git clone <repository-url>
cd mic-stream-util

# Install core dependencies only
uv sync

With Voice Activity Detection (VAD)

# Install with VAD support (includes torch and silero-vad)
uv add "mic-stream-util[vad]"

# Or if installing from source
uv sync --extra vad

All Features

# Install with all optional features
uv add "mic-stream-util[all]"

Quick Start

Basic Usage (Manual Reading)

from mic_stream_util.core.microphone_manager import MicrophoneStream
from mic_stream_util.core.audio_config import AudioConfig

# Create configuration
config = AudioConfig(
    sample_rate=16000,
    channels=1,
    dtype="float32",
    num_samples=1024
)

# Create and use microphone stream
mic_stream = MicrophoneStream(config)

with mic_stream.stream():
    while True:
        # Read audio data manually
        audio_data = mic_stream.read()
        print(f"Audio shape: {audio_data.shape}")
        # Process audio_data as needed

Callback Mode

import time

import numpy as np

from mic_stream_util.core.audio_config import AudioConfig
from mic_stream_util.core.microphone_manager import MicrophoneStream

def audio_callback(audio_data: np.ndarray) -> None:
    """Print the RMS level of each audio chunk."""
    rms = np.sqrt(np.mean(audio_data**2))
    print(f"Audio level: {rms:.4f}")

# Create configuration
config = AudioConfig(
    sample_rate=16000,
    channels=1,
    dtype="float32",
    num_samples=1024
)

# Create microphone stream
mic_stream = MicrophoneStream(config)

# Set callback function
mic_stream.set_callback(audio_callback)

# Start streaming; the callback is invoked automatically
with mic_stream.stream():
    # Keep the main thread alive while the callback runs
    while True:
        time.sleep(0.1)
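
The callback above reports a raw RMS value. As a small sketch of how that number could be turned into a more familiar decibel reading, the helpers below convert RMS to dBFS using only the standard library. They assume float samples normalized to the range [-1.0, 1.0]; `rms`, `to_dbfs`, and the `floor` parameter are illustrative names, not part of the package.

```python
import math


def rms(samples):
    """Root mean square of a non-empty sequence of float samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def to_dbfs(level, floor=-120.0):
    """Convert a linear level (1.0 = full scale) to dBFS; silence maps to `floor`."""
    return 20 * math.log10(level) if level > 0 else floor


level = rms([0.5, -0.5, 0.5, -0.5])
print(f"RMS {level:.2f} is {to_dbfs(level):.1f} dBFS")
```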

Voice Activity Detection (VAD)

import sys
import time

from mic_stream_util import VAD_AVAILABLE

# Check that the optional VAD dependencies are installed before importing the speech classes
if not VAD_AVAILABLE:
    print('VAD requires additional dependencies. Install with: pip install "mic-stream-util[vad]"')
    sys.exit(1)

from mic_stream_util import AudioConfig, SpeechManager, VADConfig

# Create configurations
audio_config = AudioConfig(sample_rate=16000, dtype="float32", num_samples=512)
vad_config = VADConfig(threshold=0.5, padding_before_ms=300, padding_after_ms=300)

# Create speech manager
speech_manager = SpeechManager(audio_config=audio_config, vad_config=vad_config)

def on_speech_start(timestamp: float) -> None:
    print(f"Speech started at {timestamp:.2f}s")

def on_speech_ended(speech_chunk) -> None:
    print(f"Speech ended, duration: {speech_chunk.duration:.2f}s")

# Set callbacks
speech_manager.set_callbacks(
    on_speech_start=on_speech_start,
    on_speech_ended=on_speech_ended
)

# Start VAD; the callbacks fire while the main thread idles
with speech_manager.stream_context():
    while True:
        time.sleep(0.1)

Command Line Interface

The package includes a CLI with various commands:

# List audio devices
mic devices

# Monitor audio levels
mic monitor

# Record audio
mic record --output recording.wav

# Voice Activity Detection (requires VAD dependencies)
mic vad --threshold 0.5

# Test latency
mic latency-test

# CPU usage monitoring
mic cpu-usage

API Reference

Core Classes

MicrophoneStream

Main class for managing microphone streams.

Constructor

MicrophoneStream(config: AudioConfig | None = None)

config: Audio configuration. If None, uses default configuration.

Methods

`set_callback(callback: Callable[[np.ndarray], None] | None)`

Set a callback function to be called when audio data is available.

callback: Function that accepts a NumPy array of shape (num_samples, channels). If None, callback mode is disabled.

`clear_callback()`

Clear the callback function and disable callback mode.

`has_callback() -> bool`

Check if a callback function is set.

`start_stream()`

Start the microphone stream in a separate process.

`stop_stream()`

Stop the microphone stream and clean up resources.

`stream()`

Context manager for automatic stream start/stop.

`is_streaming() -> bool`

Check if the stream is currently active.

`read_raw(num_samples: int) -> bytes`

Read raw audio data from the stream buffer.

Note: This method is disabled when callback mode is active.

`read(num_samples: int | None = None) -> np.ndarray`

Read audio data from the stream buffer.

Note: This method is disabled when callback mode is active.
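
If you work with `read_raw()`, you need to turn the bytes back into samples yourself. The sketch below shows one way to do that with the standard `array` module. It assumes the bytes hold interleaved samples in native byte order using the configured dtype, which this README does not state explicitly, so verify that against real output. `decode_raw` and `TYPECODES` are hypothetical names.

```python
from array import array

# Map the dtype names listed for AudioConfig to `array` type codes.
# "i" is 4 bytes on mainstream platforms; that is assumed here for "int32".
TYPECODES = {"float32": "f", "int32": "i", "int16": "h", "int8": "b", "uint8": "B"}


def decode_raw(raw: bytes, dtype: str = "float32", channels: int = 1) -> list:
    """Split raw interleaved bytes into one tuple of samples per frame."""
    samples = array(TYPECODES[dtype])
    samples.frombytes(raw)  # raises ValueError on a partial sample
    if len(samples) % channels:
        raise ValueError("byte count is not a whole number of frames")
    return [tuple(samples[i:i + channels]) for i in range(0, len(samples), channels)]


# Two stereo int16 frames, built locally so the example needs no microphone.
raw = array("h", [1, -2, 3, -4]).tobytes()
print(decode_raw(raw, dtype="int16", channels=2))
```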

AudioConfig

Configuration class for audio settings.

Constructor

AudioConfig(
    sample_rate: int = 16000,
    channels: int = 1,
    dtype: str = "float32",
    blocksize: int | None = None,
    buffer_size: int | None = None,
    device: int | None = None,
    device_name: str | None = None,
    latency: str = "low",
    num_samples: int = 512
)

Parameters

sample_rate: Sample rate in Hz
channels: Number of audio channels
dtype: Data type ("float32", "int32", "int16", "int8", "uint8")
blocksize: Audio block size (defaults to sample_rate // 10)
buffer_size: Buffer size in samples (defaults to sample_rate * 10)
device: Device index
device_name: Device name (will be used to find device index)
latency: Latency setting ("low" or "high")
num_samples: Number of samples to process at a time
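
The documented defaults translate into concrete sizes as follows. This is plain arithmetic on the values listed above, not a call into the package; `derived_sizes` and `bytes_per_sample` are illustrative names, and 4 bytes per sample corresponds to dtype "float32".

```python
def derived_sizes(sample_rate=16000, channels=1, bytes_per_sample=4, num_samples=512):
    """Compute the sizes implied by the documented AudioConfig defaults."""
    return {
        "blocksize": sample_rate // 10,    # documented default: 100 ms of audio
        "buffer_size": sample_rate * 10,   # documented default: 10 s of audio
        "chunk_ms": 1000 * num_samples / sample_rate,
        "chunk_bytes": num_samples * channels * bytes_per_sample,
    }


print(derived_sizes())
```

With the default configuration, each 512-sample chunk therefore covers 32 ms of audio and occupies 2048 bytes.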

Speech Classes (VAD Dependencies Required)

SpeechManager

Main class for Voice Activity Detection.

Constructor

SpeechManager(audio_config: AudioConfig, vad_config: VADConfig)

VADConfig

Configuration for Voice Activity Detection.

VADConfig(
    threshold: float = 0.5,
    padding_before_ms: int = 300,
    padding_after_ms: int = 300,
    max_silence_ms: int = 1000,
    min_speech_duration_ms: int = 250,
    max_speech_duration_s: float = float("inf")
)
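
To give a feel for how `threshold`, `max_silence_ms`, and `min_speech_duration_ms` might interact, here is a simplified sketch that segments a list of per-chunk speech probabilities. It is one plausible reading of the parameter names, not the package's or Silero VAD's actual algorithm; the function `segment_speech` is hypothetical, and padding is ignored for brevity.

```python
def segment_speech(probs, chunk_ms, threshold=0.5, max_silence_ms=1000,
                   min_speech_duration_ms=250):
    """Return (start_ms, end_ms) spans where speech probability stays above threshold."""
    segments = []
    start = None            # start time of the current candidate segment
    last_speech_end = None  # end time of the most recent speech chunk
    for i, prob in enumerate(probs):
        chunk_start, chunk_end = i * chunk_ms, (i + 1) * chunk_ms
        if prob >= threshold:
            if start is None:
                start = chunk_start
            last_speech_end = chunk_end
        elif start is not None and chunk_end - last_speech_end >= max_silence_ms:
            # Enough silence has passed: close the segment if it is long enough.
            if last_speech_end - start >= min_speech_duration_ms:
                segments.append((start, last_speech_end))
            start = None
    # Close a segment that is still open when the input ends.
    if start is not None and last_speech_end - start >= min_speech_duration_ms:
        segments.append((start, last_speech_end))
    return segments


probs = [0.1, 0.9, 0.9, 0.9, 0.1, 0.1, 0.9, 0.1, 0.1, 0.1]
print(segment_speech(probs, chunk_ms=100, max_silence_ms=200))
```

In this input, the three-chunk burst is kept as one segment, while the isolated single chunk is discarded because it is shorter than `min_speech_duration_ms`.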

Examples

See the example files for complete demonstrations:

example_usage.py - Basic microphone usage
example_callback_usage.py - Callback-based processing
example_speech_usage.py - Voice Activity Detection

Important Notes

Optional Dependencies

Core functionality: Works without any additional dependencies
VAD functionality: Requires torch and silero-vad (install with [vad] extra)
Check availability: Use `from mic_stream_util import VAD_AVAILABLE` to check whether VAD is available

Callback Mode vs Manual Reading

Callback Mode: Audio data is processed automatically in a separate thread. The `read()` and `read_raw()` methods are disabled.
Manual Reading: You call `read()` or `read_raw()` yourself to get audio data.

Thread Safety

Callback functions run in a separate thread, so any state they share with the main thread must be accessed in a thread-safe way
Exceptions raised inside a callback do not stop the stream, so catch and handle them inside the callback
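
One common way to satisfy both points is to have the callback do nothing except push each chunk onto a `queue.Queue`, and let the main thread do the real work. The sketch below uses only the standard library; a plain thread stands in for the library's callback thread, and `audio_queue`, `audio_callback`, and `run_demo` are illustrative names. In real use the chunk would be the NumPy array passed to your callback.

```python
import queue
import threading

audio_queue = queue.Queue(maxsize=8)  # bounded so a slow consumer cannot exhaust memory


def audio_callback(chunk):
    """Runs on the callback thread: hand the chunk off and return quickly."""
    try:
        audio_queue.put_nowait(chunk)
    except queue.Full:
        pass  # drop the chunk rather than block audio delivery
    except Exception as exc:  # keep unexpected errors visible
        print(f"callback error: {exc!r}")


def run_demo():
    # Stand-in for the library's callback thread (for demonstration only).
    def produce():
        for i in range(3):
            audio_callback([float(i)])

    worker = threading.Thread(target=produce)
    worker.start()
    worker.join()
    # The main thread consumes chunks at its own pace.
    return [audio_queue.get(timeout=1) for _ in range(3)]


print(run_demo())
```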

Resource Management

Always use the context manager (`with mic_stream.stream():`) or call `stop_stream()` to clean up resources
The stream holds a shared memory buffer and a separate capture process, so both must be released when you are done
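
The reason the context manager is the safer option is that it runs cleanup even when your processing code raises. The sketch below demonstrates that guarantee with a stand-in class; `FakeStream` is hypothetical and only mirrors the `start_stream()`, `stop_stream()`, and `stream()` method names, so it says nothing about how the real class is implemented.

```python
from contextlib import contextmanager


class FakeStream:
    """Hypothetical stand-in that mirrors the documented method names."""

    def __init__(self):
        self.active = False

    def start_stream(self):
        self.active = True

    def stop_stream(self):
        self.active = False

    @contextmanager
    def stream(self):
        self.start_stream()
        try:
            yield self
        finally:
            self.stop_stream()  # runs on normal exit and on exceptions


fake = FakeStream()
try:
    with fake.stream():
        raise RuntimeError("processing failed")
except RuntimeError:
    pass

stopped_after_error = not fake.active
print(stopped_after_error)
```

If you call `start_stream()` directly instead, wrapping the processing code in your own `try`/`finally` gives the same guarantee.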

Development

# Install dependencies, including the VAD extra
uv sync --extra vad

# Run tests
uv run pytest

# Run example
uv run example_callback_usage.py

Release history

0.2.3: Jan 26, 2026
0.2.2: Jan 13, 2026
0.2.1: Nov 27, 2025
0.2.0: Jul 24, 2025 (this version)
0.1.0: Jul 17, 2025
