Python input audio.
input_audio
Real-time audio input with voice activity detection and noise reduction for Python.
Features
- Voice Activity Detection: Automatically detects when speech starts and stops
- Noise Reduction: Built-in noise reduction for cleaner audio
- Real-time Processing: Low-latency audio capture and processing
- Flexible Output: Save to file or return as bytes
- Easy Integration: Simple API for quick implementation
Installation
pip install input-audio
Or install from source:
git clone https://github.com/allen2c/input_audio.git
cd input_audio
pip install -e .
Usage
Basic Usage
from input_audio import input_audio
# Capture audio with a prompt
audio_bytes = input_audio("Please speak:")
Save to File
from input_audio import input_audio
# Capture and save audio
audio_bytes = input_audio(
    "Record your message:",
    output_audio_filepath="recording.wav",
)
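When recording several takes, it can help to pick a fresh file name for each one instead of overwriting recording.wav. The sketch below is one way to do that. The helper names next_recording_path and record_to_directory are hypothetical and not part of input_audio; the only library call used is input_audio with the documented output_audio_filepath parameter, which the API reference says accepts a Path.

```python
from pathlib import Path


def next_recording_path(directory: Path, stem: str = "recording") -> Path:
    """Return the first unused path of the form <stem>_NNN.wav in directory."""
    directory.mkdir(parents=True, exist_ok=True)
    index = 1
    while True:
        candidate = directory / f"{stem}_{index:03d}.wav"
        if not candidate.exists():
            return candidate
        index += 1


def record_to_directory(directory: Path, prompt: str = "Record your message:") -> Path:
    """Hypothetical wrapper: record one take into the next free file."""
    # Imported lazily so the path helper above works without audio hardware.
    from input_audio import input_audio

    target = next_recording_path(directory)
    input_audio(prompt, output_audio_filepath=target)
    return target
```

Calling record_to_directory(Path("takes")) repeatedly would then produce takes/recording_001.wav, takes/recording_002.wav, and so on, assuming the library writes the file before returning.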
Advanced Configuration
from input_audio import input_audio
# Customize noise reduction and enable verbose output
audio_bytes = input_audio(
    "Speak now:",
    enable_noise_reduction=True,
    noise_reduction_strength=0.8,  # 0.0-1.0
    verbose=True,
)
Silent Capture
from input_audio import input_audio
# Capture without prompt
audio_bytes = input_audio()
API Reference
input_audio(prompt=None, *, output_audio_filepath=None, verbose=False, enable_noise_reduction=True, noise_reduction_strength=0.8)
Parameters:
- prompt (str, optional): Text prompt to display before recording
- output_audio_filepath (str/Path, optional): Path to save the audio file
- verbose (bool): Enable detailed logging (default: False)
- enable_noise_reduction (bool): Apply noise reduction (default: True)
- noise_reduction_strength (float): Noise reduction intensity, 0.0-1.0 (default: 0.8)
Returns:
bytes: WAV audio data
Behavior:
- Automatically starts recording when speech is detected
- Stops recording after speech ends (with configurable buffer)
- Applies noise reduction and audio enhancement
- Returns high-quality 16kHz mono WAV audio
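Because the return value is WAV data in memory, the format described above can be checked with the standard library's wave module. This is a minimal sketch, assuming the returned bytes are PCM-encoded WAV (the only encoding the wave module reads). The helpers describe_wav and record_and_describe are hypothetical names, not part of input_audio.

```python
import io
import wave


def describe_wav(audio_bytes: bytes) -> dict:
    """Read the WAV header from in-memory bytes and summarize it."""
    with wave.open(io.BytesIO(audio_bytes), "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate": rate,
            "sample_width_bytes": wav.getsampwidth(),
            "duration_seconds": frames / rate,
        }


def record_and_describe(prompt: str = "Please speak:") -> dict:
    """Hypothetical wrapper: record once and report the audio format."""
    # Imported lazily so describe_wav can be used without a microphone.
    from input_audio import input_audio

    return describe_wav(input_audio(prompt))
```

For a recording that matches the behavior listed above, describe_wav would be expected to report 1 channel and a sample rate of 16000.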
Requirements
- Python 3.11+
- PyAudio (microphone access)
- PyTorch (VAD model)
- Additional dependencies: see requirements.txt
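The requirements above can be checked before the first recording. The sketch below tests the interpreter version and whether the two named dependencies are importable, using their usual import names (pyaudio for PyAudio, torch for PyTorch). The helper names are hypothetical, and the check does not cover the additional dependencies in requirements.txt.

```python
import importlib.util
import sys


def python_is_supported(version_info=sys.version_info) -> bool:
    """Return True when the interpreter is Python 3.11 or newer."""
    return tuple(version_info[:2]) >= (3, 11)


def missing_modules(names: list[str]) -> list[str]:
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def check_environment() -> list[str]:
    """Collect human-readable problems; an empty list means none were found."""
    problems = []
    if not python_is_supported():
        problems.append("Python 3.11+ is required")
    for name in missing_modules(["pyaudio", "torch"]):
        problems.append(f"module not found: {name}")
    return problems
```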
License
MIT License - see LICENSE for details.