Python bindings for the ai-coustics speech-enhancement SDK
ai-coustics SDK for Python (aic)
This repository provides prebuilt Python wheels for the ai-coustics real-time audio enhancement SDK, covering Linux, macOS, and Windows on Python 3.9 or higher. The SDK offers state-of-the-art neural network-based audio enhancement for speech processing applications.
🚀 Features
- Real-time audio enhancement using advanced neural networks
- Multiple model variants: QUAIL_L, QUAIL_S, QUAIL_XS for different performance/quality trade-offs
- Low latency processing optimized for streaming applications
- Cross-platform support: Linux, macOS, Windows
- Context manager support for automatic resource management
📦 Installation
Prerequisites
- Python 3.9 or higher
- GLIBC >= 2.28 on Linux
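The GLIBC prerequisite can be checked before installing. The following is a minimal sketch using only the standard library; `parse_version` and `glibc_is_supported` are hypothetical helper names for illustration and are not part of the SDK.

```python
import platform
import re


def parse_version(text):
    """Turn a version string such as '2.31' into a tuple of ints, e.g. (2, 31)."""
    return tuple(int(n) for n in re.findall(r"\d+", text)[:2])


def glibc_is_supported(version_text, minimum=(2, 28)):
    """Return True if version_text names a glibc version >= minimum."""
    version = parse_version(version_text)
    # Compare as integer tuples so that 2.9 is correctly older than 2.28.
    return bool(version) and version >= minimum


if __name__ == "__main__":
    # platform.libc_ver() returns ('glibc', '<version>') on glibc-based Linux
    # and empty strings when it cannot determine the C library.
    lib, version = platform.libc_ver()
    if lib == "glibc":
        print(f"glibc {version}: supported = {glibc_is_supported(version)}")
    else:
        print("No glibc detected; the GLIBC requirement applies to Linux only.")
```

Comparing integer tuples rather than strings avoids the common mistake of treating "2.9" as newer than "2.28".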
Install the SDK
pip install aic-sdk
For Development/Examples
To run the examples, install additional dependencies:
pip install -r examples/requirements.txt
🔑 License Key Setup
The SDK requires a license key for full functionality.
- Get a license key from ai-coustics
- Set an environment variable (or use a .env file):
export AICOUSTICS_API_KEY="your_license_key_here"
Or in a .env file:
AICOUSTICS_API_KEY=your_license_key_here
- Pass the key to the model (the SDK does not read env vars automatically):
import os
from dotenv import load_dotenv
from aic import Model, AICModelType
load_dotenv()  # loads .env if present
license_key = os.getenv("AICOUSTICS_API_KEY")
with Model(AICModelType.QUAIL_L, license_key=license_key) as model:
    model.initialize(sample_rate=48000, channels=1, frames=480)
    # ...
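Because the SDK does not read the environment itself, a missing variable only shows up later as reduced functionality. One way to surface it early is a small lookup helper. This is a sketch, not SDK code: `load_license_key` and its `required` flag are hypothetical names, and the `env` parameter exists only so the function can be exercised without touching the real environment.

```python
import os


def load_license_key(env=None, name="AICOUSTICS_API_KEY", required=True):
    """Return the license key from env, stripped of surrounding whitespace.

    Raises RuntimeError when the key is missing and required is True.
    """
    env = os.environ if env is None else env
    key = (env.get(name) or "").strip()
    if required and not key:
        raise RuntimeError(
            f"{name} is not set; export it or add it to a .env file "
            "and call load_dotenv() before looking it up."
        )
    return key
```

Pass `required=False` to get an empty string back instead of an exception when no key is configured.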
🎯 Quick Start
Basic Audio Enhancement
import os
import numpy as np
from dotenv import load_dotenv
from aic import Model, AICModelType, AICParameter
load_dotenv()
license_key = os.getenv("AICOUSTICS_API_KEY")
# Create model instance
model = Model(
    model_type=AICModelType.QUAIL_L,
    license_key=license_key,  # pass the key from env (empty = trial)
)
# Initialize for 48kHz mono audio with 480-frame buffers
model.initialize(sample_rate=48000, channels=1, frames=480)
# Set enhancement strength (0.0 to 1.0)
model.set_parameter(AICParameter.ENHANCEMENT_LEVEL, 0.8)
# Process audio (planar format: [channels, frames])
audio_input = np.random.randn(1, 480).astype(np.float32)
enhanced_audio = model.process(audio_input)
# Clean up
model.close()
Using Context Manager (Recommended)
import os
import numpy as np
from dotenv import load_dotenv
from aic import Model, AICModelType
load_dotenv()
license_key = os.getenv("AICOUSTICS_API_KEY", "")
with Model(AICModelType.QUAIL_L, license_key=license_key) as model:
    model.initialize(sample_rate=48000, channels=1, frames=480)
    # Process audio in chunks
    audio_chunk = np.random.randn(1, 480).astype(np.float32)
    enhanced = model.process(audio_chunk)
# Model automatically closed when exiting context
📁 Example: Enhance WAV File
The repository includes a complete example for processing WAV files:
python examples/enhance.py input.wav output.wav --strength 80
Example Usage
import os
import librosa
import numpy as np
import soundfile as sf
from dotenv import load_dotenv
from aic import Model, AICModelType, AICParameter
def enhance_wav_file(input_path, output_path, strength=80):
    # Load audio as 48 kHz mono float32
    audio, sample_rate = librosa.load(input_path, sr=48000, mono=True)
    audio = audio.reshape(1, -1)  # Convert to planar format: (channels, frames)
    # Read the license key from the environment or a .env file
    load_dotenv()
    license_key = os.getenv("AICOUSTICS_API_KEY")
    with Model(AICModelType.QUAIL_L, license_key=license_key) as model:
        model.initialize(sample_rate=48000, channels=1, frames=480)
        model.set_parameter(AICParameter.ENHANCEMENT_LEVEL, strength / 100)
        # Process in chunks
        chunk_size = 480
        output = np.zeros_like(audio)
        for i in range(0, audio.shape[1], chunk_size):
            chunk = audio[:, i:i + chunk_size]
            valid = chunk.shape[1]  # frames of real audio in this chunk
            # Zero-pad the last chunk if needed
            if valid < chunk_size:
                padded = np.zeros((1, chunk_size), dtype=audio.dtype)
                padded[:, :valid] = chunk
                chunk = padded
            enhanced_chunk = model.process(chunk)
            # Keep only the frames that correspond to real input
            output[:, i:i + valid] = enhanced_chunk[:, :valid]
    # Save result
    sf.write(output_path, output.T, sample_rate)
🔧 API Reference
For the complete, up-to-date API documentation (including class/method docs and enums), see the published site:
🎵 Audio Format Requirements
- Sample Rate: 48kHz recommended (optimal for all models)
- Format: Float32 in linear -1.0 to +1.0 range
- Layout:
  - Planar: (channels, frames) - use process()
  - Interleaved: (frames,) - use process_interleaved()
- Channels: Mono (1) or stereo (2) supported
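Audio from capture devices or WAV readers often arrives as 16-bit PCM or in the other layout, so a conversion step is usually needed first. The sketch below uses plain NumPy; the function names are hypothetical and not part of the SDK. It assumes that "interleaved" means one flat array in which the samples of each frame sit next to each other (L0, R0, L1, R1, ...); confirm the exact shape `process_interleaved()` expects in the API documentation.

```python
import numpy as np


def planar_to_interleaved(planar):
    """(channels, frames) -> flat array ordered L0, R0, L1, R1, ..."""
    planar = np.asarray(planar, dtype=np.float32)
    return np.ascontiguousarray(planar.T).reshape(-1)


def interleaved_to_planar(interleaved, channels):
    """Flat interleaved samples -> (channels, frames)."""
    interleaved = np.asarray(interleaved, dtype=np.float32)
    if interleaved.size % channels:
        raise ValueError("sample count is not a multiple of the channel count")
    return np.ascontiguousarray(interleaved.reshape(-1, channels).T)


def int16_to_float32(pcm):
    """Scale 16-bit PCM samples to float32 in the -1.0 to +1.0 range."""
    return np.asarray(pcm, dtype=np.int16).astype(np.float32) / 32768.0
```

Dividing by 32768 maps the most negative 16-bit value to exactly -1.0 and keeps every result inside the required range.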
🔄 Processing Patterns
Real-time Streaming
with Model(AICModelType.QUAIL_S) as model:
    model.initialize(sample_rate=48000, channels=1, frames=480)
    while audio_stream.has_data():
        chunk = audio_stream.get_chunk(480)  # Get 480 frames
        enhanced = model.process(chunk)
        audio_output.play(enhanced)
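The streaming loop above relies on placeholder objects (`audio_stream`, `audio_output`) and assumes every read delivers exactly 480 frames. Real capture callbacks often deliver blocks of other sizes, so a small re-chunking buffer can sit in between. `FrameBuffer` below is a hypothetical helper written for illustration, not an SDK class.

```python
import numpy as np


class FrameBuffer:
    """Collects variable-size blocks and hands out fixed-size planar chunks."""

    def __init__(self, channels=1, frames=480):
        self.frames = frames
        self._pending = np.zeros((channels, 0), dtype=np.float32)

    def push(self, block):
        """Add a (channels, n) block; return a list of complete chunks."""
        block = np.asarray(block, dtype=np.float32)
        self._pending = np.concatenate([self._pending, block], axis=1)
        chunks = []
        while self._pending.shape[1] >= self.frames:
            # Copy so each chunk owns its memory and stays contiguous.
            chunks.append(self._pending[:, :self.frames].copy())
            self._pending = self._pending[:, self.frames:]
        return chunks

    @property
    def pending_frames(self):
        """Number of frames waiting for the next complete chunk."""
        return self._pending.shape[1]
```

Each chunk returned by `push()` has shape `(channels, frames)` and can be passed to `model.process()` directly; leftover frames stay in the buffer until the next block arrives.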
Batch Processing
with Model(AICModelType.QUAIL_L) as model:
    model.initialize(sample_rate=48000, channels=1, frames=480)
    for audio_file in audio_files:
        audio = load_audio(audio_file)
        enhanced = process_in_chunks(model, audio)
        save_audio(enhanced, f"enhanced_{audio_file}")
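The batch snippet calls `process_in_chunks`, `load_audio`, and `save_audio`, none of which are defined in this README. A possible shape for `process_in_chunks` is sketched below. It assumes `model.process()` takes a `(channels, chunk_size)` float32 array and returns an array of the same shape, as in the WAV example; `HalfGain` is a stand-in used only to exercise the helper without the SDK.

```python
import numpy as np


def process_in_chunks(model, audio, chunk_size=480):
    """Run model.process() over planar audio in fixed-size chunks."""
    audio = np.asarray(audio, dtype=np.float32)
    channels, total = audio.shape
    output = np.zeros_like(audio)  # pre-allocated result
    scratch = np.zeros((channels, chunk_size), dtype=np.float32)  # reused input
    for start in range(0, total, chunk_size):
        valid = min(chunk_size, total - start)
        scratch[:, :valid] = audio[:, start:start + valid]
        scratch[:, valid:] = 0.0  # zero-pad the final partial chunk
        enhanced = model.process(scratch)
        # Copy out only the frames that correspond to real input.
        output[:, start:start + valid] = enhanced[:, :valid]
    return output


class HalfGain:
    """Stand-in for a model: halves the signal instead of enhancing it."""

    def process(self, chunk):
        return chunk * 0.5
```

Reusing one scratch buffer and one output array keeps allocations out of the loop, in line with the pre-allocation tip under Performance Tips.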
🐛 Troubleshooting
Common Issues
- "GLIBC": On Linux you need to have GLIBC >= 2.28
- "Array shape error": Ensure audio is in correct format (planar or interleaved)
- "Sample rate mismatch": Use 48kHz for optimal performance
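Shape and format problems are easier to diagnose before the block reaches the SDK. The check below is a sketch based only on the requirements listed under Audio Format Requirements (float32, planar `(channels, frames)`, samples within -1.0 to +1.0); `check_planar_input` is a hypothetical helper, and the SDK's own validation may differ.

```python
import numpy as np


def check_planar_input(audio, channels=1, frames=480):
    """Return a list of problems found in a planar block (empty if none)."""
    if not isinstance(audio, np.ndarray):
        return [f"expected numpy.ndarray, got {type(audio).__name__}"]
    problems = []
    if audio.dtype != np.float32:
        problems.append(f"dtype is {audio.dtype}, expected float32")
    if audio.shape != (channels, frames):
        problems.append(f"shape is {audio.shape}, expected {(channels, frames)}")
    if audio.size and np.issubdtype(audio.dtype, np.floating):
        if not np.all(np.isfinite(audio)):
            problems.append("contains NaN or infinite samples")
        elif np.max(np.abs(audio)) > 1.0:
            problems.append("samples exceed the -1.0 to +1.0 range")
    return problems
```

Returning a list rather than raising lets the caller log every problem at once during debugging and skip the check entirely in production loops.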
Performance Tips
- Use QUAIL_XS for applications that need lower latency
- Process in chunks of optimal_num_frames() size
- Use context manager for automatic cleanup
- Pre-allocate output arrays to avoid per-chunk memory allocation
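Chunk size translates directly into buffering delay: the examples in this README use 480 frames at 48 kHz, which is 10 ms of audio per call. The arithmetic is sketched below with two hypothetical helpers. It covers only the time needed to fill one buffer; any additional latency inside the model is not stated here and is not included.

```python
def chunk_duration_ms(frames, sample_rate):
    """Duration of one chunk in milliseconds."""
    return 1000.0 * frames / sample_rate


def frames_for_duration(duration_ms, sample_rate):
    """Number of frames that cover duration_ms at the given sample rate."""
    return round(sample_rate * duration_ms / 1000.0)


if __name__ == "__main__":
    print(chunk_duration_ms(480, 48000))   # 10.0
    print(frames_for_duration(10, 16000))  # 160
```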
| Component | License | File |
|---|---|---|
| Python wrapper (aic/*.py) | Apache-2.0 | LICENSE |
| Native SDK binaries (aic/libs/*) | Proprietary, all rights reserved | LICENSE.AIC-SDK |
🤝 Support
- Documentation: ai-coustics.com
- Issues: Report bugs and feature requests via GitHub issues
🔗 Related
- ai-coustics Website
- Documentation: Will be published via GitHub Pages. Until then, you can build and view locally:
  - Install docs deps: pip install mkdocs mkdocs-material mkdocstrings-python
  - Serve locally: mkdocs serve
  - Build static site: mkdocs build