Python bindings for the ai-coustics speech-enhancement SDK

ai-coustics SDK for Python (aic)

Python 3.10+

This repository provides prebuilt Python wheels for the ai-coustics real-time audio enhancement SDK on Linux, macOS, and Windows. The SDK offers state-of-the-art, neural-network-based audio enhancement for speech processing applications.

🚀 Features

  • Real-time audio enhancement using advanced neural networks
  • Multiple model variants: QUAIL_L, QUAIL_S, QUAIL_XS for different performance/quality trade-offs
  • Low latency processing optimized for streaming applications
  • Cross-platform support: Linux, macOS, Windows
  • Context manager support for automatic resource management

📦 Installation

Prerequisites

  • Python 3.9 or higher
  • GLIBC >= 2.28 on Linux
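
Both prerequisites can be checked from Python before installing. The following is a minimal sketch using only the standard library; the helper names `parse_version` and `meets_prerequisites` are illustrative and not part of the SDK. `platform.libc_ver()` reports the C library the interpreter is linked against and returns empty strings when it cannot detect glibc (for example on macOS or Windows), in which case the GLIBC check is skipped.

```python
import platform
import sys

MIN_PYTHON = (3, 9)   # taken from the prerequisites above
MIN_GLIBC = (2, 28)   # only relevant on glibc-based Linux


def parse_version(text):
    """Turn '2.35' into (2, 35); returns () for an empty string."""
    return tuple(int(part) for part in text.split(".") if part.isdigit())


def meets_prerequisites(python_version, libc_name, libc_version):
    """Return a list of human-readable problems (an empty list means OK)."""
    problems = []
    if tuple(python_version[:2]) < MIN_PYTHON:
        problems.append("Python 3.9 or higher is required")
    if libc_name == "glibc" and parse_version(libc_version) < MIN_GLIBC:
        problems.append("GLIBC >= 2.28 is required on Linux")
    return problems


if __name__ == "__main__":
    libc_name, libc_version = platform.libc_ver()
    for problem in meets_prerequisites(sys.version_info, libc_name, libc_version):
        print("Problem:", problem)
```

Running the script prints nothing when both requirements are satisfied.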

Install the SDK

pip install aic-sdk

For Development/Examples

To run the examples, install additional dependencies:

pip install -r examples/requirements.txt

🔑 License Key Setup

The SDK requires a license key for full functionality.

  1. Get a license key from ai-coustics
  2. Set an environment variable (or a .env file):
    export AIC_SDK_LICENSE="your_license_key_here"
    
    Or in a .env file:
    AIC_SDK_LICENSE=your_license_key_here
    
  3. Pass the key to the model (the SDK does not read env vars automatically):
    import os
    from dotenv import load_dotenv
    from aic import Model, AICModelType
    
    load_dotenv()  # loads .env if present
    license_key = os.getenv("AIC_SDK_LICENSE")
    
    with Model(AICModelType.QUAIL_L, license_key=license_key, sample_rate=48000, channels=1, frames=480) as model:
        pass  # process audio here
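
Because the SDK does not read the environment itself, a small wrapper can make a missing key fail early with a clear message. This is a sketch under assumptions: `read_license_key` is a hypothetical helper, not an SDK function, and it treats an unset or blank variable as an error, which is only appropriate when a full license is expected.

```python
import os


def read_license_key(env=None, name="AIC_SDK_LICENSE"):
    """Return the license key from the environment, or raise a clear error."""
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it or add it to a .env file")
    return key
```

Call `load_dotenv()` first if the key lives in a .env file, then pass the returned value as `license_key=` when constructing `Model`.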
    

🎯 Quick Start

Basic Audio Enhancement

import os
import numpy as np
from dotenv import load_dotenv
from aic import Model, AICModelType, AICParameter

load_dotenv()
license_key = os.getenv("AIC_SDK_LICENSE", "")  # default to an empty string if unset

# Create model instance
model = Model(
    model_type=AICModelType.QUAIL_L,
    license_key=license_key,   # pass the key from env (empty = trial)
    sample_rate=48000,
    channels=1,
    frames=480,
)

# Set enhancement strength (0.0 to 1.0)
model.set_parameter(AICParameter.ENHANCEMENT_LEVEL, 0.8)

# Process audio (planar format: [channels, frames])
audio_input = np.random.randn(1, 480).astype(np.float32)
enhanced_audio = model.process(audio_input)

# Clean up
model.close()

Using Context Manager (Recommended)

import os
import numpy as np
from dotenv import load_dotenv
from aic import Model, AICModelType

load_dotenv()
license_key = os.getenv("AIC_SDK_LICENSE", "")

with Model(AICModelType.QUAIL_L, license_key=license_key, sample_rate=48000, channels=1, frames=480) as model:
    # Process audio in chunks
    audio_chunk = np.random.randn(1, 480).astype(np.float32)
    enhanced = model.process(audio_chunk)
    # Model automatically closed when exiting context
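
The examples above use frames=480 at sample_rate=48000, which corresponds to 10 ms of audio per chunk. The arithmetic can be sketched as follows; both function names are illustrative, and whether the SDK accepts the resulting frame counts at other sample rates is an assumption to verify against the API documentation.

```python
def chunk_duration_ms(frames, sample_rate):
    """Duration of one chunk in milliseconds."""
    return 1000.0 * frames / sample_rate


def frames_for_duration(duration_ms, sample_rate):
    """Number of frames that cover the given duration at this sample rate."""
    return round(sample_rate * duration_ms / 1000.0)


print(chunk_duration_ms(480, 48000))   # 10.0
print(frames_for_duration(10, 16000))  # 160
```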

📁 Example: Enhance WAV File

The repository includes a complete example for processing WAV files:

python examples/enhance.py input.wav output.wav --strength 80

Example Usage

import os

import librosa
import numpy as np
import soundfile as sf
from dotenv import load_dotenv
from aic import Model, AICModelType, AICParameter

def enhance_wav_file(input_path, output_path, strength=80):
    # Load audio as 48 kHz mono float32
    audio, sample_rate = librosa.load(input_path, sr=48000, mono=True)
    audio = audio.reshape(1, -1)  # Convert to planar format: (channels, frames)

    # Read the license key from the environment (or a .env file)
    load_dotenv()
    license_key = os.getenv("AIC_SDK_LICENSE")

    with Model(AICModelType.QUAIL_L, license_key=license_key, sample_rate=48000, channels=1, frames=480) as model:
        # Map the 0-100 strength to the 0.0-1.0 parameter range
        model.set_parameter(AICParameter.ENHANCEMENT_LEVEL, strength / 100)

        # Process in chunks
        chunk_size = 480
        output = np.zeros_like(audio)

        for i in range(0, audio.shape[1], chunk_size):
            chunk = audio[:, i:i + chunk_size]
            valid = chunk.shape[1]  # Number of real samples in this chunk
            # Pad last chunk if needed
            if valid < chunk_size:
                padded = np.zeros((1, chunk_size), dtype=audio.dtype)
                padded[:, :valid] = chunk
                chunk = padded

            enhanced_chunk = model.process(chunk)
            # Copy back only the real samples so the padded tail is dropped
            output[:, i:i + valid] = enhanced_chunk[:, :valid]

    # Save result
    sf.write(output_path, output.T, sample_rate)

🔧 API Reference

For the complete, up-to-date API documentation (including class/method docs and enums), see the published documentation site.

🎵 Audio Format Requirements

  • Sample Rate: 8/16/48 kHz recommended
  • Format: Float32 in linear -1.0 to +1.0 range
  • Layout:
    • Planar: (channels, frames) - use process()
    • Interleaved: (frames,) - use process_interleaved()
  • Channels: Mono (1) or stereo (2) supported
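
Audio often arrives as 16-bit interleaved PCM, so it has to be converted to the float32 range and layout listed above. The sketch below uses only NumPy; the three function names are illustrative and not part of the SDK, and the exact array shape that process_interleaved() expects should be confirmed in the API documentation.

```python
import numpy as np


def pcm16_to_float32(pcm):
    """Scale int16 PCM samples to float32 in the range [-1.0, 1.0)."""
    return np.asarray(pcm, dtype=np.int16).astype(np.float32) / 32768.0


def interleaved_to_planar(samples, channels):
    """Reshape [L0, R0, L1, R1, ...] into shape (channels, frames)."""
    samples = np.asarray(samples)
    return np.ascontiguousarray(samples.reshape(-1, channels).T)


def planar_to_interleaved(planar):
    """Flatten shape (channels, frames) back to [L0, R0, L1, R1, ...]."""
    return np.ascontiguousarray(np.asarray(planar).T).reshape(-1)
```

For a stereo buffer, `interleaved_to_planar(pcm16_to_float32(raw), channels=2)` yields an array of shape (2, frames) in the planar layout described for process().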

🔄 Processing Patterns

Real-time Streaming

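# audio_stream and audio_output are placeholders for your own I/O objects;
# license_key is loaded as shown in "License Key Setup".
with Model(AICModelType.QUAIL_S, license_key=license_key, sample_rate=48000, channels=1, frames=480) as model:

    while audio_stream.has_data():
        chunk = audio_stream.get_chunk(480)  # Get 480 frames as float32, shape (1, 480)
        enhanced = model.process(chunk)
        audio_output.play(enhanced)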
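
Real audio callbacks rarely deliver exactly 480 frames at a time. One way to bridge the gap is to buffer incoming samples and hand out fixed-size chunks; the sketch below assumes mono input, and `FrameAccumulator` is a hypothetical helper, not an SDK class.

```python
import numpy as np


class FrameAccumulator:
    """Collects variable-size mono blocks and yields fixed-size (1, frames) chunks."""

    def __init__(self, frames=480):
        self.frames = frames
        self._pending = np.zeros(0, dtype=np.float32)

    def push(self, block):
        """Add samples and return a list of complete (1, frames) chunks."""
        block = np.asarray(block, dtype=np.float32).reshape(-1)
        self._pending = np.concatenate([self._pending, block])
        chunks = []
        while self._pending.size >= self.frames:
            # Copy so later buffer updates cannot alias a returned chunk
            chunks.append(self._pending[: self.frames].reshape(1, -1).copy())
            self._pending = self._pending[self.frames:]
        return chunks
```

Each chunk returned by `push()` can be passed to `model.process()`; leftover samples stay buffered until the next call. The concatenation allocates on every push, so a ring buffer would be a better fit where allocation in the audio thread matters.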

Batch Processing

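# load_audio, process_in_chunks, and save_audio are placeholders for your own helpers;
# license_key is loaded as shown in "License Key Setup".
with Model(AICModelType.QUAIL_L, license_key=license_key, sample_rate=48000, channels=1, frames=480) as model:

    for audio_file in audio_files:
        audio = load_audio(audio_file)
        enhanced = process_in_chunks(model, audio)
        save_audio(enhanced, f"enhanced_{audio_file}")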
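
The `process_in_chunks` helper is not defined in this README. One possible shape for it is sketched below, assuming planar float32 input and a model object whose `process()` takes and returns an array of shape (channels, frames); it pre-allocates the output and a reusable padded buffer so the loop does no per-chunk allocation of its own.

```python
import numpy as np


def process_in_chunks(model, audio, frames=480):
    """Run model.process() over planar audio of shape (channels, n) in fixed chunks."""
    channels, total = audio.shape
    output = np.zeros_like(audio)  # allocated once for the whole file
    padded = np.zeros((channels, frames), dtype=audio.dtype)
    for start in range(0, total, frames):
        valid = min(frames, total - start)  # real samples in this chunk
        padded[:, :valid] = audio[:, start:start + valid]
        padded[:, valid:] = 0.0  # zero-pad the final, shorter chunk
        enhanced = model.process(padded)
        output[:, start:start + valid] = enhanced[:, :valid]
    return output
```

The function only relies on `model.process()`, so it can be exercised with a stand-in object in unit tests that do not load the native SDK.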

🧑‍💻 Development

Setup

python -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate
pip install -r requirements-dev.txt  # includes editable install (-e .)

Pre-commit hooks (Ruff)

pre-commit install
pre-commit run --all-files

This runs Ruff linting and formatting on commit. You can also run Ruff manually:

ruff check . --fix
ruff format .

Running tests

  • Unit tests (no native SDK required):
    pytest -q
  • Integration tests (real SDK + license required):
    export AIC_SDK_LICENSE="your_key"  # or use a .env file
    pytest -q integration_tests

Note: To run against the real native SDK locally, it is simplest to install the package non-editable so the platform binaries are bundled:

pip uninstall -y aic-sdk
pip install .

Editable installs (-e .) do not place native binaries into the source tree. The unit test suite does not need them; the integration suite does.

Docs (MkDocs)

mkdocs serve   # live-reload docs at http://127.0.0.1:8000
# or
mkdocs build

Versioning

  • Python wrapper version: pyproject.toml[project].version
  • C SDK binary version: pyproject.toml[tool.aic-sdk].sdk-version

The Python version and the underlying C SDK version are intentionally decoupled. The build step downloads platform binaries named from sdk-version.
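
To see both versions side by side, the two keys named above can be read from pyproject.toml. This is a minimal sketch, assuming both values are stored as plain strings under exactly those keys; `read_versions` is an illustrative name. `tomllib` is in the standard library from Python 3.11, and the `tomli` backport offers the same `loads()` function on older interpreters.

```python
try:
    import tomllib  # standard library on Python 3.11+
except ModuleNotFoundError:
    import tomli as tomllib  # third-party backport with the same API


def read_versions(pyproject_text):
    """Return (wrapper_version, sdk_version) from pyproject.toml contents."""
    data = tomllib.loads(pyproject_text)
    wrapper = data["project"]["version"]
    sdk = data["tool"]["aic-sdk"]["sdk-version"]
    return wrapper, sdk
```

Typical use from the repository root: `read_versions(open("pyproject.toml", encoding="utf-8").read())`.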

🐛 Troubleshooting

Common Issues

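  1. "GLIBC": On Linux, the native binaries require GLIBC >= 2.28
  2. "Array shape error": Ensure the audio array matches the layout the method expects: planar (channels, frames) for process(), interleaved for process_interleaved()
  3. "Sample rate mismatch": Make sure the audio matches the sample_rate passed to Model; use 48 kHz for optimal performance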
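
Shape and dtype problems are easier to diagnose when they are caught before the array reaches the native library. The check below is a sketch for the planar case only; `check_planar_chunk` is a hypothetical helper, and the error messages are its own, not the SDK's.

```python
import numpy as np


def check_planar_chunk(chunk, channels=1, frames=480):
    """Raise ValueError with a specific message if the chunk is malformed."""
    if not isinstance(chunk, np.ndarray):
        raise ValueError("expected a NumPy array")
    if chunk.dtype != np.float32:
        raise ValueError(f"expected float32, got {chunk.dtype}")
    if chunk.shape != (channels, frames):
        raise ValueError(f"expected shape {(channels, frames)}, got {chunk.shape}")
    return chunk
```

Because it returns the chunk unchanged, it can wrap the argument directly: `model.process(check_planar_chunk(chunk))`.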

Performance Tips

  • Use QUAIL_XS for applications that need lower latency
  • Process in chunks of optimal_num_frames() size
  • Use context manager for automatic cleanup
  • Pre-allocate output arrays to avoid repeated memory allocation inside the processing loop

License

Component                          License                            File
Python wrapper (aic/*.py)          Apache-2.0                         LICENSE
Native SDK binaries (aic/libs/*)   Proprietary, all rights reserved   LICENSE.AIC-SDK

🤝 Support

  • Documentation: ai-coustics.com
  • Issues: Report bugs and feature requests via GitHub issues

🔗 Related

  • ai-coustics Website
  • Documentation: Will be published via GitHub Pages. Until then, you can build and view locally:
    • Install docs deps: pip install mkdocs mkdocs-material mkdocstrings-python
    • Serve locally: mkdocs serve
    • Build static site: mkdocs build
