Audio Utils
ix
To install: pip install ix
Description
The ix package provides a collection of utilities for audio processing, with a focus on voice and music analysis and synthesis. It includes functions for spectral analysis, voice activity detection, formant estimation, and audio compression, built on techniques such as LPC (Linear Predictive Coding) and MDCT (Modified Discrete Cosine Transform).
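As a rough illustration of what LPC involves, the sketch below estimates predictor coefficients with the autocorrelation method and the Levinson-Durbin recursion. It uses only the standard library; `autocorrelation` and `lpc_coefficients` are hypothetical names made up for this example, not functions exported by ix, and the package's own LPC routine may work differently.

```python
def autocorrelation(x, max_lag):
    """Unnormalized autocorrelation r[0..max_lag] of a list of floats."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def lpc_coefficients(x, order):
    """Return (a, err), where x[n] is predicted as sum(a[k] * x[n-1-k])."""
    r = autocorrelation(x, order)
    a = [0.0] * order
    err = r[0]
    for i in range(order):
        if err <= 0.0:
            break  # silent or perfectly predicted input: nothing left to fit
        # Reflection coefficient for model order i + 1.
        acc = r[i + 1] - sum(a[j] * r[i - j] for j in range(i))
        k = acc / err
        updated = a[:]
        updated[i] = k
        for j in range(i):
            updated[j] = a[j] - k * a[i - 1 - j]
        a = updated
        err *= 1.0 - k * k
    return a, err


# A decaying exponential obeys x[n] = 0.9 * x[n-1], so an order-1 fit
# should recover a coefficient close to 0.9.
signal = [0.9 ** n for n in range(200)]
coeffs, residual = lpc_coefficients(signal, order=1)
```

Formant estimation would go one step further and find the roots of the prediction polynomial, which is left out here to keep the sketch short.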
Main Features
- Voice Activity Detection (VAD): Detects voiced and unvoiced segments in an audio signal.
- Spectral Analysis: Functions for computing the Short-Time Fourier Transform (STFT) and its inverse, as well as other spectral transformations.
- Formant Analysis: Estimation of formant frequencies using LPC coefficients.
- Audio Compression: Implements DCT-based audio compression and decompression.
- Pitch Tracking: Implements the Harvest algorithm for robust pitch tracking.
- Feature Extraction: Converts speech waveforms to mel-generalized cepstral coefficients (MGC), which are useful in voice synthesis.
- Audio Synthesis: Reconstructs audio from spectral and cepstral representations.
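To make the DCT-based compression feature more concrete, here is a minimal sketch of the general idea: transform a frame, keep only the largest coefficients, and invert. It is standard-library Python for illustration only; `dct`, `idct`, and `compress` are hypothetical helpers, not the ix API, and the package may use a different transform variant (for example MDCT) or a different coefficient-selection rule.

```python
import math


def dct(x):
    """Orthonormal DCT-II of a list of floats."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        out.append(scale * sum(x[t] * math.cos(math.pi * (t + 0.5) * k / n)
                               for t in range(n)))
    return out


def idct(coeffs):
    """Inverse of dct() above (an orthonormal DCT-III)."""
    n = len(coeffs)
    out = []
    for t in range(n):
        out.append(sum(math.sqrt((1.0 if k == 0 else 2.0) / n) * coeffs[k]
                       * math.cos(math.pi * (t + 0.5) * k / n)
                       for k in range(n)))
    return out


def compress(x, keep):
    """Zero all but the `keep` largest-magnitude DCT coefficients."""
    coeffs = dct(x)
    ranked = sorted(range(len(coeffs)), key=lambda k: abs(coeffs[k]), reverse=True)
    kept = set(ranked[:keep])
    return [c if k in kept else 0.0 for k, c in enumerate(coeffs)]


# A frame that is exactly one DCT basis function survives with one coefficient.
frame = [math.cos(math.pi * (t + 0.5) * 3 / 32) for t in range(32)]
sparse = compress(frame, keep=1)
restored = idct(sparse)
```

Real audio is not this sparse, so the number of coefficients kept trades file size against reconstruction error.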
Usage Examples
Voice Activity Detection
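from scipy.io import wavfile
from ix import ltsd_vad
# Read the sample rate and samples, then run voice activity detection.
fs, audio = wavfile.read('path_to_audio.wav')
vad_audio, vad_segments = ltsd_vad(audio, fs)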
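For readers who want a feel for what frame-level voice activity detection does, here is a minimal sketch based on short-time energy. It is a deliberately simpler stand-in and not the method behind `ltsd_vad`; the name `energy_vad`, the frame length, and the threshold are assumptions chosen for illustration.

```python
import math


def energy_vad(samples, frame_len=160, threshold_db=-30.0):
    """Return one flag per frame: True if its energy is within
    `threshold_db` of the loudest frame in the signal."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energies.append(sum(s * s for s in frame) / frame_len)
    peak = max(energies, default=0.0)
    if peak <= 0.0:
        return [False] * len(energies)
    floor = peak * 10.0 ** (threshold_db / 10.0)
    return [e > floor for e in energies]


# Synthetic test signal: silence, a 200 Hz tone, then silence again.
fs = 8000
tone = [math.sin(2.0 * math.pi * 200.0 * n / fs) for n in range(800)]
signal = [0.0] * 800 + tone + [0.0] * 800
flags = energy_vad(signal)
```

A relative threshold like this one fails on recordings with steady background noise, which is the kind of case spectral methods are designed to handle better.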
Spectral Analysis
from scipy.io import wavfile
from ix import stft, istft
fs, audio = wavfile.read('path_to_audio.wav')
# Use the same FFT size for the forward and inverse transforms.
spectrogram = stft(audio, fftsize=512)
reconstructed_audio = istft(spectrogram, fftsize=512)
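The round trip above relies on overlap-add reconstruction. The following toy version shows the principle with a plain DFT, a periodic Hann window, and 50% overlap. It is a sketch under those assumptions, written with the standard library only; `toy_stft` and `toy_istft` are hypothetical names, and the window, hop size, and edge handling used by ix's `stft` and `istft` may differ.

```python
import cmath
import math


def dft(frame):
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spec):
    n = len(spec)
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
            for t in range(n)]


def toy_stft(x, fftsize=16):
    """Windowed DFT frames with a hop of fftsize // 2."""
    hop = fftsize // 2
    # Periodic Hann window: copies shifted by half a frame sum to exactly 1.
    window = [0.5 - 0.5 * math.cos(2.0 * math.pi * i / fftsize) for i in range(fftsize)]
    return [dft([x[s + i] * window[i] for i in range(fftsize)])
            for s in range(0, len(x) - fftsize + 1, hop)]


def toy_istft(frames, fftsize=16):
    """Overlap-add the inverse DFT of each frame."""
    hop = fftsize // 2
    out = [0.0] * (hop * (len(frames) - 1) + fftsize)
    for m, spec in enumerate(frames):
        for i, value in enumerate(idft(spec)):
            out[m * hop + i] += value
    return out


x = [math.sin(0.3 * n) + 0.5 * math.cos(1.1 * n) for n in range(64)]
frames = toy_stft(x)
y = toy_istft(frames)
```

In this sketch only samples covered by two overlapping frames are reconstructed exactly; the first and last half frame are attenuated by the window unless the input is padded.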
Pitch Tracking
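from scipy.io import wavfile
from ix import harvest
fs, audio = wavfile.read('path_to_audio.wav')
# Estimate the fundamental frequency (f0) contour and voicing decisions.
temporal_positions, f0, vuv, f0_candidates = harvest(audio, fs)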
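Harvest itself is an elaborate algorithm, so as a much simpler point of comparison here is a single-frame autocorrelation pitch estimator. It is not Harvest and not part of ix; `autocorr_pitch` and its search range are assumptions for illustration.

```python
import math


def autocorr_pitch(frame, fs, fmin=50.0, fmax=500.0):
    """Return the f0 (Hz) whose lag maximizes the autocorrelation,
    or 0.0 if no lag in the search range has positive correlation."""
    lo = max(1, int(fs / fmax))
    hi = min(int(fs / fmin), len(frame) - 1)
    best_lag, best_value = 0, 0.0
    for lag in range(lo, hi + 1):
        value = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if value > best_value:
            best_lag, best_value = lag, value
    return fs / best_lag if best_lag else 0.0


# A 200 Hz sine sampled at 8 kHz has a period of exactly 40 samples.
fs = 8000
frame = [math.sin(2.0 * math.pi * 200.0 * n / fs) for n in range(400)]
f0_estimate = autocorr_pitch(frame, fs)
```

An estimator this simple is prone to octave errors on real speech, which is part of the motivation for more robust trackers.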
Feature Extraction
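from scipy.io import wavfile
from ix import sp2mgc, stft
fs, audio = wavfile.read('path_to_audio.wav')
spectrogram = stft(audio, fftsize=1024)
# order: cepstral order; alpha: frequency warping; gamma: generalization parameter.
mgc_coefficients = sp2mgc(spectrogram, order=20, alpha=0.35, gamma=-0.41)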
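In mel-generalized cepstral analysis, `alpha` conventionally sets the amount of frequency warping applied by a first-order all-pass filter. Assuming `sp2mgc` follows that convention (the source does not say), the sketch below evaluates the warping curve for `alpha=0.35`. The helper name `warp_frequency` is hypothetical and is not part of ix.

```python
import math


def warp_frequency(omega, alpha):
    """Warped frequency beta(omega): the negated phase of the all-pass
    filter (z^-1 - alpha) / (1 - alpha * z^-1) at z = exp(j * omega)."""
    return omega + 2.0 * math.atan2(alpha * math.sin(omega),
                                    1.0 - alpha * math.cos(omega))


alpha = 0.35
grid = [math.pi * i / 8 for i in range(9)]          # 0 .. pi in radians/sample
warped = [warp_frequency(w, alpha) for w in grid]
```

For positive `alpha` the curve maps 0 to 0 and pi to pi while stretching the low-frequency region, which is how it approximates a perceptual frequency scale.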
Audio Synthesis
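from scipy.io import wavfile
from ix import harvest, cheaptrick, world_synthesis
fs, audio = wavfile.read('path_to_audio.wav')
# Analysis: pitch contour first, then the spectral envelope at the same time positions.
temporal_positions, f0, vuv, f0_candidates = harvest(audio, fs)
temporal_positions, spectrogram, fs = cheaptrick(audio, fs, temporal_positions, f0, vuv)
# Synthesis: rebuild a waveform from the pitch, voicing, and envelope.
synthesized_audio = world_synthesis(f0, vuv, spectrogram, fs)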
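To show how a frame-level f0 contour and voicing flags can drive a waveform at all, here is a heavily reduced sketch that turns them into a sine by phase accumulation. It ignores the spectral envelope entirely and is not how `world_synthesis` works; `sine_from_f0` and the `hop` parameter (samples per frame) are assumptions made for this example.

```python
import math


def sine_from_f0(f0_frames, vuv_frames, fs, hop):
    """Hold each frame's f0 for `hop` samples and integrate it into a
    running phase; unvoiced frames produce silence."""
    out = []
    phase = 0.0
    for f0, voiced in zip(f0_frames, vuv_frames):
        for _ in range(hop):
            if voiced:
                out.append(math.sin(phase))
                phase += 2.0 * math.pi * f0 / fs
            else:
                out.append(0.0)
    return out


# Four voiced frames at a constant 100 Hz, 40 samples per frame at 8 kHz.
fs = 8000
wave = sine_from_f0([100.0] * 4, [True] * 4, fs, hop=40)
```

Accumulating phase rather than computing `sin(2 * pi * f0 * t)` per frame keeps the waveform continuous when f0 changes between frames.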
Documentation
Each function in the ix package includes a detailed docstring with an explanation of its parameters, return values, and an example of how to use it. For further details, refer to the docstrings in the source code.
Download files
File details
Details for the file ix-0.0.7.tar.gz.
File metadata
- Download URL: ix-0.0.7.tar.gz
- Upload date:
- Size: 44.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.10.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | cd5f7c63b98ffec903313ea4eb927e0087107a1a54d76233cb6d8376090289ca |
| MD5 | 24b67a861a876932afda671d8a5b133f |
| BLAKE2b-256 | b47abd6f465cd29c62e5e5283a52dba6407ea992aea5df17541fd452ebd5b3d6 |
File details
Details for the file ix-0.0.7-py3-none-any.whl.
File metadata
- Download URL: ix-0.0.7-py3-none-any.whl
- Upload date:
- Size: 43.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.10.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 9b72961c5803ff297a73c62bac562711c23c762be06303bbb115e00d9a24185c |
| MD5 | 1282d6876ef1dd2aca1042693cf06bc4 |
| BLAKE2b-256 | 9e2977ff11f790f973cf1fa948a08b6aa3e5be137f1793069d9ad8d27847fae4 |
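If you download one of these files by hand, you can check it against the published SHA256 digest before installing. The helper below is a small sketch using Python's standard `hashlib`; the name `sha256_matches` is made up for this example.

```python
import hashlib


def sha256_matches(path, expected_hex, chunk_size=65536):
    """Return True if the file's SHA256 digest equals `expected_hex`."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_hex.strip().lower()
```

For the source distribution listed above, the call would be `sha256_matches("ix-0.0.7.tar.gz", "cd5f7c63b98ffec903313ea4eb927e0087107a1a54d76233cb6d8376090289ca")`, assuming the file sits in the current directory.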