a simple audio feature extraction tool
Audio Feature Extractor
A simple and extensible tool for extracting audio features, designed for speech and audio experiments.
Features
- Extracts traditional audio features (pitch, timbre, loudness, etc.)
- Supports English and Chinese G2P (grapheme-to-phoneme) conversion
- Extracts embedding and semantic features (e.g., speaker embeddings, ASR)
- Modular design for easy extension
Installation
- Clone the repository:
  git clone <your-repo-url>
  cd audio_feature_extractor
- Install dependencies:
  pip install .
Note: For GPU support, install the PyTorch build that matches your CUDA version before running the pip install step. See PyTorch Get Started for details.
Usage
from audio_feature_extractor import FeatureExtractor
extractor = FeatureExtractor()
audio_path = "tmp/test.wav"
features = extractor.extract_features(audio_path)
print(features)
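To process more than one file, the same call can be wrapped in a small loop. The sketch below is an illustration only: extract_directory is a hypothetical helper, not part of this package, and it makes no assumption about what extract_features returns beyond passing the value through. The extraction function is taken as an argument so the helper can be tried with any callable.

```python
from pathlib import Path
from typing import Any, Callable, Dict


def extract_directory(
    extract: Callable[[str], Any],
    audio_dir: str,
    pattern: str = "*.wav",
) -> Dict[str, Any]:
    """Run `extract` on every file in `audio_dir` whose name matches `pattern`.

    Hypothetical helper for illustration; returns {file name: extract(path)}.
    """
    results: Dict[str, Any] = {}
    # sorted() gives a stable, reproducible processing order
    for path in sorted(Path(audio_dir).glob(pattern)):
        results[path.name] = extract(str(path))
    return results


# Usage with this package (assumes it is installed and tmp/ holds WAV files):
#   from audio_feature_extractor import FeatureExtractor
#   extractor = FeatureExtractor()
#   all_features = extract_directory(extractor.extract_features, "tmp")
#   for name, features in all_features.items():
#       print(name, features)
```

Passing the bound method extractor.extract_features keeps a single FeatureExtractor instance alive across files, which avoids re-creating it for every file.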
Project Structure
src/audio_feature_extractor/
    extractor.py
    features/
        traditional.py
        embedding.py
        semantic.py
    g2p/
Notes
- This tool is intended for research and experimental use.
- For large models or neural network weights, see the documentation for download instructions.
- If you encounter CUDA/cuDNN or dependency issues, please refer to the Troubleshooting section below.
Troubleshooting
- CUDA/cuDNN errors:
  Make sure your environment variables (e.g., LD_LIBRARY_PATH) are set correctly and that you have installed the correct CUDA/cuDNN versions.
- PyTorch version:
  Install the correct PyTorch version for your hardware and CUDA version before installing this package.
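When checking the two points above, it can help to print what the current environment actually reports. The following is a minimal sketch, not part of this package: cuda_diagnostics is a hypothetical helper that reads LD_LIBRARY_PATH and, if PyTorch can be imported, the CUDA and cuDNN versions PyTorch was built with.

```python
import importlib.util
import os
from typing import Dict, Optional


def cuda_diagnostics() -> Dict[str, Optional[str]]:
    """Collect settings that commonly explain CUDA/cuDNN errors (illustrative helper)."""
    info: Dict[str, Optional[str]] = {
        "LD_LIBRARY_PATH": os.environ.get("LD_LIBRARY_PATH"),
        "torch_version": None,
        "torch_cuda_build": None,
        "cudnn_version": None,
        "cuda_available": None,
    }
    if importlib.util.find_spec("torch") is None:
        return info  # PyTorch is not installed; torch fields stay None
    try:
        import torch
    except (ImportError, OSError):
        return info  # PyTorch is present but failed to load (e.g. missing shared libraries)
    info["torch_version"] = str(torch.__version__)
    info["torch_cuda_build"] = torch.version.cuda  # None on CPU-only builds
    cudnn = torch.backends.cudnn.version()  # None if cuDNN is unavailable
    info["cudnn_version"] = None if cudnn is None else str(cudnn)
    info["cuda_available"] = str(torch.cuda.is_available())
    return info


for key, value in cuda_diagnostics().items():
    print(f"{key}: {value}")
```

If torch_cuda_build is None, the installed PyTorch is a CPU-only build; if it is set but cuda_available is False, the driver or library path is the more likely culprit.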
License
Changelog
v0.1.0
- Initial release: basic usage for English speech feature extraction.
Feel free to contribute or open issues!