Voice Acoustic Analyzer - Professional audio metrics extraction
Audio Metrics CLI
Cross-platform audio analysis toolkit for speech metrics extraction
Quick Start
Installation
# Install from PyPI (recommended)
pip install audio-metrics-cli
# Or install from source (development)
git clone https://github.com/i-whimsy/audio-metrics-cli.git
cd audio-metrics-cli
pip install -e ".[dev]"
Basic Usage
# Analyze audio file
audio-metrics analyze your_audio.wav --output result.json
# With verbose output
audio-metrics analyze audio.mp3 --verbose --show-progress
# Transcribe only
audio-metrics transcribe audio.m4a -o transcript.txt
# Compare two audio files
audio-metrics compare v1.wav v2.wav
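For scripted use, the `analyze` command can also be driven from Python. The following is a minimal sketch, assuming the `audio-metrics` executable is on `PATH` and that the file written to `--output` is a JSON document. The helper names `build_analyze_command` and `run_analyze` are hypothetical and are not part of the package.

```python
import json
import subprocess
from pathlib import Path


def build_analyze_command(audio_path, output_path, verbose=False):
    """Build the argument list for `audio-metrics analyze` (hypothetical helper)."""
    cmd = ["audio-metrics", "analyze", str(audio_path), "--output", str(output_path)]
    if verbose:
        cmd.append("--verbose")
    return cmd


def run_analyze(audio_path, output_path="result.json", verbose=False):
    """Run the CLI and return the parsed result.

    Raises subprocess.CalledProcessError if the command exits with an error.
    """
    subprocess.run(build_analyze_command(audio_path, output_path, verbose), check=True)
    # Assumption: the CLI writes a JSON document to the path given by --output.
    return json.loads(Path(output_path).read_text(encoding="utf-8"))
```

Calling `run_analyze("your_audio.wav")` would then return the parsed metrics as a dictionary, provided the assumptions above hold.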
Features
Core Metrics
- Audio Information: Duration, sample rate, file size
- Voice Activity Detection: Speech/silence segmentation
- Speech-to-Text: Whisper-powered transcription
- Prosody Analysis: Pitch, energy, speech rate
- Emotion Recognition: Emotional state detection (optional)
- Filler Word Detection: "um", "uh", "like" detection
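To make the filler word metric concrete, here is a minimal sketch of how a count and a per-100-words rate could be derived from a transcript. It is an illustration only: the word list is limited to the three fillers named above, the function name `filler_stats` is hypothetical, and the package's own detector may work differently.

```python
import re

# Fillers named in the feature list; a real detector may use a longer list.
FILLERS = {"um", "uh", "like"}


def filler_stats(transcript):
    """Count filler words and express them per 100 words (hypothetical helper)."""
    # Lowercase and split into simple word tokens, ignoring punctuation.
    words = re.findall(r"[a-z']+", transcript.lower())
    count = sum(1 for word in words if word in FILLERS)
    rate = 100.0 * count / len(words) if words else 0.0
    return {"filler_word_count": count, "fillers_per_100_words": round(rate, 1)}


print(filler_stats("Um, I think, uh, like this"))
```

A plain word match like this also counts non-filler uses of "like" (as in "I like it"), so the result is an upper bound rather than an exact figure.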
Supported Formats
- WAV
- MP3
- M4A
- FLAC
- OGG
Cross-Platform
- Windows
- macOS
- Linux
Documentation
Command Line Interface
analyze - Full Analysis
audio-metrics analyze audio.wav [OPTIONS]
Options:
  -o, --output PATH   Output JSON file path
  -c, --config PATH   Configuration file
  -m, --model TEXT    Whisper model (tiny/base/small/medium/large)
  --no-emotion        Skip emotion analysis
  --show-progress     Show progress bars
  -v, --verbose       Verbose output
  --help              Show this message
transcribe - Speech to Text
audio-metrics transcribe audio.wav [OPTIONS]
Options:
  -o, --output PATH   Output transcript file
  -m, --model TEXT    Whisper model
  --language TEXT     Language code
  --help              Show this message
compare - Compare Audio Files
audio-metrics compare audio1.wav audio2.wav [OPTIONS]
Options:
  --format TEXT       Output format (text/json/markdown)
  --help              Show this message
Output Example
{
  "audio_info": {
    "duration_seconds": 185.2,
    "sample_rate": 44100,
    "file_size_mb": 2.8
  },
  "vad_analysis": {
    "speech_ratio": 0.81,
    "pause_count": 23,
    "avg_pause_duration": 1.1
  },
  "speech_metrics": {
    "words_total": 820,
    "words_per_minute": 266
  },
  "prosody_metrics": {
    "pitch_mean_hz": 145.3,
    "energy_cv": 0.33
  },
  "filler_metrics": {
    "filler_word_count": 18,
    "fillers_per_100_words": 2.2
  }
}
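The rate fields in this example can be recomputed from the raw counts, which is a useful sanity check when post-processing results. The sketch below assumes that `words_per_minute` is taken over the full file duration and that `fillers_per_100_words` is relative to `words_total`. The example values are consistent with both assumptions, but the package's actual formulas are not documented here, and `recompute_rates` is a hypothetical helper.

```python
import json

# Subset of the example output above, containing only the fields used here.
EXAMPLE = """
{
  "audio_info": {"duration_seconds": 185.2},
  "speech_metrics": {"words_total": 820, "words_per_minute": 266},
  "filler_metrics": {"filler_word_count": 18, "fillers_per_100_words": 2.2}
}
"""


def recompute_rates(result):
    """Recompute the rate fields from raw counts (hypothetical helper)."""
    minutes = result["audio_info"]["duration_seconds"] / 60.0
    words = result["speech_metrics"]["words_total"]
    fillers = result["filler_metrics"]["filler_word_count"]
    return {
        # Assumption: the rate is taken over total duration, not speech-only time.
        "words_per_minute": round(words / minutes) if minutes > 0 else 0,
        "fillers_per_100_words": round(100.0 * fillers / words, 1) if words else 0.0,
    }


result = json.loads(EXAMPLE)
rates = recompute_rates(result)
print(rates)
```

For the example values this gives 266 words per minute and 2.2 fillers per 100 words, matching the reported fields.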
Configuration
Create a config.json file:
{
  "models": {
    "speech_to_text": {
      "provider": "whisper",
      "model": "base",
      "device": "auto"
    },
    "vad": {
      "provider": "silero",
      "threshold": 0.5
    }
  },
  "audio_analysis": {
    "enable_pitch": true,
    "enable_energy": true,
    "enable_pause": true
  },
  "features": {
    "enable_emotion": true,
    "skip_if_too_long": 3600
  }
}
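When generating config files programmatically, it can help to start from the example above, override only the keys that differ, and check the result before passing it to `--config`. The following is a sketch under stated assumptions: the helpers `merge_config` and `validate_config` are hypothetical, the base values are simply copied from this README (they may not match the package's built-in defaults), and treating the VAD threshold as a value between 0 and 1 is an assumption.

```python
from copy import deepcopy

# Values copied from the config.json example above (not necessarily the
# package's built-in defaults).
EXAMPLE_CONFIG = {
    "models": {
        "speech_to_text": {"provider": "whisper", "model": "base", "device": "auto"},
        "vad": {"provider": "silero", "threshold": 0.5},
    },
    "audio_analysis": {"enable_pitch": True, "enable_energy": True, "enable_pause": True},
    "features": {"enable_emotion": True, "skip_if_too_long": 3600},
}

# Model names listed for the --model option of `analyze`.
WHISPER_MODELS = {"tiny", "base", "small", "medium", "large"}


def merge_config(base, override):
    """Return a copy of `base` with `override` applied recursively."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


def validate_config(cfg):
    """Raise ValueError for values outside the ranges assumed in this sketch."""
    model = cfg["models"]["speech_to_text"]["model"]
    if model not in WHISPER_MODELS:
        raise ValueError(f"unknown Whisper model: {model!r}")
    threshold = cfg["models"]["vad"]["threshold"]
    # Assumption: the VAD threshold is a speech probability in [0, 1].
    if not 0.0 <= threshold <= 1.0:
        raise ValueError(f"VAD threshold out of range: {threshold!r}")
    return cfg


custom = validate_config(merge_config(EXAMPLE_CONFIG, {"models": {"vad": {"threshold": 0.7}}}))
```

The merged dictionary can then be written out with `json.dump(custom, f, indent=2)` and supplied through `--config`.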
Development
Setup Development Environment
# Clone repository
git clone https://github.com/i-whimsy/audio-metrics-cli.git
cd audio-metrics-cli
# Install with dev dependencies
pip install -e ".[dev]"
# Run tests
pytest tests/
# Format code
black src/
ruff check src/
Project Structure
audio-metrics-cli/
├── src/
│   └── audio_metrics/
│       ├── cli.py              # CLI entry point
│       ├── config.py           # Configuration
│       └── modules/            # Core modules
│           ├── audio_loader.py
│           ├── vad_analyzer.py
│           ├── speech_to_text.py
│           ├── prosody_analyzer.py
│           ├── emotion_analyzer.py
│           ├── filler_detector.py
│           ├── metrics_builder.py
│           └── json_exporter.py
├── tests/
├── examples/
├── pyproject.toml
└── README.md
Contributing
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
License
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgments
- OpenAI Whisper - Speech-to-text
- Silero VAD - Voice activity detection
- Librosa - Audio analysis
- SpeechBrain - Emotion recognition
Support
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Email: clawbot@openclaw.ai
Built with ❤️ by OpenClaw Team
File details
Details for the file audio_metrics_cli-0.3.1.tar.gz.
File metadata
- Download URL: audio_metrics_cli-0.3.1.tar.gz
- Upload date:
- Size: 37.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 031362a5feef0251ea2f259c6fdb9f6c2f0becc6b2a40fb1083b08e821994e3c |
| MD5 | 30353eefe31ceb66c212860381b72947 |
| BLAKE2b-256 | db16c576d594bcffedbabaf8a8ad83e59d6ef28a323ca94c5be537e84c3bc7ff |
File details
Details for the file audio_metrics_cli-0.3.1-py3-none-any.whl.
File metadata
- Download URL: audio_metrics_cli-0.3.1-py3-none-any.whl
- Upload date:
- Size: 42.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 8ca5a64d24586b3d38cf2946dcbe648f804c50355a39dc7f315486fbecd65680 |
| MD5 | da9958bdec6a6ccb65c2e55144dcb3c8 |
| BLAKE2b-256 | b1115713a3c92a37aaa9f6e0d1ba1ae91e25da53ee2c267177a8c73870040b6b |