Audio Metrics CLI

🎙️ Cross-platform audio analysis toolkit for speech metrics extraction

🚀 Quick Start

Installation

# Install from PyPI (recommended)
pip install audio-metrics-cli

# Or install from source (development)
git clone https://github.com/i-whimsy/audio-metrics-cli.git
cd audio-metrics-cli
pip install -e ".[dev]"

Basic Usage

# Analyze audio file
audio-metrics analyze your_audio.wav --output result.json

# With verbose output
audio-metrics analyze audio.mp3 --verbose --show-progress

# Transcribe only
audio-metrics transcribe audio.m4a -o transcript.txt

# Compare two audio files
audio-metrics compare v1.wav v2.wav

📦 Features

Core Metrics

🎵 Audio Information: Duration, sample rate, file size
🗣️ Voice Activity Detection: Speech/silence segmentation
📝 Speech-to-Text: Whisper-powered transcription
🎼 Prosody Analysis: Pitch, energy, speech rate
😊 Emotion Recognition: Emotional state detection (optional)
🔤 Filler Word Detection: Counts fillers such as "um", "uh", and "like"
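
To make the filler metric concrete, here is a minimal sketch of how fillers could be counted in a transcript. It is not the project's implementation: the `FILLERS` tuple and the `count_fillers` helper are hypothetical names, and only the output keys `filler_word_count` and `fillers_per_100_words` are taken from the sample output in this README.

```python
import re
from collections import Counter

# Hypothetical filler list; the tool's actual list may differ.
FILLERS = ("um", "uh", "like")


def count_fillers(transcript, fillers=FILLERS):
    """Count filler words and report their rate per 100 words."""
    # Lowercase and keep only alphabetic tokens (plus apostrophes).
    words = re.findall(r"[a-z']+", transcript.lower())
    counts = Counter(word for word in words if word in fillers)
    total = sum(counts.values())
    rate = round(100 * total / len(words), 1) if words else 0.0
    return {
        "filler_word_count": total,
        "fillers_per_100_words": rate,
        "by_word": dict(counts),
    }


report = count_fillers("Um, I think, like, we should, uh, start now.")
print(report)
```

A plain word match like this also counts "like" when it is used as a verb, so a real detector would likely need context from the transcript to avoid over-counting.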

Supported Formats

✅ WAV
✅ MP3
✅ M4A
✅ FLAC
✅ OGG

Cross-Platform

✅ Windows
✅ macOS
✅ Linux

📖 Documentation

Command Line Interface

`analyze` - Full Analysis

audio-metrics analyze audio.wav [OPTIONS]

Options:
  -o, --output PATH       Output JSON file path
  -c, --config PATH       Configuration file
  -m, --model TEXT        Whisper model (tiny/base/small/medium/large)
  --no-emotion            Skip emotion analysis
  --show-progress         Show progress bars
  -v, --verbose           Verbose output
  --help                  Show this message
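
If you want to drive `analyze` from a script, the options above can be assembled programmatically. The sketch below uses only the flags documented in this section; `build_analyze_command` and `run_analysis` are hypothetical helper names, and the sketch assumes the file passed to `--output` contains the JSON report once the command finishes.

```python
import json
import subprocess
from pathlib import Path


def build_analyze_command(audio, output, model=None, config=None,
                          no_emotion=False, verbose=False):
    """Assemble an `audio-metrics analyze` argument list from documented flags."""
    cmd = ["audio-metrics", "analyze", str(audio), "--output", str(output)]
    if model:
        cmd += ["--model", model]
    if config:
        cmd += ["--config", str(config)]
    if no_emotion:
        cmd.append("--no-emotion")
    if verbose:
        cmd.append("--verbose")
    return cmd


def run_analysis(audio, output="result.json", **options):
    """Run the CLI and return the parsed report (needs audio-metrics-cli installed)."""
    subprocess.run(build_analyze_command(audio, output, **options), check=True)
    return json.loads(Path(output).read_text(encoding="utf-8"))


# Only builds and prints the command; nothing is executed here.
print(build_analyze_command("talk.wav", "talk.json", model="small", no_emotion=True))
```

Passing the arguments as a list rather than a shell string avoids quoting problems with file names that contain spaces.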

`transcribe` - Speech to Text

audio-metrics transcribe audio.wav [OPTIONS]

Options:
  -o, --output PATH       Output transcript file
  -m, --model TEXT        Whisper model
  --language TEXT         Language code
  --help                  Show this message

`compare` - Compare Audio Files

audio-metrics compare audio1.wav audio2.wav [OPTIONS]

Options:
  --format TEXT          Output format (text/json/markdown)
  --help                 Show this message
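
This README does not describe how `compare` computes its results. As a rough illustration of the idea only, the sketch below diffs the numeric fields of two reports of the kind `analyze` writes. The helpers `flatten` and `diff_reports` are hypothetical, and the values in `v1` and `v2` are made up for the example.

```python
def flatten(report, prefix=""):
    """Flatten nested dicts into dotted keys, keeping only numeric leaves."""
    flat = {}
    for key, value in report.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, name + "."))
        elif isinstance(value, (int, float)) and not isinstance(value, bool):
            flat[name] = value
    return flat


def diff_reports(first, second):
    """Return second minus first for every numeric metric present in both."""
    a, b = flatten(first), flatten(second)
    return {key: round(b[key] - a[key], 3) for key in sorted(a.keys() & b.keys())}


# Illustrative values only, not real measurements.
v1 = {"speech_metrics": {"words_per_minute": 266},
      "filler_metrics": {"filler_word_count": 18}}
v2 = {"speech_metrics": {"words_per_minute": 240},
      "filler_metrics": {"filler_word_count": 9}}

deltas = diff_reports(v1, v2)
for metric, change in deltas.items():
    print(f"{metric}: {change:+}")
```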

📊 Output Example

{
  "audio_info": {
    "duration_seconds": 185.2,
    "sample_rate": 44100,
    "file_size_mb": 2.8
  },
  "vad_analysis": {
    "speech_ratio": 0.81,
    "pause_count": 23,
    "avg_pause_duration": 1.1
  },
  "speech_metrics": {
    "words_total": 820,
    "words_per_minute": 266
  },
  "prosody_metrics": {
    "pitch_mean_hz": 145.3,
    "energy_cv": 0.33
  },
  "filler_metrics": {
    "filler_word_count": 18,
    "fillers_per_100_words": 2.2
  }
}
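
The derived fields in this sample can be recomputed from the raw ones, which is a quick way to sanity-check a report. The sketch below assumes that `words_per_minute` is measured over the full file duration and that `speech_ratio` is the fraction of the duration containing speech. The sample values are consistent with the first assumption, but the README does not state either one.

```python
import json

# Raw fields copied from the sample output above.
report = json.loads("""
{
  "audio_info": {"duration_seconds": 185.2},
  "vad_analysis": {"speech_ratio": 0.81},
  "speech_metrics": {"words_total": 820},
  "filler_metrics": {"filler_word_count": 18}
}
""")

duration = report["audio_info"]["duration_seconds"]
words = report["speech_metrics"]["words_total"]
fillers = report["filler_metrics"]["filler_word_count"]

# Words per minute over the whole recording (assumption, see above).
words_per_minute = round(words / (duration / 60))
# Filler rate normalised to 100 words.
fillers_per_100_words = round(100 * fillers / words, 1)
# Approximate seconds of actual speech, if speech_ratio is a share of duration.
speech_seconds = round(duration * report["vad_analysis"]["speech_ratio"], 1)

print(words_per_minute, fillers_per_100_words, speech_seconds)
```

The first two results match the `words_per_minute` and `fillers_per_100_words` values shown in the sample.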

🔧 Configuration

Create a config.json file and pass it to analyze with -c/--config:

{
  "models": {
    "speech_to_text": {
      "provider": "whisper",
      "model": "base",
      "device": "auto"
    },
    "vad": {
      "provider": "silero",
      "threshold": 0.5
    }
  },
  "audio_analysis": {
    "enable_pitch": true,
    "enable_energy": true,
    "enable_pause": true
  },
  "features": {
    "enable_emotion": true,
    "skip_if_too_long": 3600
  }
}
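
When you only need to change a few settings, it can be convenient to generate the file from a script. The sketch below starts from the sample configuration above and applies overrides recursively. `BASE_CONFIG` and `merge` are hypothetical names, and the README does not say whether the sample values match the tool's built-in defaults.

```python
import json
from copy import deepcopy

# The sample configuration from this README, used as a starting point.
BASE_CONFIG = {
    "models": {
        "speech_to_text": {"provider": "whisper", "model": "base", "device": "auto"},
        "vad": {"provider": "silero", "threshold": 0.5},
    },
    "audio_analysis": {"enable_pitch": True, "enable_energy": True, "enable_pause": True},
    "features": {"enable_emotion": True, "skip_if_too_long": 3600},
}


def merge(base, overrides):
    """Return a copy of base with overrides applied recursively."""
    merged = deepcopy(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge(merged[key], value)
        else:
            merged[key] = value
    return merged


config = merge(BASE_CONFIG, {
    "models": {"speech_to_text": {"model": "small"}},
    "features": {"enable_emotion": False},
})

# Print the result; redirect it to config.json to use it with --config.
print(json.dumps(config, indent=2))
```

Merging recursively keeps sibling keys such as `device` and `provider` intact when only `model` is overridden.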

💻 Development

Setup Development Environment

# Clone repository
git clone https://github.com/i-whimsy/audio-metrics-cli.git
cd audio-metrics-cli

# Install with dev dependencies
pip install -e ".[dev]"

# Run tests
pytest tests/

# Format and lint code
black src/
ruff check src/

Project Structure

audio-metrics-cli/
├── src/
│   └── audio_metrics/
│       ├── cli.py              # CLI entry point
│       ├── config.py           # Configuration
│       └── modules/            # Core modules
│           ├── audio_loader.py
│           ├── vad_analyzer.py
│           ├── speech_to_text.py
│           ├── prosody_analyzer.py
│           ├── emotion_analyzer.py
│           ├── filler_detector.py
│           ├── metrics_builder.py
│           └── json_exporter.py
├── tests/
├── examples/
├── pyproject.toml
└── README.md

🤝 Contributing

1. Fork the repository
2. Create a feature branch (git checkout -b feature/amazing-feature)
3. Commit your changes (git commit -m 'Add amazing feature')
4. Push to the branch (git push origin feature/amazing-feature)
5. Open a Pull Request

📝 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

OpenAI Whisper - Speech-to-text
Silero VAD - Voice activity detection
Librosa - Audio analysis
SpeechBrain - Emotion recognition

📞 Support

Issues: GitHub Issues
Discussions: GitHub Discussions
Email: clawbot@openclaw.ai

Built with ❤️ by OpenClaw Team

Release history

0.4.0 (Mar 29, 2026)
0.3.1 (Mar 8, 2026)
0.2.0 (Mar 8, 2026)
0.1.0 (Mar 8, 2026), this version

Download files

Source Distribution

audio_metrics_cli-0.1.0.tar.gz (21.5 kB), uploaded Mar 8, 2026

Built Distribution

audio_metrics_cli-0.1.0-py3-none-any.whl (22.8 kB), uploaded Mar 8, 2026, Python 3

File details

Details for the file audio_metrics_cli-0.1.0.tar.gz.

File metadata

Download URL: audio_metrics_cli-0.1.0.tar.gz
Upload date: Mar 8, 2026
Size: 21.5 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.3

File hashes

Hashes for audio_metrics_cli-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`90e85def0e3cdf9a67280ec5d4b743128d39bdd2f8420c0e23a3de7b204ec974`
MD5	`b9a71a009392a5fa141313b0385bf44d`
BLAKE2b-256	`ff8b6eea2526a215caf006aa7be013773019c1d27dd02ffee46b6dc0c7ff7939`
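
To check a downloaded file against the digests listed here, you can hash it locally. The sketch below uses Python's standard `hashlib` module. `sha256_of` is a hypothetical helper name, and the example assumes the archive was saved in the current directory under its original file name.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# SHA256 listed above for the 0.1.0 source distribution.
EXPECTED = "90e85def0e3cdf9a67280ec5d4b743128d39bdd2f8420c0e23a3de7b204ec974"

archive = Path("audio_metrics_cli-0.1.0.tar.gz")
if archive.exists():
    print("match" if sha256_of(archive) == EXPECTED else "MISMATCH")
else:
    print(f"{archive} not found; download it first")
```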

File details

Details for the file audio_metrics_cli-0.1.0-py3-none-any.whl.

File metadata

Download URL: audio_metrics_cli-0.1.0-py3-none-any.whl
Upload date: Mar 8, 2026
Size: 22.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.3

File hashes

Hashes for audio_metrics_cli-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`35058f1b1f4b1115cdf039ccbffb5c20fb8ecb165593f336fed1960cb1ee501a`
MD5	`237440af48d18bbfce262df64eb456c6`
BLAKE2b-256	`3b4b1ca5268de7f9141562a251aa01967439e0b9df049de6718eda45b7f61900`
