Sonix: extract rich analytical signals directly from audio files
Sonix: Audio-Based Conversation Analysis
Sonix is a Python library designed to extract rich analytical signals directly from audio files, without relying on transcripts or text analysis. It focuses purely on acoustic and prosodic features to help researchers, developers, and data scientists understand conversational dynamics, emotional tone, and speaking patterns.
Features
Sonix provides end-to-end analysis of raw audio conversations, including:
| Category | Description | Example Metrics |
|---|---|---|
| Basic Audio Stats | Extracts simple sound metrics useful for quality and consistency checks. | RMS Energy, Duration, Silence Ratio |
| Voice Activity Detection (VAD) | Identifies speaking vs. silence segments. | Speech Segments, Turn Counts |
| Pitch & Prosody | Analyzes intonation and variation in tone. | Average Pitch, Pitch Variance |
| Speech Tempo | Measures speaking rate and rhythm. | Words per Minute (approx), Speech Rate |
| Energy Dynamics | Examines loudness variation to detect emphasis or excitement. | Mean Energy, Energy Variability |
| Emotion & Tone Estimation | Classifies emotional states using pretrained acoustic models. | Calm, Happy, Angry, Sad |
| Overlap & Turn-Taking | Detects interruptions and conversational overlap. | Speaker Overlap %, Turn Durations |
| Spectral Features | Extracts frequency-domain data for ML and acoustic analysis. | MFCCs, Spectral Centroid, Roll-off |
| Audio Quality | Evaluates clarity and background noise. | SNR (Signal-to-Noise Ratio), Distortion |
| Derived Conversation Insights | Combines features to infer interaction quality. | Engagement Index, Talk/Listen Ratio |
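To make a few of the metrics in the table concrete, here is a minimal sketch of frame-level RMS energy, silence ratio, and a speech-segment count in plain Python. It is not Sonix's implementation: the function names, the frame length, and the fixed energy threshold are assumptions chosen for illustration, and a simple energy threshold stands in for a real VAD model.

```python
import math


def frame_rms(samples, frame_len):
    """Return the RMS energy of each non-overlapping frame of `samples`."""
    frames = [samples[i:i + frame_len] for i in range(0, len(samples), frame_len)]
    return [math.sqrt(sum(x * x for x in f) / len(f)) for f in frames if f]


def silence_ratio(rms_values, threshold):
    """Fraction of frames whose RMS falls below `threshold` (assumed cutoff)."""
    if not rms_values:
        return 0.0
    silent = sum(1 for r in rms_values if r < threshold)
    return silent / len(rms_values)


def count_speech_segments(rms_values, threshold):
    """Count runs of consecutive frames at or above `threshold`."""
    segments = 0
    in_speech = False
    for r in rms_values:
        active = r >= threshold
        if active and not in_speech:
            segments += 1  # a new run of active frames starts here
        in_speech = active
    return segments


# Toy signal: silence, a burst, silence, a louder burst (amplitudes in [-1, 1]).
signal = [0.0] * 100 + [0.5, -0.5] * 50 + [0.0] * 100 + [0.8, -0.8] * 50
rms = frame_rms(signal, frame_len=100)
print(rms)
print(silence_ratio(rms, threshold=0.1))
print(count_speech_segments(rms, threshold=0.1))
```

On real recordings the threshold would normally be tuned to the noise floor rather than fixed in advance.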
Example Use Cases
- Conversation analytics for call centers or AI voice agents
- Measuring emotional tone or stress levels in speech
- Detecting dominance or interruptions in meetings
- Generating audio-based KPIs for human-AI interactions
- Building real-time feedback tools for voice communication training
Installation
pip install Sonix
Quick Start
from Sonix import AudioAnalyzer
# Load and analyze an audio file
analyzer = AudioAnalyzer("conversation.wav")
# Run full analysis
report = analyzer.analyze_all()
# Print summary
print(report.summary())
# Access individual feature groups
print(report.pitch.mean)
print(report.energy.variance)
print(report.emotion.probabilities)
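If no recording is at hand, a synthetic WAV file is enough to check that the Quick Start runs end to end. The sketch below uses only the Python standard library; `write_sine_wav` and its default parameters are hypothetical helpers, not part of Sonix, and a pure tone will not yield meaningful speech or emotion metrics.

```python
import math
import struct
import wave


def write_sine_wav(path, freq_hz=220.0, duration_sec=1.0, sample_rate=16000):
    """Write a mono 16-bit PCM sine tone to `path` and return the frame count."""
    n_frames = int(duration_sec * sample_rate)
    amplitude = 0.5 * 32767  # half of full scale, to stay clear of clipping
    frames = bytearray()
    for n in range(n_frames):
        sample = int(amplitude * math.sin(2 * math.pi * freq_hz * n / sample_rate))
        frames += struct.pack("<h", sample)  # little-endian signed 16-bit
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(frames))
    return n_frames
```

Calling `write_sine_wav("conversation.wav")` creates a one-second file with the name used in the Quick Start.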
Example Output
{
  "duration_sec": 180.4,
  "speech_segments": 56,
  "average_pitch_hz": 201.3,
  "pitch_variance": 32.8,
  "mean_energy": -20.5,
  "energy_variability": 0.17,
  "speech_rate_wpm": 142,
  "emotion": {
    "calm": 0.52,
    "happy": 0.28,
    "angry": 0.12,
    "sad": 0.08
  },
  "overlap_ratio": 0.07,
  "engagement_index": 0.81
}
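A report in this shape can be post-processed with the standard `json` module. The sketch below assumes the output is available as a JSON string; it picks the most probable emotion, checks that the emotion probabilities sum to roughly 1, and derives speech segments per minute. The variable names are illustrative only.

```python
import json

# The example output above, pasted as a string for illustration.
report_json = """
{
  "duration_sec": 180.4,
  "speech_segments": 56,
  "average_pitch_hz": 201.3,
  "pitch_variance": 32.8,
  "mean_energy": -20.5,
  "energy_variability": 0.17,
  "speech_rate_wpm": 142,
  "emotion": {"calm": 0.52, "happy": 0.28, "angry": 0.12, "sad": 0.08},
  "overlap_ratio": 0.07,
  "engagement_index": 0.81
}
"""

report = json.loads(report_json)

# Emotion label with the highest probability.
emotion = report["emotion"]
dominant = max(emotion, key=emotion.get)

# Sanity check: the probabilities in the example add up to about 1.
total = sum(emotion.values())

# Derived figure: detected speech segments per minute of audio.
segments_per_min = report["speech_segments"] / (report["duration_sec"] / 60)

print(dominant)
print(round(total, 6))
print(round(segments_per_min, 1))
```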
Architecture Overview
Sonix/
│
├── core/
│   ├── audio_loader.py        # Handles input normalization, channel merging
│   ├── feature_extractor.py   # Extracts MFCCs, pitch, energy, etc.
│   ├── vad_detector.py        # Voice activity segmentation
│   ├── prosody_analyzer.py    # Pitch, tone, tempo analysis
│   ├── emotion_estimator.py   # Acoustic emotion classification
│   ├── quality_metrics.py     # Noise and clarity estimation
│   └── report_builder.py      # Combines results into structured JSON
│
├── models/
│   ├── emotion_model.onnx
│   └── vad_model.onnx
│
└── utils/
    ├── visualization.py       # Waveform and spectrum plots
    └── audio_helpers.py
Technical Dependencies
| Library | Purpose |
|---|---|
| librosa | Audio feature extraction |
| pyAudioAnalysis | Energy and tempo metrics |
| webrtcvad | Voice activity detection |
| torch / onnxruntime | Emotion classification models |
| numpy / scipy | Signal processing |
| matplotlib | Visualization (optional) |
Example Analysis Pipeline
from Sonix.pipeline import AudioPipeline
pipeline = AudioPipeline([
    "vad",
    "pitch",
    "energy",
    "emotion",
    "turn_taking"
])
results = pipeline.run("meeting.wav")
results.plot_waveform()
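One way to picture a pipeline built from stage names is a registry that maps each name to a function and runs them in order. The sketch below is a hypothetical illustration of that pattern, not Sonix's `AudioPipeline`: `SketchPipeline`, `register`, and the two toy stages are invented here, and it operates on a list of samples rather than a file path.

```python
from typing import Callable, Dict, List

# Hypothetical registry mapping stage names to functions. Each stage receives
# the samples and the results gathered so far, and returns new result fields.
Stage = Callable[[List[float], dict], dict]
STAGES: Dict[str, Stage] = {}


def register(name: str):
    """Decorator that adds a stage function to the registry under `name`."""
    def decorator(func: Stage) -> Stage:
        STAGES[name] = func
        return func
    return decorator


@register("energy")
def energy_stage(samples, results):
    mean_square = sum(x * x for x in samples) / len(samples) if samples else 0.0
    return {"mean_square_energy": mean_square}


@register("peak")
def peak_stage(samples, results):
    return {"peak_amplitude": max((abs(x) for x in samples), default=0.0)}


class SketchPipeline:
    """Runs the named stages in order and merges their outputs."""

    def __init__(self, stage_names: List[str]):
        unknown = [n for n in stage_names if n not in STAGES]
        if unknown:
            raise ValueError(f"unknown stages: {unknown}")
        self.stage_names = list(stage_names)

    def run(self, samples: List[float]) -> dict:
        results: dict = {}
        for name in self.stage_names:
            results.update(STAGES[name](samples, results))
        return results


pipeline = SketchPipeline(["energy", "peak"])
print(pipeline.run([0.5, -0.5, 0.5, -0.5]))
```

Passing earlier results into each stage lets a later stage, such as turn-taking, reuse what an earlier one, such as VAD, has already computed.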
Metrics Summary
| Metric | Description |
|---|---|
| duration_sec | Total length of the audio file |
| speech_segments | Number of detected speaking parts |
| speech_rate_wpm | Approximate speech tempo |
| average_pitch_hz | Average fundamental frequency |
| energy_variability | Standard deviation of loudness |
| overlap_ratio | % of overlapping speech between speakers |
| emotion | Probabilities of acoustic emotion categories |
| engagement_index | Combined measure of voice energy, tempo, and tone consistency |
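As an illustration of how an overlap figure can be derived, the sketch below computes overlapped time from two speakers' segment lists. It assumes per-speaker segments are already known, sorted, and non-overlapping within a speaker, and it divides by the total time in which anyone speaks; the denominator Sonix actually uses is not documented here, so treat that choice and the function names as assumptions.

```python
def total_overlap(a_segments, b_segments):
    """Total time during which both speakers are active at once.

    Each argument is a list of (start_sec, end_sec) tuples, sorted by start
    time and non-overlapping within one speaker.
    """
    overlap = 0.0
    i = j = 0
    while i < len(a_segments) and j < len(b_segments):
        start = max(a_segments[i][0], b_segments[j][0])
        end = min(a_segments[i][1], b_segments[j][1])
        if end > start:
            overlap += end - start
        # Advance whichever segment finishes first.
        if a_segments[i][1] <= b_segments[j][1]:
            i += 1
        else:
            j += 1
    return overlap


def overlap_ratio(a_segments, b_segments):
    """Overlapped time divided by the total time in which anyone speaks."""
    talk_a = sum(end - start for start, end in a_segments)
    talk_b = sum(end - start for start, end in b_segments)
    both = total_overlap(a_segments, b_segments)
    union = talk_a + talk_b - both
    return both / union if union > 0 else 0.0


# Made-up segments, in seconds, for two speakers.
speaker_a = [(0.0, 4.0), (6.0, 9.0)]
speaker_b = [(3.0, 6.5), (10.0, 12.0)]
print(total_overlap(speaker_a, speaker_b))
print(round(overlap_ratio(speaker_a, speaker_b), 3))
```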
CLI Usage
Sonix analyze conversation.wav --plot
The output includes a JSON summary and a waveform visualization.
Example Integration
Integrate with your AI or analytics platform:
from Sonix import AudioAnalyzer
analyzer = AudioAnalyzer("agent_call.wav")
signals = analyzer.get_signals()
agent_metrics = {
    "engagement": signals["engagement_index"],
    "emotion": signals["emotion"]["happy"],
    "speech_rate": signals["speech_rate_wpm"]
}
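On an analytics platform, per-call metrics like these are usually aggregated across many calls. The sketch below assumes each call yields a dict shaped like `signals` in the snippet above; the helper names and the sample values are made up for illustration and do not come from Sonix.

```python
from statistics import mean


def to_agent_metrics(signals):
    """Pick the three fields used in the integration snippet above."""
    return {
        "engagement": signals["engagement_index"],
        "emotion": signals["emotion"]["happy"],
        "speech_rate": signals["speech_rate_wpm"],
    }


def average_metrics(calls):
    """Average each agent metric over a non-empty list of signal dicts."""
    rows = [to_agent_metrics(s) for s in calls]
    return {key: mean(row[key] for row in rows) for key in rows[0]}


# Made-up signal dicts standing in for two analyzed calls.
calls = [
    {"engagement_index": 0.81, "emotion": {"happy": 0.28}, "speech_rate_wpm": 142},
    {"engagement_index": 0.61, "emotion": {"happy": 0.12}, "speech_rate_wpm": 158},
]
print(average_metrics(calls))
```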
Future Roadmap
- Real-time streaming analysis
- Speaker diarization and identification
- Gender and age acoustic profiling
- Conversation quality score model
- REST API integration
Author
Developed by Dima Statz & Contributors
Contributions are welcome via pull requests and GitHub issues.
License
MIT License © 2025 Sonix Team
File details
Details for the file sonix-0.1.1.tar.gz.
File metadata
- Download URL: sonix-0.1.1.tar.gz
- Upload date:
- Size: 5.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.9.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ce64eec13cdf42734c10fbc9e2f4cffaedde43a28e308a16ba90d4aab8fe8ab3 |
| MD5 | a35c428120f2f5fefcd70db5e7579639 |
| BLAKE2b-256 | 0c0fb805f2ba6ef146e5e2353df338e43895b9b45476737e97cad9649bfd7fe0 |
File details
Details for the file sonix-0.1.1-py3-none-any.whl.
File metadata
- Download URL: sonix-0.1.1-py3-none-any.whl
- Upload date:
- Size: 5.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.9.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 3b3b00f40b3d36363db497ea04a648027ef61910328b06c849c6a2182a9c71c9 |
| MD5 | 514004ef084a1a9e256e7f3bc8a8f77a |
| BLAKE2b-256 | 82ced2ef98c5c88914db2c8117cbfd7619a577f85f9e8f9bf13d6f9309e5d17c |