Sonix: extract rich analytical signals directly from audio files

🎧 Sonix — Audio-Based Conversation Analysis

Sonix is a Python library that extracts analytical signals directly from audio files, without relying on transcripts or text analysis. It works purely from acoustic and prosodic features to help researchers, developers, and data scientists understand conversational dynamics, emotional tone, and speaking patterns.

🚀 Features

Sonix provides end-to-end analysis of raw audio conversations, including:

Category	Description	Example Metrics
🎚️ Basic Audio Stats	Extracts simple sound metrics useful for quality and consistency checks.	RMS Energy, Duration, Silence Ratio
🗣️ Voice Activity Detection (VAD)	Identifies speaking vs. silence segments.	Speech Segments, Turn Counts
🎵 Pitch & Prosody	Analyzes intonation and variation in tone.	Average Pitch, Pitch Variance
💬 Speech Tempo	Measures speaking rate and rhythm.	Words per Minute (approx), Speech Rate
🔊 Energy Dynamics	Examines loudness variation to detect emphasis or excitement.	Mean Energy, Energy Variability
😠 Emotion & Tone Estimation	Classifies emotional states using pretrained acoustic models.	Calm, Happy, Angry, Sad
⏱️ Overlap & Turn-Taking	Detects interruptions and conversational overlap.	Speaker Overlap %, Turn Durations
🌈 Spectral Features	Extracts frequency-domain data for ML and acoustic analysis.	MFCCs, Spectral Centroid, Roll-off
🎯 Audio Quality	Evaluates clarity and background noise.	SNR (Signal-to-Noise Ratio), Distortion
🧠 Derived Conversation Insights	Combines features to infer interaction quality.	Engagement Index, Talk/Listen Ratio
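
To make the "Basic Audio Stats" row concrete, here is a minimal sketch of how RMS energy and a silence ratio could be computed from raw samples. It is not Sonix's implementation: the function names, the 20 ms frame size, and the -40 dBFS silence threshold are assumptions chosen for illustration, and it uses only the standard library so it runs without any audio dependencies.

```python
import math


def frame_rms(samples, frame_len):
    """Return the RMS amplitude of each non-overlapping frame."""
    rms_values = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        rms_values.append(math.sqrt(sum(x * x for x in frame) / frame_len))
    return rms_values


def to_dbfs(rms, floor_db=-120.0):
    """Convert a linear RMS value (full scale = 1.0) to dBFS."""
    return 20.0 * math.log10(rms) if rms > 0 else floor_db


def basic_stats(samples, sample_rate, frame_ms=20, silence_db=-40.0):
    """Duration, overall RMS level, and share of frames below the threshold."""
    frame_len = int(sample_rate * frame_ms / 1000)
    levels = [to_dbfs(r) for r in frame_rms(samples, frame_len)]
    silent = sum(1 for level in levels if level < silence_db)
    overall_rms = math.sqrt(sum(x * x for x in samples) / len(samples))
    return {
        "duration_sec": len(samples) / sample_rate,
        "rms_dbfs": to_dbfs(overall_rms),
        "silence_ratio": silent / len(levels),
    }


# Synthetic input: 1 s of a 220 Hz tone at half amplitude, then 1 s of silence.
SAMPLE_RATE = 8000
tone = [0.5 * math.sin(2 * math.pi * 220 * n / SAMPLE_RATE)
        for n in range(SAMPLE_RATE)]
signal = tone + [0.0] * SAMPLE_RATE

stats = basic_stats(signal, SAMPLE_RATE)
print(stats)
```

Half of the synthetic signal is silent, so the silence ratio comes out at 0.5. A fixed threshold like this is fragile on noisy recordings, which is one reason a dedicated VAD is listed as a separate feature.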

🧩 Example Use Cases

Conversation analytics for call centers or AI voice agents
Measuring emotional tone or stress levels in speech
Detecting dominance or interruptions in meetings
Generating audio-based KPIs for human–AI interactions
Building real-time feedback tools for voice communication training

🛠️ Installation

pip install Sonix

🧪 Quick Start

from Sonix import AudioAnalyzer

# Load and analyze an audio file
analyzer = AudioAnalyzer("conversation.wav")

# Run full analysis
report = analyzer.analyze_all()

# Print summary
print(report.summary())

# Access individual feature groups
print(report.pitch.mean)
print(report.energy.variance)
print(report.emotion.probabilities)
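
For readers wondering what a value such as `report.pitch.mean` represents, the sketch below estimates the fundamental frequency of a single frame by picking the autocorrelation peak. This is a textbook method shown under assumptions: the function name, the 80 to 400 Hz search range, and the synthetic test tone are illustrative, and nothing here is taken from Sonix's own code.

```python
import math


def estimate_f0(frame, sample_rate, f_min=80.0, f_max=400.0):
    """Estimate the fundamental frequency of one frame (Hz).

    Searches lags corresponding to f_max..f_min and returns the
    frequency of the lag with the largest autocorrelation, or None
    if no lag has a positive score.
    """
    min_lag = int(sample_rate / f_max)
    max_lag = int(sample_rate / f_min)
    best_lag, best_score = None, 0.0
    for lag in range(min_lag, max_lag + 1):
        score = sum(frame[n] * frame[n + lag]
                    for n in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag if best_lag else None


# Synthetic input: 50 ms of a pure 200 Hz tone sampled at 8 kHz.
SAMPLE_RATE = 8000
frame = [math.sin(2 * math.pi * 200 * n / SAMPLE_RATE) for n in range(400)]

f0 = estimate_f0(frame, SAMPLE_RATE)
print(f0)
```

Real speech needs more than this: a voiced/unvoiced decision, handling of octave errors, and averaging across frames before a mean pitch is meaningful.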

📊 Example Output

{
  "duration_sec": 180.4,
  "speech_segments": 56,
  "average_pitch_hz": 201.3,
  "pitch_variance": 32.8,
  "mean_energy": -20.5,
  "energy_variability": 0.17,
  "speech_rate_wpm": 142,
  "emotion": {
    "calm": 0.52,
    "happy": 0.28,
    "angry": 0.12,
    "sad": 0.08
  },
  "overlap_ratio": 0.07,
  "engagement_index": 0.81
}
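
A report shaped like the one above is easy to post-process with the standard library. The sketch below assumes the JSON has been saved or received as text with exactly these keys; how Sonix itself serializes a report is not specified here, so treat the variable names and derived values as illustrative.

```python
import json

# The example output from above, stored as a JSON string.
report_json = """
{
  "duration_sec": 180.4,
  "speech_segments": 56,
  "average_pitch_hz": 201.3,
  "pitch_variance": 32.8,
  "mean_energy": -20.5,
  "energy_variability": 0.17,
  "speech_rate_wpm": 142,
  "emotion": {"calm": 0.52, "happy": 0.28, "angry": 0.12, "sad": 0.08},
  "overlap_ratio": 0.07,
  "engagement_index": 0.81
}
"""

report = json.loads(report_json)

# Sanity check: the emotion scores should behave like probabilities.
emotions = report["emotion"]
emotion_total = sum(emotions.values())
if abs(emotion_total - 1.0) > 1e-6:
    raise ValueError(f"emotion probabilities sum to {emotion_total}")

# Pick the most likely emotion and derive a simple per-minute rate.
dominant_emotion = max(emotions, key=emotions.get)
segments_per_minute = report["speech_segments"] / (report["duration_sec"] / 60)

print(dominant_emotion, round(segments_per_minute, 1))
```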

⚙️ Architecture Overview

Sonix/
│
├── core/
│   ├── audio_loader.py         # Handles input normalization, channel merging
│   ├── feature_extractor.py    # Extracts MFCCs, pitch, energy, etc.
│   ├── vad_detector.py         # Voice activity segmentation
│   ├── prosody_analyzer.py     # Pitch, tone, tempo analysis
│   ├── emotion_estimator.py    # Acoustic emotion classification
│   ├── quality_metrics.py      # Noise and clarity estimation
│   └── report_builder.py       # Combines results into structured JSON
│
├── models/
│   ├── emotion_model.onnx
│   └── vad_model.onnx
│
└── utils/
    ├── visualization.py        # Waveform and spectrum plots
    └── audio_helpers.py

🧬 Technical Dependencies

Library	Purpose
librosa	Audio feature extraction
pyAudioAnalysis	Energy and tempo metrics
webrtcvad	Voice activity detection
torch / onnxruntime	Emotion classification models
numpy / scipy	Signal processing
matplotlib	Visualization (optional)

🧠 Example Analysis Pipeline

from Sonix.pipeline import AudioPipeline

pipeline = AudioPipeline([
  "vad",
  "pitch",
  "energy",
  "emotion",
  "turn_taking"
])

results = pipeline.run("meeting.wav")
results.plot_waveform()

📈 Metrics Summary

Metric	Description
`duration_sec`	Total length of the audio file
`speech_segments`	Number of detected speaking parts
`speech_rate_wpm`	Approximate speech tempo
`average_pitch_hz`	Average fundamental frequency
`energy_variability`	Standard deviation of loudness
`overlap_ratio`	Share of speech in which speakers overlap, as a fraction (0.07 = 7%)
`emotion`	Probabilities of acoustic emotion categories
`engagement_index`	Combined measure of voice energy, tempo, and tone consistency
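
As an illustration of `overlap_ratio`, the sketch below computes overlap between two speakers from lists of `(start, end)` segments in seconds. It assumes per-speaker segments are already available, sorted, and non-overlapping within each speaker, and it defines the ratio as overlapped time divided by the time in which anyone is speaking. That definition and the function names are assumptions; Sonix may normalize differently.

```python
def total_length(segments):
    """Sum of segment durations in seconds."""
    return sum(end - start for start, end in segments)


def overlap_time(a, b):
    """Total time during which a segment in `a` and one in `b` coincide.

    Both lists must be sorted by start time and free of internal overlaps.
    """
    i = j = 0
    overlap = 0.0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if end > start:
            overlap += end - start
        # Advance whichever segment finishes first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return overlap


def turn_metrics(a, b):
    """Overlap ratio and talk ratio for two speakers."""
    time_a, time_b = total_length(a), total_length(b)
    both = overlap_time(a, b)
    any_speech = time_a + time_b - both  # union of the two speakers
    return {
        "overlap_ratio": both / any_speech,
        "talk_ratio_a_to_b": time_a / time_b,
    }


# Hypothetical segments (seconds) for two speakers in a 10 s exchange.
speaker_a = [(0.0, 4.0), (6.0, 9.0)]
speaker_b = [(3.0, 6.5), (9.0, 10.0)]

metrics = turn_metrics(speaker_a, speaker_b)
print(metrics)
```

Here the speakers overlap for 1.5 s out of 10 s of speech, giving a ratio of 0.15.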

🧰 CLI Usage

Sonix analyze conversation.wav --plot

The output includes a JSON summary and a waveform visualization.

🧑‍💻 Example Integration

Integrate with your AI or analytics platform:

from Sonix import AudioAnalyzer

analyzer = AudioAnalyzer("agent_call.wav")
signals = analyzer.get_signals()
agent_metrics = {
  "engagement": signals["engagement_index"],
  "emotion": signals["emotion"]["happy"],
  "speech_rate": signals["speech_rate_wpm"]
}

🔍 Future Roadmap

Real-time streaming analysis
Speaker diarization and identification
Gender and age acoustic profiling
Conversation quality score model
REST API integration

🧑‍🔬 Author

Developed by Dima Statz & Contributors
📫 Contributions welcome via pull requests and GitHub issues.

📄 License

Current release: 0.1.1 (Dec 9, 2025)
