
Sonix: extract rich analytical signals directly from audio files


Sonix

Audio-Based Conversation Analysis



🎧 Sonix: Audio-Based Conversation Analysis

Sonix is a Python library designed to extract rich analytical signals directly from audio files, without relying on transcripts or text analysis. It focuses purely on acoustic and prosodic features to help researchers, developers, and data scientists understand conversational dynamics, emotional tone, and speaking patterns.


🚀 Features

Sonix provides end-to-end analysis of raw audio conversations, including:

| Category | Description | Example Metrics |
|---|---|---|
| 🎚️ Basic Audio Stats | Extracts simple sound metrics useful for quality and consistency checks. | RMS Energy, Duration, Silence Ratio |
| 🗣️ Voice Activity Detection (VAD) | Identifies speaking vs. silence segments. | Speech Segments, Turn Counts |
| 🎵 Pitch & Prosody | Analyzes intonation and variation in tone. | Average Pitch, Pitch Variance |
| 💬 Speech Tempo | Measures speaking rate and rhythm. | Words per Minute (approx), Speech Rate |
| 🔊 Energy Dynamics | Examines loudness variation to detect emphasis or excitement. | Mean Energy, Energy Variability |
| 😠 Emotion & Tone Estimation | Classifies emotional states using pretrained acoustic models. | Calm, Happy, Angry, Sad |
| ⏱️ Overlap & Turn-Taking | Detects interruptions and conversational overlap. | Speaker Overlap %, Turn Durations |
| 🌈 Spectral Features | Extracts frequency-domain data for ML and acoustic analysis. | MFCCs, Spectral Centroid, Roll-off |
| 🎯 Audio Quality | Evaluates clarity and background noise. | SNR (Signal-to-Noise Ratio), Distortion |
| 🧠 Derived Conversation Insights | Combines features to infer interaction quality. | Engagement Index, Talk/Listen Ratio |
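
To make the "Basic Audio Stats" row concrete, here is a minimal standard-library sketch of how duration, RMS level, and silence ratio can be computed from raw samples. It is not Sonix's implementation: the function name `basic_audio_stats`, the 400-sample frame length, and the -40 dB silence threshold are all assumptions chosen for illustration.

```python
import math

def basic_audio_stats(samples, sample_rate, frame_len=400, silence_db=-40.0):
    """Return duration, overall RMS level (dBFS) and silence ratio.

    samples: floats in [-1.0, 1.0], at least one full frame long.
    frame_len and silence_db are illustrative defaults, not Sonix values.
    """
    def rms_db(frame):
        rms = math.sqrt(sum(x * x for x in frame) / len(frame))
        return 20.0 * math.log10(max(rms, 1e-10))  # floor avoids log(0)

    # Split into non-overlapping frames and count the quiet ones.
    frames = [samples[i:i + frame_len]
              for i in range(0, len(samples) - frame_len + 1, frame_len)]
    silent = sum(1 for f in frames if rms_db(f) < silence_db)
    return {
        "duration_sec": len(samples) / sample_rate,
        "rms_db": rms_db(samples),
        "silence_ratio": silent / len(frames),
    }

# Synthetic input: 1 s of a 220 Hz tone followed by 1 s of digital silence.
sr = 8000
tone = [0.5 * math.sin(2 * math.pi * 220 * n / sr) for n in range(sr)]
stats = basic_audio_stats(tone + [0.0] * sr, sr)
print(stats)
```

On this synthetic signal half of the frames fall below the threshold, so the silence ratio comes out at 0.5. Real recordings usually need a threshold tuned to the noise floor.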

🧩 Example Use Cases

  • Conversation analytics for call centers or AI voice agents
  • Measuring emotional tone or stress levels in speech
  • Detecting dominance or interruptions in meetings
  • Generating audio-based KPIs for human–AI interactions
  • Building real-time feedback tools for voice communication training

🛠️ Installation

pip install Sonix

🧪 Quick Start

from Sonix import AudioAnalyzer

# Load and analyze an audio file
analyzer = AudioAnalyzer("conversation.wav")

# Run full analysis
report = analyzer.analyze_all()

# Print summary
print(report.summary())

# Access individual feature groups
print(report.pitch.mean)
print(report.energy.variance)
print(report.emotion.probabilities)

📊 Example Output

{
  "duration_sec": 180.4,
  "speech_segments": 56,
  "average_pitch_hz": 201.3,
  "pitch_variance": 32.8,
  "mean_energy": -20.5,
  "energy_variability": 0.17,
  "speech_rate_wpm": 142,
  "emotion": {
    "calm": 0.52,
    "happy": 0.28,
    "angry": 0.12,
    "sad": 0.08
  },
  "overlap_ratio": 0.07,
  "engagement_index": 0.81
}
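
A report shaped like the example above can be post-processed with the standard library alone. The sketch below assumes the JSON has already been obtained as a string (how Sonix exposes it is not specified here) and derives a few convenience values. The variable names are illustrative.

```python
import json

# The example output from above, as a JSON string.
report_json = """
{
  "duration_sec": 180.4,
  "speech_segments": 56,
  "average_pitch_hz": 201.3,
  "pitch_variance": 32.8,
  "mean_energy": -20.5,
  "energy_variability": 0.17,
  "speech_rate_wpm": 142,
  "emotion": {"calm": 0.52, "happy": 0.28, "angry": 0.12, "sad": 0.08},
  "overlap_ratio": 0.07,
  "engagement_index": 0.81
}
"""

report = json.loads(report_json)
emotions = report["emotion"]

# Emotion label with the highest probability.
dominant = max(emotions, key=emotions.get)

# Sanity check: the emotion probabilities should sum to roughly 1.
total_probability = sum(emotions.values())

# Derived value: detected speech segments per minute of audio.
segments_per_min = report["speech_segments"] / report["duration_sec"] * 60

print(dominant, round(total_probability, 2), round(segments_per_min, 1))
```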

⚙️ Architecture Overview

Sonix/
│
├── core/
│   ├── audio_loader.py         # Handles input normalization, channel merging
│   ├── feature_extractor.py    # Extracts MFCCs, pitch, energy, etc.
│   ├── vad_detector.py         # Voice activity segmentation
│   ├── prosody_analyzer.py     # Pitch, tone, tempo analysis
│   ├── emotion_estimator.py    # Acoustic emotion classification
│   ├── quality_metrics.py      # Noise and clarity estimation
│   └── report_builder.py       # Combines results into structured JSON
│
├── models/
│   ├── emotion_model.onnx
│   └── vad_model.onnx
│
└── utils/
    ├── visualization.py        # Waveform and spectrum plots
    └── audio_helpers.py

🧬 Technical Dependencies

| Library | Purpose |
|---|---|
| librosa | Audio feature extraction |
| pyAudioAnalysis | Energy and tempo metrics |
| webrtcvad | Voice activity detection |
| torch / onnxruntime | Emotion classification models |
| numpy / scipy | Signal processing |
| matplotlib | Visualization (optional) |

🧠 Example Analysis Pipeline

from Sonix.pipeline import AudioPipeline

pipeline = AudioPipeline([
  "vad",
  "pitch",
  "energy",
  "emotion",
  "turn_taking"
])

results = pipeline.run("meeting.wav")
results.plot_waveform()
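
How `AudioPipeline` resolves stage names internally is not documented here. One common way to build a pipeline selected by name is a registry that maps each name to a function, sketched below. Everything in this block (`SimplePipeline`, `STAGES`, and the two placeholder stages) is hypothetical and is not part of Sonix.

```python
import math

# Hypothetical stage functions: each takes samples and a sample rate
# and returns a dict of metrics. These are placeholders, not Sonix code.
def energy_stage(samples, sample_rate):
    mean_square = sum(x * x for x in samples) / len(samples)
    return {"rms": math.sqrt(mean_square)}

def duration_stage(samples, sample_rate):
    return {"duration_sec": len(samples) / sample_rate}

# Registry mapping stage names to their implementations.
STAGES = {"energy": energy_stage, "duration": duration_stage}

class SimplePipeline:
    def __init__(self, stage_names):
        # Fail early if a requested stage is not registered.
        unknown = [name for name in stage_names if name not in STAGES]
        if unknown:
            raise ValueError(f"unknown stages: {unknown}")
        self.stage_names = list(stage_names)

    def run(self, samples, sample_rate):
        # Run the stages in the requested order, keyed by stage name.
        return {name: STAGES[name](samples, sample_rate)
                for name in self.stage_names}

results = SimplePipeline(["duration", "energy"]).run(
    [0.5, -0.5, 0.5, -0.5], sample_rate=4
)
print(results)
```

Validating the stage names in the constructor means a typo surfaces before any audio is processed.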

📈 Metrics Summary

| Metric | Description |
|---|---|
| duration_sec | Total length of the audio file, in seconds |
| speech_segments | Number of detected speaking parts |
| speech_rate_wpm | Approximate speech tempo, in words per minute |
| average_pitch_hz | Average fundamental frequency, in Hz |
| energy_variability | Standard deviation of loudness |
| overlap_ratio | Share of overlapping speech between speakers, as a fraction (0.07 = 7%) |
| emotion | Probabilities of acoustic emotion categories |
| engagement_index | Combined measure of voice energy, tempo, and tone consistency |
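
For intuition about the turn-taking metrics, the sketch below computes an overlap ratio and a talk/listen ratio from per-speaker speech segments. The definitions used here (overlap time divided by total audio duration, and one speaker's talk time divided by the other's) are assumptions, as are the function names. Sonix's exact formulas are not given in this document.

```python
def total_duration(segments):
    """Sum of segment lengths; segments are (start_sec, end_sec) tuples."""
    return sum(end - start for start, end in segments)

def overlap_duration(a, b):
    """Total time during which a segment from a and one from b coincide.

    Assumes segments within each speaker's list do not overlap each other.
    """
    total = 0.0
    for a_start, a_end in a:
        for b_start, b_end in b:
            total += max(0.0, min(a_end, b_end) - max(a_start, b_start))
    return total

# Hypothetical speech segments for two speakers in a 10-second recording.
speaker_a = [(0.0, 4.0), (6.0, 9.0)]
speaker_b = [(3.5, 6.5), (9.0, 10.0)]
audio_duration = 10.0

overlap_ratio = overlap_duration(speaker_a, speaker_b) / audio_duration
talk_listen_a = total_duration(speaker_a) / total_duration(speaker_b)
print(overlap_ratio, talk_listen_a)
```

Here the speakers overlap for one second in total, so the overlap ratio is 0.1, and speaker A talks 1.75 times as long as speaker B.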

🧰 CLI Usage

Sonix analyze conversation.wav --plot

The output includes a JSON summary and a waveform visualization.


🧑‍💻 Example Integration

Integrate with your AI or analytics platform:

from Sonix import AudioAnalyzer

analyzer = AudioAnalyzer("agent_call.wav")
signals = analyzer.get_signals()
agent_metrics = {
  "engagement": signals["engagement_index"],
  "emotion": signals["emotion"]["happy"],
  "speech_rate": signals["speech_rate_wpm"]
}

🔍 Future Roadmap

  • Real-time streaming analysis
  • Speaker diarization and identification
  • Gender and age acoustic profiling
  • Conversation quality score model
  • REST API integration

🧑‍🔬 Author

Developed by Dima Statz & Contributors
📫 Contributions welcome via pull requests and GitHub issues.


📄 License

MIT License © 2025 Sonix Team
