Human vs AI audio detection via Shannon entropy features

AudioSentinel

AudioSentinel detects whether an audio file is human-recorded or AI-generated, using Shannon entropy features and a Random Forest classifier.

✅ 100% accuracy on 294-sample blind test
✅ 30/30 cross-verified on held-out samples
✅ Lightweight — no GPU required, runs on CPU in <1s per file
✅ 52 handcrafted features: temporal, spectral & phase entropy + MFCC + spectral descriptors

Install

pip install audiosentinel

Or from source:

git clone https://github.com/yourname/audiosentinel
cd audiosentinel
pip install -e .

Quick Start

from audiosentinel import predict_audio, predict_int, predict_batch

# Full result with confidence
predict_audio('recording.wav')
# File       : recording.wav
# Result     : HUMAN
# Confidence : 74.8%
# P(AI)=0.252  P(Human)=0.748

# Integer only — 0=AI, 1=Human
label = predict_int('recording.wav')
print(label)  # 1

# Batch (failed files come back as None, so skip them)
import glob
results = predict_batch(glob.glob('audio/*.wav'))
for r in results:
    if r is None:
        continue
    print(r['label'], r['prob_human'])

CLI

audiosentinel recording.wav
audiosentinel path/to/audio/*.wav

API Reference

`predict_audio(path, verbose=True) → dict`

Key	Type	Description
`label`	str	`"HUMAN"` or `"AI"`
`pred`	int	`1` = Human, `0` = AI
`prob_ai`	float	Probability of AI origin
`prob_human`	float	Probability of Human origin
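
Because the returned dict carries both class probabilities, a caller can apply a stricter decision rule than the built-in label. The sketch below is only an illustration: the `decide` helper, the `"UNCERTAIN"` outcome, and the 0.9 threshold are assumptions for the example and are not part of AudioSentinel. The example dict reuses the values from the Quick Start output.

```python
def decide(result, min_confidence=0.9):
    """Return result['label'], or 'UNCERTAIN' when the winning
    probability falls below min_confidence (hypothetical helper)."""
    confidence = max(result["prob_ai"], result["prob_human"])
    return result["label"] if confidence >= min_confidence else "UNCERTAIN"


# Dict shaped like the documented predict_audio() return value.
example = {"label": "HUMAN", "pred": 1, "prob_ai": 0.252, "prob_human": 0.748}

print(decide(example))       # UNCERTAIN, because 0.748 < 0.9
print(decide(example, 0.7))  # HUMAN
```

In real use, `example` would be replaced by the dict returned from `predict_audio(path, verbose=False)`.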

`predict_int(path) → int`

Returns 0 (AI) or 1 (Human) only.

`predict_batch(paths, verbose=False) → list[dict]`

Runs inference on a list of WAV paths and returns one result per path. A file that fails yields None in place of its result dict.
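
Since failed files appear as None, it helps to pair results with their paths before reading any keys. The helper below is a hypothetical sketch, not part of the package, and it assumes the results come back in the same order as the input paths, which the description above does not state explicitly.

```python
from collections import Counter


def summarize_batch(paths, results):
    """Count labels and collect the paths whose result is None.

    Assumes results[i] corresponds to paths[i] (an assumption of this sketch).
    """
    counts = Counter()
    failed = []
    for path, result in zip(paths, results):
        if result is None:
            failed.append(path)
        else:
            counts[result["label"]] += 1
    return dict(counts), failed


# Stand-in data shaped like predict_batch() output, for illustration only.
paths = ["a.wav", "b.wav", "c.wav"]
results = [
    {"label": "HUMAN", "pred": 1, "prob_ai": 0.1, "prob_human": 0.9},
    None,
    {"label": "AI", "pred": 0, "prob_ai": 0.8, "prob_human": 0.2},
]
counts, failed = summarize_batch(paths, results)
print(counts, failed)
```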

`extract_entropy_features(path) → dict`

Returns raw 52-feature dict for a WAV file.

How It Works

Audio is loaded at 24 kHz and silence is trimmed
Temporal entropy: Shannon entropy over time-domain frames
Spectral entropy: Shannon entropy over STFT magnitude frames
Phase entropy: Shannon entropy over STFT phase frames
MFCC: 13 coefficients × (mean + std) = 26 features
Spectral descriptors: zero-crossing rate (ZCR), RMS, spectral centroid, and spectral rolloff, each × (mean + std) = 8 features
Random Forest (200 trees) classifies the 52-feature vector
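
The three entropy steps can be sketched with the standard library alone. This is a minimal illustration and not AudioSentinel's actual implementation: the function names, the 16-bin histograms, the frame and hop sizes, and the naive DFT standing in for a single STFT frame are all assumptions made for the example.

```python
import cmath
import math
import statistics


def shannon_entropy(probs):
    """Shannon entropy in bits; zero-probability terms are skipped."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def histogram_probs(values, lo, hi, bins):
    """Bin values over [lo, hi] into equal-width bins; return bin probabilities."""
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        idx = min(max(int((v - lo) / width), 0), bins - 1)  # clamp to valid bins
        counts[idx] += 1
    return [c / len(values) for c in counts]


def frames(signal, size, hop):
    """Split a sequence into overlapping frames, dropping the incomplete tail."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, hop)]


def dft(frame):
    """Naive O(n^2) DFT over bins 0..n/2 (stand-in for one STFT frame)."""
    n = len(frame)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(frame))
        for k in range(n // 2 + 1)
    ]


def temporal_entropy(frame, bins=16):
    """Entropy of the amplitude histogram, assuming samples lie in [-1, 1]."""
    return shannon_entropy(histogram_probs(frame, -1.0, 1.0, bins))


def spectral_entropy(frame):
    """Entropy of the magnitude spectrum normalized to sum to 1."""
    mags = [abs(c) for c in dft(frame)]
    total = sum(mags)
    return shannon_entropy([m / total for m in mags]) if total > 0 else 0.0


def phase_entropy(frame, bins=16):
    """Entropy of the histogram of spectral phase angles in [-pi, pi]."""
    phases = [cmath.phase(c) for c in dft(frame)]
    return shannon_entropy(histogram_probs(phases, -math.pi, math.pi, bins))


if __name__ == "__main__":
    # Synthetic 440 Hz tone at 24 kHz, purely for demonstration.
    sr = 24000
    tone = [0.5 * math.sin(2 * math.pi * 440 * t / sr) for t in range(2048)]
    for name, fn in [("temporal", temporal_entropy),
                     ("spectral", spectral_entropy),
                     ("phase", phase_entropy)]:
        per_frame = [fn(f) for f in frames(tone, size=128, hop=64)]
        print(name, round(statistics.mean(per_frame), 3),
              round(statistics.pstdev(per_frame), 3))
```

From the counts stated above, MFCC (26) and the spectral descriptors (8) account for 34 of the 52 features, which leaves 18 for the entropy groups. How those 18 are derived from the per-frame values is not described here, so the mean and standard deviation printed in the demo are only one plausible summary.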

Training Data

Source	Class	Samples
LibriSpeech	Human	1,500
Kokoro TTS	AI	1,500

Sample rate: 24 kHz. All samples are resampled internally.

Performance

Model	CV Accuracy
LogReg (3-feat)	83.2%
LogReg (all)	94.7%
Gradient Boost	95.5%
Random Forest	96.7%
Tuned RF (final)	100.0%

Blind test (294 samples, unseen): 100% — 0 misclassifications

Limitations

Trained on Kokoro TTS — confidence may vary on other TTS engines
Best performance on speech audio; music/noise not tested
Requires scikit-learn==1.6.1 to match model pickle version
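
The scikit-learn pin can be checked before loading the model. The snippet below is a hedged sketch using the standard library's `importlib.metadata`; the `check_pin` helper is hypothetical and is not shipped with AudioSentinel.

```python
from importlib.metadata import PackageNotFoundError, version


def check_pin(package, required):
    """Return None if `package` is installed at exactly `required`,
    otherwise a short message describing the problem (hypothetical helper)."""
    try:
        installed = version(package)
    except PackageNotFoundError:
        return f"{package} is not installed (need {required})"
    if installed != required:
        return f"{package} {installed} found, but {required} is required"
    return None


problem = check_pin("scikit-learn", "1.6.1")
print(problem or "scikit-learn pin satisfied")
```

If a mismatch is reported, installing the pinned release with `pip install scikit-learn==1.6.1` matches the requirement stated above.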

Support This Work

If AudioSentinel is useful to you, consider buying a coffee or supporting development:

☕ Buy Me a Coffee: https://buymeacoffee.com
🤝 GitHub Sponsors: https://github.com/sponsors

Crypto donations welcome:

Chain	Address
BTC	`bc1qxz2qgfkh0fgs7ff3m0ft6wtluzk5rqhv472vws`
ETH	`0x70282a83f0d6ef2f207d252cf3f7874c7663f625`
SOL	`91s2TYpn5P2W5xXyEk3q8nFPusY937YEiCNdFCKiYirz`
LTC	`ltc1qfcucqw08kus6vncc8egft7feswgflp0wee7rxj`

License

MIT — see LICENSE

Version 0.1.0 (May 8, 2026)

Download files

Source Distribution

audiosentinel-0.1.0.tar.gz (200.8 kB view details)

Uploaded May 8, 2026 Source

Built Distribution

audiosentinel-0.1.0-py3-none-any.whl (206.8 kB view details)

Uploaded May 8, 2026 Python 3

File details

Details for the file audiosentinel-0.1.0.tar.gz.

File metadata

Download URL: audiosentinel-0.1.0.tar.gz
Upload date: May 8, 2026
Size: 200.8 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.13

File hashes

Hashes for audiosentinel-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`89d86c73c8c117bd2ef6a048f82763c41855ec3f7dfcb1e1b49bfb43461ac9fe`
MD5	`1d1c4aef8c6e1207a425c82dc8d159dc`
BLAKE2b-256	`2328e3172187579eff7c121514070e90445e6a1e43aa54829cfdac744957bb1c`

File details

Details for the file audiosentinel-0.1.0-py3-none-any.whl.

File metadata

Download URL: audiosentinel-0.1.0-py3-none-any.whl
Upload date: May 8, 2026
Size: 206.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.13

File hashes

Hashes for audiosentinel-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`3f972a106e4df0940e639afb92d0f77c218547d0a946a1433c3a1f4ae8f0db24`
MD5	`5867830911e375b74ebf734935bd81ff`
BLAKE2b-256	`a562c443f48ac17ba9f69a16f7c0e529b979335449c0e300e4a9789145f865c7`
