
Advanced FLAC authenticity analyzer - Detects MP3-to-FLAC transcodes with high precision

🎵 FLAC Detective

Advanced FLAC Authenticity Analyzer for Detecting MP3-to-FLAC Transcodes

FLAC Detective is a professional-grade command-line tool that analyzes FLAC audio files to detect MP3-to-FLAC transcodes with high precision. Using advanced spectral analysis and an 11-rule scoring system, it helps you maintain an authentic lossless music collection.


🆕 What's new in v0.10.0: Now with ML (May 2026)

FLAC Detective ships its first learned classifier alongside the heuristic rules. A compact CNN (~1.6 MB TorchScript, bundled with the wheel) analyses a mel-spectrogram of the file and contributes an independent score to the verdict. It is particularly useful on the borderline cases the 11 hand-written rules miss: high-bitrate MP3 (256/320 kbps), AAC sources, and Opus transcodes.

  • Opt-in via pip install "flac-detective[ml]". PyTorch and librosa are optional dependencies: without them, Rule 12 is a graceful no-op and the existing 11-rule pipeline runs unchanged.
  • Trained on a Hetzner GPU server (RTX 4000 Ada) over 887 certified-authentic FLACs (CD rips verified by EAC / XLD / Audiochecker logs) plus 6,179 transcodes generated on the fly across 7 codec/bitrate combinations.
  • Test F1 = 91.4 %, recall 95.6 %, precision 87.5 %. Used with a conservative p ≥ 0.85 threshold so it only contributes points when it's highly confident, which matches FLAC Detective's "protect authentic files first" philosophy.
  • Reproducible: the full training pipeline lives in ml/ (dataset selection from your own collection's ripping logs, transcode generation, mel-spec extraction, CNN training, TorchScript export). Eight scripts, one run_pipeline.sh to chain them.
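
The reported test metrics can be cross-checked against each other, since F1 is the harmonic mean of precision and recall. A minimal check, using only the figures quoted above (the function name is ours, not part of the package):

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Figures quoted in the release notes above.
precision, recall = 0.875, 0.956
f1 = f1_score(precision, recall)
print(f"F1 = {f1:.1%}")  # prints "F1 = 91.4%"
```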

For the v0.9.7 → v0.9.11 fix trail (circular import, Docker image, documentation refresh, CLI catch-up, branch protection, …), see the CHANGELOG.


✨ Key Features

  • 🎯 High Precision Detection: 11-rule scoring system with intelligent protection mechanisms
  • 📊 4-Level Verdict System: Clear confidence ratings from AUTHENTIC to FAKE_CERTAIN
  • ⚡ Performance Optimized: 80% faster than baseline through smart caching and parallel processing
  • 🔍 Advanced Analysis: Spectral analysis, compression artifact detection, and multi-segment validation
  • 🛡️ Protection Layers: Prevents false positives for vinyl rips, cassette transfers, and high-quality MP3s
  • 📝 Flexible Output: Console reports with Rich formatting, JSON export, and detailed logging
  • 🔧 Robust Error Handling: Automatic retries, partial file reading, and comprehensive diagnostic tracking
  • 🔨 Optional Repair: With the --repair flag, corrupted FLAC files are repaired with full metadata preservation
  • 🤖 CNN classifier (optional): A small ML model bundled with the package adds a 12th scoring rule on borderline cases. Install with pip install "flac-detective[ml]" to enable it.
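
The optional 12th rule can be pictured as a gate with two conditions: the ML dependencies must be present, and the classifier must be confident. The sketch below is an illustration under stated assumptions, not the package's internal code. The names `ml_extra_installed`, `rule12_points`, and `HYPOTHETICAL_POINTS` are invented here, the point value is made up, and `probability` is assumed to be the model's estimate that a file is a transcode. Only the 0.85 threshold and the no-op behaviour come from the release notes.

```python
from importlib.util import find_spec

CONFIDENCE_THRESHOLD = 0.85  # stated in the release notes
HYPOTHETICAL_POINTS = 20     # illustrative only; the real weight is not documented here


def ml_extra_installed() -> bool:
    """True when both optional dependencies can be located."""
    return all(find_spec(name) is not None for name in ("torch", "librosa"))


def rule12_points(probability: float, ml_available: bool) -> int:
    """Points contributed by the learned rule for one file.

    `probability` is assumed to be the classifier's estimate that the
    file is a transcode.
    """
    if not ml_available:
        return 0  # graceful no-op: the heuristic rules run unchanged
    if probability < CONFIDENCE_THRESHOLD:
        return 0  # not confident enough to penalise a possibly authentic file
    return HYPOTHETICAL_POINTS


print(rule12_points(0.97, ml_available=ml_extra_installed()))
```

Passing `ml_available` in explicitly keeps the gate easy to test on machines without PyTorch installed.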

🚀 Quick Start

Installation

# Install via pip (Recommended)
pip install flac-detective

# OR with the optional CNN classifier (Rule 12)
pip install "flac-detective[ml]"

# OR run with Docker (multi-arch: linux/amd64 + linux/arm64)
docker pull ghcr.io/guillain-rdcde/flac_detective:latest

📦 See Getting Started for complete installation instructions.

Basic Usage

# Analyze current directory
flac-detective .

# Analyze specific directory
flac-detective /path/to/music

# Interactive mode (prompts for paths, accepts drag-and-drop in Windows cmd)
flac-detective

Common Options

# Show version and help
flac-detective --version
flac-detective --help

# Verbose log + JSON output to a custom path
flac-detective -v --format json --output report.json /music

# Quick scan (15 s sample instead of default 30 s)
flac-detective --sample-duration 15 /music

📖 See User Guide for detailed usage examples and command line options.
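
When driving the CLI from a script, it can help to assemble the argument list programmatically. The following is a minimal sketch that uses only the flags shown above (-v, --format json, --output, --sample-duration). The helper names `build_command` and `run_scan` are hypothetical, and the meaning of the tool's exit codes is not documented here, so the sketch simply returns the code.

```python
import subprocess
from typing import List, Optional


def build_command(music_dir: str, report_path: Optional[str] = None,
                  sample_duration: Optional[int] = None,
                  verbose: bool = False) -> List[str]:
    """Assemble a flac-detective invocation from the options shown above."""
    cmd = ["flac-detective"]
    if verbose:
        cmd.append("-v")
    if report_path is not None:
        cmd += ["--format", "json", "--output", report_path]
    if sample_duration is not None:
        cmd += ["--sample-duration", str(sample_duration)]
    cmd.append(music_dir)
    return cmd


def run_scan(cmd: List[str]) -> int:
    """Run the command and return its exit code (needs flac-detective on PATH)."""
    return subprocess.run(cmd, check=False).returncode


print(build_command("/music", report_path="report.json", verbose=True))
# prints ['flac-detective', '-v', '--format', 'json', '--output', 'report.json', '/music']
```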

Try it Now (No Installation Required)

Option 1: Docker with Sample File

# Download a sample FLAC file (public domain)
curl -LO https://archive.org/download/test_flac/sample.flac

# Run analysis with Docker (mount current directory)
docker run --rm -v "$(pwd)":/data ghcr.io/guillain-rdcde/flac_detective:latest /data/sample.flac

Option 2: Quick Python Test

# Using Python (if you have pip installed)
pip install flac-detective
flac-detective --version
flac-detective --help

Option 3: Interactive Demo Script ⭐ (Best for Quick Test)

# Clone and run demo with synthetic test files
git clone https://github.com/Guillain-RDCDE/FLAC_Detective.git
cd FLAC_Detective
pip install -e .
python examples/quick_test.py

This creates test files and shows FLAC Detective in action in 30 seconds!

Option 4: GitHub Codespaces (Fully Interactive Online)

  1. Click the "Code" button → "Codespaces" → "Create codespace"
  2. Wait for environment setup (~30 seconds)
  3. Run: pip install -e . && python examples/quick_test.py

No sample files? The tool works with any FLAC file from your music collection!


🎬 Demo

Example Output

======================================================================
  FLAC AUTHENTICITY ANALYZER
  Detection of MP3s transcoded to FLAC
======================================================================

โ ‹ Analyzing audio files... โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”  15% 0:02:34

======================================================================
  ANALYSIS COMPLETE
======================================================================
  FLAC files analyzed: 245
  Authentic files: 215 (87.8%)
  Fake/Suspicious files: 12 (4.9%)
  Text report: flac_report_20251220_143022.txt
======================================================================

⚡ Performance

FLAC Detective is optimized for both speed and accuracy:

  • Speed: 2-5 seconds per file (30s sample, default)
  • Throughput: 700-1,800 files/hour on modern hardware
  • Memory: ~150-300 MB peak usage
  • Optimization: 80% faster than baseline through intelligent caching and parallel processing
  • Scalability: Handles libraries with 10,000+ files efficiently
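
The throughput figure follows from the per-file time. A quick check, assuming files are processed one after another (the function name is ours):

```python
def files_per_hour(seconds_per_file: float) -> float:
    """Sequential throughput for a given per-file analysis time."""
    return 3600 / seconds_per_file


fastest, slowest = files_per_hour(2), files_per_hour(5)
print(f"{slowest:.0f} to {fastest:.0f} files/hour")  # prints "720 to 1800 files/hour"
```

This lines up with the quoted 700-1,800 files/hour, with the lower bound rounded down.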

Customizable Performance:

# Faster analysis (15 s sample per file), good for quick scans
flac-detective /music --sample-duration 15

# Balanced (30 s sample per file), the default and recommended setting
flac-detective /music

# More thorough (60 s sample per file), maximum accuracy
flac-detective /music --sample-duration 60

❓ Frequently Asked Questions

Does it work on Windows/Mac/Linux?

Yes! FLAC Detective is cross-platform and works on:

  • ✅ Windows (7, 10, 11)
  • ✅ macOS (10.14+)
  • ✅ Linux (all major distributions)

How accurate is the detection?

FLAC Detective uses an 11-rule scoring system with protection layers:

  • High confidence: >95% accuracy for AUTHENTIC and FAKE_CERTAIN verdicts
  • Protection mechanisms: Prevents false positives for vinyl rips, cassette transfers, and high-quality sources
  • 4-level system: AUTHENTIC, WARNING, SUSPICIOUS, FAKE_CERTAIN for nuanced results

Will it damage or modify my files?

No! FLAC Detective is read-only by default:

  • ✅ Only analyzes files and never modifies them unless you pass --repair
  • ✅ Safe for your entire music collection
  • ✅ Optional --repair flag for corrupted files (preserves all metadata)

Can I trust the results?

Yes, but use common sense:

  • ✅ AUTHENTIC (score ≤30): Very high confidence, keep the file
  • ⚡ WARNING (31-60): Borderline case, manual verification recommended
  • ⚠️ SUSPICIOUS (61-85): High confidence transcode, consider replacing
  • ❌ FAKE_CERTAIN (≥86): Multiple indicators, definitely a transcode

For critical decisions, use complementary tools (e.g., Spek for visual spectral analysis) to confirm.
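
The score bands above translate directly into a lookup. This is a minimal sketch of that mapping using the thresholds listed in this FAQ. The function name `verdict_for` is hypothetical and this is not necessarily how the package implements it.

```python
def verdict_for(score: int) -> str:
    """Map a suspicion score to one of the four verdict levels."""
    if score <= 30:
        return "AUTHENTIC"
    if score <= 60:
        return "WARNING"
    if score <= 85:
        return "SUSPICIOUS"
    return "FAKE_CERTAIN"


for score in (12, 45, 70, 92):
    print(score, verdict_for(score))
```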

What file formats are supported?

Currently:

  • ✅ FLAC files (.flac)
  • 🔜 Future: WAV, ALAC, APE (planned for v1.0)

How long does analysis take?

  • Single file: 2-5 seconds (30s sample)
  • 100 files: ~5-10 minutes
  • 1,000 files: ~50-90 minutes
  • 10,000 files: ~8-15 hours

Use --sample-duration 15 for faster scans of large libraries.

Can I use it in my own application?

Yes! FLAC Detective provides a Python API:

from flac_detective import FLACAnalyzer

analyzer = FLACAnalyzer()
result = analyzer.analyze_file("song.flac")
print(result['verdict'])  # AUTHENTIC, WARNING, SUSPICIOUS, or FAKE_CERTAIN

See examples/ directory for integration examples.
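
For batch use, one option is to tally verdicts across many files. The sketch below takes the analysis function as a parameter, so it runs without the package or any audio files. `tally_verdicts` and `fake_analyze` are hypothetical names, and the only thing assumed about the real API is what the snippet above shows: a call that takes a path and returns a mapping with a 'verdict' key.

```python
from collections import Counter
from typing import Callable, Dict, Iterable


def tally_verdicts(paths: Iterable[str],
                   analyze: Callable[[str], Dict[str, object]]) -> Counter:
    """Count verdicts over many files using any analyze_file-style callable."""
    counts: Counter = Counter()
    for path in paths:
        result = analyze(path)
        counts[str(result["verdict"])] += 1
    return counts


# Stand-in analyzer so the sketch runs without audio files (hypothetical).
def fake_analyze(path: str) -> Dict[str, object]:
    return {"verdict": "FAKE_CERTAIN" if "transcode" in path else "AUTHENTIC"}


print(tally_verdicts(["a.flac", "b_transcode.flac", "c.flac"], fake_analyze))
# prints Counter({'AUTHENTIC': 2, 'FAKE_CERTAIN': 1})
```

With the package installed, passing `FLACAnalyzer().analyze_file` as `analyze` should work in place of the stand-in, provided the result behaves as in the snippet above.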

Is it free and open source?

Yes! MIT License:

  • โœ… Free for personal and commercial use
  • โœ… Open source on GitHub
  • โœ… Contributions welcome

How can I contribute?

See CONTRIBUTING.md for:

  • Bug reports and feature requests
  • Code contributions
  • Documentation improvements
  • Testing and feedback

📚 Documentation

Detailed documentation is available in the docs/ directory.


🎯 Use Cases

  • Library Maintenance: Clean your music collection of fake lossless files
  • Quality Verification: Validate FLAC authenticity before archiving
  • Batch Processing: Analyze large music libraries efficiently
  • Format Validation: Ensure genuine lossless quality for critical listening

💡 Quick Examples

See the examples/ directory for ready-to-run scripts.


🤝 Contributing

Contributions are welcome! Please read our CONTRIBUTING.md for detailed guidelines and CODE_OF_CONDUCT.md for community standards.


🔒 Security

For security policy and vulnerability reporting, please see SECURITY.md.


📝 License

This project is licensed under the MIT License - see the LICENSE file for details.


🙏 Acknowledgements

Thanks to the community members who took the time to report bugs and confirm fixes. First issues are special.

  • @GearKite: Filed #7 with a clean traceback that pinpointed the circular import in v0.9.6, and #6 spotting the underscore-vs-dash Docker image name.
  • @Aakiles: Diagnosed the circular import end-to-end and shipped a working patch via comment. The v0.9.7 fix is a refinement of his approach.
  • @AnotherMuggle and @tomelephant-git: Confirmed the fix across operating systems, including Windows 11 LTSC.
  • @AKHwyJunkie: Confirmed the v0.9.6 import crash, validating @GearKite's report.
  • @pblue3: First reported the Docker image inaccessibility (#6).


FLAC Detective - Maintaining authentic lossless audio collections
