Advanced FLAC authenticity analyzer - Detects MP3-to-FLAC transcodes with high precision

🎵 FLAC Detective

Advanced FLAC Authenticity Analyzer for Detecting MP3-to-FLAC Transcodes

FLAC Detective is a professional-grade command-line tool that analyzes FLAC audio files to detect MP3-to-FLAC transcodes with high precision. Using advanced spectral analysis and an 11-rule scoring system, it helps you maintain an authentic lossless music collection.


✨ Key Features

  • 🎯 High Precision Detection: 11-rule scoring system with intelligent protection mechanisms
  • 📊 4-Level Verdict System: Clear confidence ratings from AUTHENTIC to FAKE_CERTAIN
  • ⚡ Performance Optimized: 80% faster than baseline through smart caching and parallel processing
  • 🔍 Advanced Analysis: Spectral analysis, compression artifact detection, and multi-segment validation
  • 🛡️ Protection Layers: Prevents false positives for vinyl rips, cassette transfers, and high-quality MP3s
  • 📝 Flexible Output: Console reports with Rich formatting, JSON export, and detailed logging
  • 🔧 Robust Error Handling: Automatic retries, partial file reading, and comprehensive diagnostic tracking
  • 🔨 Automatic Repair: Corrupted FLAC files are automatically repaired with full metadata preservation

🚀 Quick Start

Installation

Option 1: Install via pip (Recommended)

pip install flac-detective

Option 2: Run with Docker

# Pull from GitHub Container Registry
docker pull ghcr.io/guillainm/flac-detective:latest

# Analyze files
docker run --rm -v /path/to/audio:/data ghcr.io/guillainm/flac-detective:latest /data

📦 See Getting Started for complete installation and usage documentation.

Basic Usage

Command Line

# Analyze current directory
flac-detective .

# Analyze specific directory
flac-detective /path/to/music

# Generate JSON report
flac-detective /path/to/music --format json

# Verbose output with detailed analysis
flac-detective /path/to/music --verbose

Docker

# Analyze a directory
docker run --rm -v /path/to/audio:/data ghcr.io/guillainm/flac-detective:latest /data

# With repair enabled
docker run --rm -v /path/to/audio:/data ghcr.io/guillainm/flac-detective:latest /data --repair

# Generate JSON report
docker run --rm -v /path/to/audio:/data ghcr.io/guillainm/flac-detective:latest /data --format json > report.json

📖 How It Works

Detection Rules

FLAC Detective uses 11 independent rules with additive scoring (0-150 points):

Rule      Description                              Points
Rule 1    MP3 Spectral Signature (CBR patterns)    +50
Rule 2    Cutoff Frequency Analysis                +50
Rule 3    Bitrate Inflation Detection              +50
Rule 4    Suspicious 24-bit Detection              +30
Rule 5    High Variance Protection (VBR)           -40
Rule 6    High Quality Protection                  -30
Rule 7    Vinyl & Silence Analysis                 -100
Rule 8    Nyquist Exception                        -50
Rule 9    Compression Artifacts                    +30
Rule 10   Multi-Segment Consistency                Variable
Rule 11   Cassette Detection                       -60
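
As a rough illustration of additive scoring, the sketch below sums the point values from the table for a set of triggered rules. It is not the project's implementation: the names RULE_POINTS and total_score are hypothetical, it assumes every triggered rule contributes its full listed value, and it assumes the total is clamped to the stated 0-150 range.

```python
# Point values copied from the rules table. Rule 10 is listed as "Variable",
# so its contribution is supplied by the caller instead of being hard-coded.
RULE_POINTS = {
    1: 50, 2: 50, 3: 50, 4: 30, 5: -40,
    6: -30, 7: -100, 8: -50, 9: 30, 11: -60,
}


def total_score(triggered_rules, rule10_points=0, lowest=0, highest=150):
    """Sum the points of the triggered rules and clamp the result.

    Assumption: clamping to [lowest, highest] is how the 0-150 range is
    enforced; the real tool may handle this differently.
    """
    raw = sum(RULE_POINTS[rule] for rule in triggered_rules) + rule10_points
    return max(lowest, min(highest, raw))


if __name__ == "__main__":
    # Rules 1 and 2 fire, then the VBR protection (Rule 5) lowers the total.
    print(total_score({1, 2, 5}))
```

Under these assumptions, protection rules with negative points can cancel out detection rules, which is why a vinyl rip (Rule 7) would end up at the bottom of the scale.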

Verdict System

Based on the total score, FLAC Detective assigns one of four verdicts:

Score ≤ 30   → ✅ AUTHENTIC      (High confidence - genuine lossless)
Score 31-60  → ⚡ WARNING        (Manual review recommended)
Score 61-85  → ⚠️  SUSPICIOUS    (Likely transcode)
Score ≥ 86   → ❌ FAKE_CERTAIN   (Definite transcode)
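
The thresholds above can be expressed as a small lookup. This is a minimal sketch for readers who post-process scores themselves; the function name verdict_for_score is hypothetical and not part of the FLAC Detective API.

```python
def verdict_for_score(score):
    """Map a total score to one of the four verdict names listed above."""
    if score <= 30:
        return "AUTHENTIC"
    if score <= 60:
        return "WARNING"
    if score <= 85:
        return "SUSPICIOUS"
    return "FAKE_CERTAIN"


if __name__ == "__main__":
    for value in (0, 45, 70, 120):
        print(value, verdict_for_score(value))
```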

Protection Mechanisms

The tool implements a multi-layer protection system to prevent false positives:

  1. Absolute Protection (Rule 8): Protects files with cutoff near Nyquist frequency
  2. MP3 320k Protection (Rule 1): Exception for high-quality MP3 320 kbps
  3. Analog Source Protection (Rules 7, 11): Detects vinyl rips and cassette transfers
  4. Dynamic Protection (Rule 10): Validates consistency across file segments
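
To make the Nyquist idea in item 1 concrete: the Nyquist frequency is half the sample rate, so a 44.1 kHz file can carry content up to 22.05 kHz. The sketch below shows one way a "cutoff near Nyquist" test could look. The function names and the 5% tolerance are assumptions for illustration only; the README does not state how close Rule 8 requires the cutoff to be.

```python
def nyquist_hz(sample_rate_hz):
    """Return the Nyquist frequency, which is half the sample rate."""
    return sample_rate_hz / 2


def cutoff_near_nyquist(cutoff_hz, sample_rate_hz, tolerance=0.05):
    """Return True when the cutoff lies within `tolerance` of Nyquist.

    Assumption: the 5% default is a placeholder, not the tool's threshold.
    """
    return cutoff_hz >= (1 - tolerance) * nyquist_hz(sample_rate_hz)


if __name__ == "__main__":
    # A 21.5 kHz cutoff in a 44.1 kHz file is close to the 22.05 kHz limit.
    print(cutoff_near_nyquist(21_500, 44_100))
    # A 16 kHz cutoff is well below it.
    print(cutoff_near_nyquist(16_000, 44_100))
```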

🆕 What's New in v0.9.0

Complete Project Restructuring and Documentation Overhaul

  • Professional Documentation Structure: Reorganized all documentation into audience-specific directories (user-guide, technical, reference, development, automation, ci-cd)
  • Comprehensive Navigation: Added PROJECT_OVERVIEW.md and DOCUMENTATION_GUIDE.md for easy navigation
  • Clean Root Directory: Removed 9+ temporary implementation files and build artifacts
  • 113 Total Changes: 78 new files added, professional project structure, production-ready organization
  • Enhanced Discoverability: Clear separation between user docs, technical docs, API reference, and developer guides

Complete CI/CD Automation

  • GitHub Actions Workflows: Automated testing, building, security scanning, and releases
  • Docker Support: Pre-built images on GitHub Container Registry
  • Security Scanning: CodeQL, Bandit, Safety, pip-audit
  • Automated Releases: PyPI publishing via Trusted Publishers
  • Performance Benchmarking: Automated performance regression detection
  • Code Quality: Pre-commit hooks, coverage reporting, linting

Community Standards

  • CODE_OF_CONDUCT.md: Community guidelines and standards
  • CONTRIBUTING.md: Comprehensive contribution guide
  • SECURITY.md: Security policy and vulnerability reporting
  • Issue Templates: Bug reports, feature requests, performance issues, documentation, questions
  • Pull Request Template: Structured PR workflow

For previous releases, see CHANGELOG.md


💻 Usage Examples

Command Line

# Basic analysis
flac-detective /path/to/music

# Save report to file
flac-detective /path/to/music --output report.txt

# JSON output for automation
flac-detective /path/to/music --format json > results.json

# Verbose mode with detailed rule execution
flac-detective /path/to/music --verbose

Python API

from flac_detective import FLACAnalyzer
from pathlib import Path

# Create analyzer
analyzer = FLACAnalyzer(sample_duration=30.0)

# Analyze a file
result = analyzer.analyze_file(Path('song.flac'))

print(f"Verdict: {result['verdict']}")
print(f"Score: {result['score']}")  # 0-150 scale, see Detection Rules
print(f"Reason: {result['reason']}")
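
When analyzing many files through the Python API, it can help to tally verdicts and pull out the files that need attention. The helpers below are a sketch, not part of the library: summarize_verdicts and flagged are hypothetical names, and they assume only that each result can be indexed with 'verdict', as in the example above.

```python
from collections import Counter


def summarize_verdicts(results):
    """Count how many results fall under each verdict."""
    return Counter(result["verdict"] for result in results)


def flagged(results, verdicts=("SUSPICIOUS", "FAKE_CERTAIN")):
    """Return the results whose verdict is in `verdicts`, in input order."""
    return [result for result in results if result["verdict"] in verdicts]


if __name__ == "__main__":
    # Stand-in data shaped like the result shown above.
    sample = [
        {"verdict": "AUTHENTIC", "score": 0, "reason": "example"},
        {"verdict": "FAKE_CERTAIN", "score": 100, "reason": "example"},
    ]
    print(summarize_verdicts(sample))
    print(flagged(sample))
```

In practice the list could be built with something like [analyzer.analyze_file(p) for p in Path('/path/to/music').rglob('*.flac')], using the analyzer from the example above.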

📚 See API Reference for complete Python API documentation


📦 Requirements

Python Dependencies

  • Python 3.8 or higher (3.8 to 3.12 are the supported and tested versions)
  • numpy >= 1.20.0
  • scipy >= 1.7.0
  • mutagen >= 1.45.0
  • soundfile >= 0.10.0
  • rich >= 13.0.0

Optional System Dependencies

The flac command-line tool is recommended for advanced features:

Linux (Debian/Ubuntu):

sudo apt-get install flac

macOS:

brew install flac

Windows: Download from Xiph.org FLAC


🏗️ Development

Installation from Source

# Clone the repository
git clone https://github.com/GuillainM/FLAC_Detective.git
cd FLAC_Detective

# Create virtual environment
python -m venv venv

# Activate virtual environment
# Linux/macOS:
source venv/bin/activate
# Windows:
venv\Scripts\activate

# Install in development mode
pip install -e ".[dev]"

Running Tests

# Run all tests
pytest

# Run with coverage report
pytest --cov=flac_detective --cov-report=html

# Run specific test file
pytest tests/test_new_scoring_rules.py -v

Version Management & Releases

FLAC Detective uses Commitizen for automated changelog generation and version management.

# Install pre-commit hooks (includes commit message validation)
pre-commit install --hook-type commit-msg

# Create a conventional commit interactively
cz commit

# Bump version and update CHANGELOG automatically
cz bump --changelog

# Or use the helper script
python scripts/bump_version.py --dry-run  # Preview changes
python scripts/bump_version.py --push     # Bump and push to trigger release

All commits must follow the Conventional Commits format:

  • feat: - New features (bumps MINOR version)
  • fix: - Bug fixes (bumps PATCH version)
  • docs: - Documentation changes
  • refactor: - Code refactoring
  • perf: - Performance improvements
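
For a quick local check of commit messages against the prefixes listed above, a sketch along these lines may help. It is not how Commitizen works internally: parse_header and bump_for are hypothetical names, only the two bump rules stated above are encoded, breaking-change markers are ignored, and the project's actual Commitizen configuration may bump the version for other types as well.

```python
import re

# Accepts "type: subject" or "type(scope): subject" for the types listed above.
HEADER = re.compile(r"^(feat|fix|docs|refactor|perf)(\([\w./-]+\))?: \S.*$")

# Only the bumps stated in the list above; other types map to no bump here.
BUMPS = {"feat": "MINOR", "fix": "PATCH"}


def parse_header(message):
    """Return the commit type from the first line, or None if it does not match."""
    lines = message.splitlines()
    match = HEADER.match(lines[0]) if lines else None
    return match.group(1) if match else None


def bump_for(message):
    """Return "MINOR", "PATCH", or None for a commit message."""
    return BUMPS.get(parse_header(message))


if __name__ == "__main__":
    print(bump_for("feat: add JSON export"))
    print(bump_for("Add amazing feature"))
```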

See Changelog Automation Guide for detailed documentation on version management.

Project Structure

src/flac_detective/
├── analysis/
│   ├── new_scoring/          # 11-rule scoring system
│   │   ├── rules/            # Individual rule implementations
│   │   ├── calculator.py     # Score orchestration
│   │   └── verdict.py        # Score interpretation
│   ├── spectrum.py           # Spectral analysis
│   └── audio_cache.py        # Optimized file reading
├── reporting/                # Report generation
└── main.py                   # CLI entry point

📚 Documentation

Complete documentation is available in the docs/ directory.


🎯 Use Cases

✅ Ideal For

  • Library Maintenance: Clean your music collection of fake lossless files
  • Quality Verification: Validate FLAC authenticity before archiving
  • Batch Processing: Analyze large music libraries efficiently
  • Format Validation: Ensure genuine lossless quality for critical listening

⚠️ Limitations

  • Only analyzes FLAC files (other lossless formats not supported)
  • Designed for batch analysis, not real-time processing
  • Detects transcodes, not subjective audio quality
  • May require manual review for edge cases (WARNING verdicts)

🤝 Contributing

Contributions are welcome! Please read our CONTRIBUTING.md for detailed guidelines and CODE_OF_CONDUCT.md for community standards.

📋 Issue Templates

We provide templates for different types of contributions:

  1. 🐛 Bug Report: Report bugs or unexpected behavior
  2. ✨ Feature Request: Suggest new features or enhancements
  3. ⚡ Performance Issue: Report slow performance or resource issues
  4. 📝 Documentation Issue: Report documentation problems
  5. ❓ Question: Ask questions about usage

View Issue Templates Guide for detailed information.

How to Contribute

  1. Report Issues: Use the appropriate issue template
  2. Suggest Features: Submit a feature request
  3. Start Discussions: Join GitHub Discussions
  4. Submit PRs: Read CONTRIBUTING.md first, then fork the repo, create a feature branch, and submit a pull request
  5. Improve Docs: Documentation improvements are always appreciated

Community Guidelines

Please follow our Code of Conduct to maintain a welcoming and inclusive environment for all contributors.

Development Workflow

# Fork and clone the repository
git clone https://github.com/YOUR_USERNAME/FLAC_Detective.git
cd FLAC_Detective

# Install development dependencies
pip install -e ".[dev]"

# Set up pre-commit hooks for code quality
python scripts/setup_precommit.py
# Or manually: pre-commit install

# Create feature branch
git checkout -b feature/amazing-feature

# Make changes and run tests
pytest tests/unit/ -v                    # Unit tests
pytest tests/integration/ -v             # Integration tests
pytest --cov=flac_detective              # With coverage

# Code quality checks (runs automatically on commit via pre-commit hooks)
pre-commit run --all-files               # Run all checks manually
black src tests                          # Format code
isort src tests                          # Sort imports
flake8 src tests                         # Lint code
mypy src                                 # Type check

# Commit and push (pre-commit hooks run automatically)
git commit -m "feat: add amazing feature"
git push origin feature/amazing-feature

# Open Pull Request on GitHub

Python Version Requirements:

  • Supported: Python 3.8 - 3.12
  • Testing: Use Python 3.8-3.12 for running tests (scipy/numpy compatibility)

Running Tests:

# Run all unit tests
pytest tests/unit/ -v

# Run integration tests
pytest tests/integration/ -v

# Run with coverage report
pytest --cov=flac_detective --cov-report=html

# See tests/TESTING_STATUS.md for detailed testing guide

🔒 Security

Security is a priority for FLAC Detective. We use multiple automated tools to ensure code and dependency security.

Security Features

  • 🛡️ Dependabot: Automated dependency updates for security patches
  • 🔍 CodeQL: Static code analysis for vulnerability detection
  • 🚨 Bandit: Python security linter
  • 📦 Safety & Pip-audit: Dependency vulnerability scanners
  • 📋 Security Policy: Responsible disclosure process

Reporting Vulnerabilities

Please do NOT report security vulnerabilities through public GitHub issues.

Email security issues to: guillain@poulpe.us

See SECURITY.md for:

  • Supported versions
  • Reporting guidelines
  • Security best practices
  • Vulnerability disclosure process


📝 License

This project is licensed under the MIT License - see the LICENSE file for details.


🙏 Acknowledgments

  • Audio analysis community for MP3 compression research
  • Contributors to NumPy, SciPy, and Soundfile libraries
  • Beta testers and community feedback

📞 Support


FLAC Detective v0.9.0 - Maintaining authentic lossless audio collections

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

flac_detective-0.9.1.tar.gz (237.0 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

flac_detective-0.9.1-py3-none-any.whl (105.7 kB view details)

Uploaded Python 3

File details

Details for the file flac_detective-0.9.1.tar.gz.

File metadata

  • Download URL: flac_detective-0.9.1.tar.gz
  • Upload date:
  • Size: 237.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.10.19

File hashes

Hashes for flac_detective-0.9.1.tar.gz
Algorithm     Hash digest
SHA256        e2d1b13c3dea084baae2688340a3a73b0c81af9405788f994ec2ef23fa08511d
MD5           84f91f20cae1139c60441ff4bdccfbcd
BLAKE2b-256   a44d4c07d68098ad1651ac412fefb9b0b05fd588486725a113e871da8073cfdc

See more details on using hashes here.

File details

Details for the file flac_detective-0.9.1-py3-none-any.whl.

File metadata

  • Download URL: flac_detective-0.9.1-py3-none-any.whl
  • Upload date:
  • Size: 105.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.10.19

File hashes

Hashes for flac_detective-0.9.1-py3-none-any.whl
Algorithm     Hash digest
SHA256        ec95c2c7e43104992ba7731ec7df46f18ddd1cec1dfacf7b9085dfe15a855986
MD5           e87fb384df66b3ed3b22808a798afe4a
BLAKE2b-256   bc0efa98fdde328e619a1fb82b3c9e55e052f3689bc2113719be6ad87fa6c19b

See more details on using hashes here.
