
Classify and organise a music library by artist genre via MusicBrainz


Music Classifier & Cleaner

Classifies a music library by artist genre (via MusicBrainz) and reorganises folders by genre.

Installation

# Install in editable mode (recommended for development)
pip install -e .

# Or with Poetry
poetry install

Commands

| Command | Purpose |
| --- | --- |
| classify-organize | Scan library, classify artists by genre via MusicBrainz, reorganise into genre folders |
| scan-library | Scan library for artists with few songs, output a CSV for manual review |
| discover-from-library | Process the review CSV: remove artist folders or explore top tracks via Deezer |
| tag-library-genres | Tag all audio files with top 3 MusicBrainz genres + language tag |

All commands are available on your PATH once the package is installed with pip install -e . (see Installation). Alternatively, use poetry run <command>.


classify-organize — Classify and organise by genre

Scans the library root for artist folders (any top-level directory that isn't a genre folder), plus loose MP3s without a parent artist folder. For each artist:

  1. Deduplicates similar folder names using fuzzy matching (e.g. "Greenday" → "Green Day")
  2. Renames the folder to the canonical name, merging if the target already exists
  3. Updates the artist ID3 tag in every MP3 inside the folder to match the canonical name
  4. Queries MusicBrainz for the artist's genre tags
  5. Classifies the tags into one of the predefined genre buckets via keyword matching
  6. Moves the entire artist folder into the matching genre subfolder (or other/ if nothing matched)

classify-organize /path/to/music/library
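The fuzzy-matching and keyword-classification steps above can be sketched as follows. This is a minimal illustration, not the package's actual code: the GENRE_KEYWORDS table, the 0.9 similarity threshold, and the function names normalize, canonical_name, and classify are all assumptions, and difflib stands in for whatever fuzzy matcher the project really uses.

```python
from difflib import SequenceMatcher

# Hypothetical genre buckets and keywords; the real project defines its own.
GENRE_KEYWORDS = {
    "rock": ["rock", "punk", "grunge"],
    "jazz": ["jazz", "swing", "bebop"],
    "electronic": ["electronic", "house", "techno"],
}


def normalize(name: str) -> str:
    """Lowercase and drop non-alphanumerics so spacing and punctuation are ignored."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def canonical_name(name: str, known: list[str], threshold: float = 0.9) -> str:
    """Return the most similar known artist name, or the input if none is close enough."""
    best, best_score = name, 0.0
    for candidate in known:
        score = SequenceMatcher(None, normalize(name), normalize(candidate)).ratio()
        if score > best_score:
            best, best_score = candidate, score
    return best if best_score >= threshold else name


def classify(tags: list[str]) -> str:
    """Map tags to the first bucket whose keyword appears in a tag, else 'other'."""
    for tag in tags:
        lowered = tag.lower()
        for genre, keywords in GENRE_KEYWORDS.items():
            if any(keyword in lowered for keyword in keywords):
                return genre
    return "other"
```

With these assumptions, canonical_name("Greenday", ["Green Day"]) resolves to "Green Day", and classify(["pop punk"]) lands in the rock bucket. Substring keyword matching is simple but coarse, so a real keyword table needs care with short keywords that appear inside unrelated tags.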

scan-library — Scan for sparse artists

Walks every genre subfolder, counts songs per artist, and writes a CSV of artists with few songs for manual review.

  • Skips empty artist folders (prompts to delete them)
  • Also checks loose MP3s in the library root (tagged as genre "root")
  • Outputs artists_to_review.csv in the library root with columns: artist, genre, song_count, path, decision

# Default threshold: 4 songs
scan-library /path/to/music/library

# Custom threshold
scan-library /path/to/music/library -t 3

After filling in the decision column (remove or explore), process the CSV with discover-from-library.
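The scan described above might look roughly like the sketch below. The function names, the set of file suffixes counted as songs, and the inclusive comparison against the threshold are assumptions made for illustration; the loose-MP3 check and the prompt for empty folders are left out to keep it short.

```python
import csv
from pathlib import Path

AUDIO_SUFFIXES = {".mp3", ".flac"}  # assumption: which files count as songs
COLUMNS = ["artist", "genre", "song_count", "path", "decision"]


def find_sparse_artists(library: Path, threshold: int = 4) -> list[dict]:
    """Collect artists under library/<genre>/<artist> with few songs."""
    rows = []
    for genre_dir in sorted(p for p in library.iterdir() if p.is_dir()):
        for artist_dir in sorted(p for p in genre_dir.iterdir() if p.is_dir()):
            song_count = sum(
                1
                for f in artist_dir.rglob("*")
                if f.is_file() and f.suffix.lower() in AUDIO_SUFFIXES
            )
            # Assumption: "few" means at most `threshold` songs; empty folders are skipped.
            if 0 < song_count <= threshold:
                rows.append(
                    {
                        "artist": artist_dir.name,
                        "genre": genre_dir.name,
                        "song_count": song_count,
                        "path": str(artist_dir),
                        "decision": "",  # filled in by hand: remove or explore
                    }
                )
    return rows


def write_review_csv(rows: list[dict], csv_path: Path) -> None:
    """Write the review rows with the column order listed in the README."""
    with csv_path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=COLUMNS)
        writer.writeheader()
        writer.writerows(rows)
```

Leaving the decision column empty keeps the CSV safe by default: a row only triggers an action once someone has typed a value into it.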


discover-from-library — Process review decisions

Reads the CSV produced by scan-library and acts on each row:

  • remove — deletes the entire artist folder via shutil.rmtree
  • explore — looks up the artist on Deezer and prints their top 5 tracks with durations

discover-from-library /path/to/artists_to_review.csv
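A rough sketch of the row-by-row dispatch is shown below. The function name process_review_csv, the dry_run flag, and the explore callback are hypothetical; the callback stands in for the Deezer lookup so the example runs without network access. Only the use of shutil.rmtree for removal comes from the description above.

```python
import csv
import shutil
from pathlib import Path


def process_review_csv(csv_path: Path, explore=print, dry_run: bool = False) -> dict:
    """Apply each row's decision and return a count of actions taken."""
    counts = {"remove": 0, "explore": 0, "skipped": 0}
    with csv_path.open(newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            decision = (row.get("decision") or "").strip().lower()
            if decision == "remove":
                if not dry_run:
                    # Irreversible: deletes the artist folder and everything in it.
                    shutil.rmtree(row["path"], ignore_errors=True)
                counts["remove"] += 1
            elif decision == "explore":
                # Placeholder for the Deezer top-tracks lookup.
                explore(row["artist"])
                counts["explore"] += 1
            else:
                # Blank or unrecognised decisions are left untouched.
                counts["skipped"] += 1
    return counts
```

Because removal cannot be undone, running once with dry_run=True and checking the returned counts is a cheap safeguard in this sketch.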

tag-library-genres — Tag library with genres from MusicBrainz

Recursively walks every .mp3 and .flac file in the library. For each file:

  1. Reads the artist from the file's metadata (EasyID3 for MP3, FLAC Vorbis comments for FLAC)
  2. Looks up the artist on MusicBrainz — fetches genre tags and detects language from tags
  3. Verifies the match — only tags the file if the MusicBrainz matched name matches the file's artist tag (case-insensitive). Skips if they differ (e.g. MusicBrainz returned a different artist)
  4. Writes tags — up to 3 genre tags and an ISO 639-2 language code
  5. Skips already-tagged files — checks existing tags before querying or writing

Caches MusicBrainz results per artist so files by the same artist only trigger one API lookup.
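The per-artist cache and the name-verification step can be illustrated with the sketch below. It assumes functools.lru_cache as the caching mechanism and uses a fake in-memory lookup (fetch_artist, with made-up data) in place of the real MusicBrainz request; none of these names come from the package itself.

```python
from functools import lru_cache

CALLS = []  # records every simulated API request so the cache effect is visible


def fetch_artist(key: str):
    """Placeholder for the MusicBrainz request; returns (name, genres, language) or None."""
    CALLS.append(key)
    fake_results = {
        "green day": ("Green Day", ("punk rock", "alternative rock", "pop punk", "rock"), "eng"),
        # Simulates the service returning a differently named artist for a query.
        "greenday": ("Green Day", ("punk rock",), "eng"),
    }
    return fake_results.get(key)


@lru_cache(maxsize=None)
def lookup(artist_key: str):
    """One request per distinct artist key; repeat calls are served from the cache."""
    return fetch_artist(artist_key)


def tags_for_file(file_artist: str):
    """Return (up to 3 genres, language) if the matched name agrees with the file's artist."""
    wanted = file_artist.strip()
    result = lookup(wanted.lower())
    if result is None:
        return None
    matched_name, genres, language = result
    if matched_name.casefold() != wanted.casefold():
        return None  # a different artist was returned, so the file is skipped
    return list(genres[:3]), language
```

In this sketch, tagging two files by "Green Day" and "GREEN DAY" triggers a single simulated request, while a file tagged "Greenday" is skipped because the matched name differs.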

# Preview only (no files are modified)
tag-library-genres /path/to/music/library -n

# Tag all files
tag-library-genres /path/to/music/library

Tags written:

| Format | Genre | Language |
| --- | --- | --- |
| MP3 | TCON frame: comma-separated string (e.g. "Swing, Jazz, Big Band") | TLAN frame: ISO 639-2 code (e.g. "eng") |
| FLAC | Multiple GENRE Vorbis comments | LANGUAGE Vorbis comment |
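The difference between the two formats can be made concrete with a small helper that builds the values to write. The function build_tag_values is hypothetical and stops short of touching any file; actually writing the values would go through a tagging library, which is not shown here.

```python
def build_tag_values(file_format: str, genres: list[str], language: str) -> dict:
    """Return the tag names and values to write for one file (at most 3 genres)."""
    top_genres = genres[:3]
    if file_format == "mp3":
        # ID3: a single TCON frame holding one comma-separated string.
        return {"TCON": ", ".join(top_genres), "TLAN": language}
    if file_format == "flac":
        # Vorbis comments: one GENRE entry per genre, so a list of values.
        return {"GENRE": top_genres, "LANGUAGE": language}
    raise ValueError(f"unsupported format: {file_format!r}")
```

For example, the genres Swing, Jazz, Big Band, and Bebop would yield the single string "Swing, Jazz, Big Band" for an MP3 and a three-item list for a FLAC file.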

Running tests

# Install test dependencies
pip install pytest pytest-cov

# Run all tests
pytest tests/ -v

# Run with coverage report
pytest tests/ --cov=music_classifier.utils -v

# Run with coverage + line numbers of misses
pytest tests/ --cov=music_classifier.utils --cov-report=term-missing -v

# Generate HTML coverage report
pytest tests/ --cov=music_classifier.utils --cov-report=html
open htmlcov/index.html

Project structure

music-classifier-cleaner/
├── pyproject.toml          # Poetry config + console_scripts entry points
├── README.md
├── music_classifier/       # Installable Python package
│   ├── __init__.py
│   ├── utils.py            # Core utilities (MusicBrainz, tagging, classification)
│   └── cli.py              # CLI entry point functions (argparse + main)
└── tests/
    ├── __init__.py
    └── test_utils.py       # 87 tests covering non-API code paths
