CLI music downloader with playlist provider support and source fallbacks

music-downloader

musicdl is a CLI for resolving playlist tracks and downloading matching audio without provider-specific API keys. It is built around three ideas: resolve playlists from multiple input types, score multiple public search results per song, and keep on-disk state consistent across repeated runs.

What It Does

  • Resolves playlists from Spotify URLs, generic supported playlist URLs, or local .txt files.
  • Downloads a single song by artist and title without needing a playlist.
  • Downloads one .mp3 per discovered song into a chosen output folder.
  • Skips songs already present in the output folder or already recorded in logs.
  • Persists download, failure, and removal state across runs.
  • Rebuilds log state from the real contents of the output directory on every run.

How It Works

The CLI flow is:

  1. Parse CLI arguments and create the output directory.
  2. Sync log files with the current .mp3 files on disk.
  3. Resolve the input into a normalized Playlist of Song records.
  4. For each song, try multiple public search backends until one candidate is accepted.
  5. Download the chosen source with yt-dlp, rename the file, and update logs.

For a deeper code-level overview, see docs/architecture.md.

Requirements

  • Python 3.10+
  • yt-dlp available in PATH
  • Network access to the source platform being resolved

Installation

Install from PyPI:

pip install will0w-musicdl

Or install in editable mode for development:

pip install -e .

This exposes the musicdl command defined in pyproject.toml.

CLI Usage

Basic usage:

musicdl [url-or-path]

Typical examples:

musicdl "https://open.spotify.com/playlist/..." -o ~/Music/MyPlaylist
musicdl "https://music.youtube.com/playlist?..." -o ./downloads
musicdl ./my-songs.txt -o ./downloads

Download a single song by artist and title:

musicdl --artist 'Daft Punk' --title 'One More Time' -o ~/Music
musicdl --artist 'Daft Punk' --title 'One More Time' --album 'Discovery' -o ~/Music

Download a single song from a direct URL:

musicdl 'https://www.youtube.com/watch?v=FGBhQbmPwH8' -o ~/Music
musicdl 'https://soundcloud.com/artist/track' -o ~/Music

Provide metadata when downloading from a URL (overrides provider metadata for the supplied fields):

musicdl 'https://www.youtube.com/watch?v=FGBhQbmPwH8' --artist 'Daft Punk' --title 'One More Time' --album 'Discovery' -o ~/Music

Interactively search for a song and pick which result to download:

musicdl --search 10 --artist 'Daft Punk' --title 'One More Time' -o ~/Music
musicdl --search 5 --title 'feminine urge'
musicdl --search 8 --artist 'Radiohead'

This displays a numbered table of candidates from YouTube Music, YouTube, and SoundCloud. Enter a number to download that result, or q to quit.

If you rerun a playlist against an existing output folder, matching .mp3 files are not redownloaded. Instead, musicdl uses the resolved playlist song metadata to backfill only missing tags such as album, track number, release date, ISRC, artwork, and genres.
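
The backfill rule described above (fill only what is missing, never overwrite what is already tagged) can be sketched as a small merge function. This is an illustration only: the function name, the plain-dict representation, and the field names are assumptions, not musicdl's actual internals.

```python
from typing import Mapping

# Values treated as "missing" in this sketch (an assumption, not musicdl's rule).
_EMPTY = (None, "", [])


def backfill_missing_tags(existing: Mapping[str, object],
                          resolved: Mapping[str, object]) -> dict:
    """Return a copy of `existing` with empty fields filled from `resolved`."""
    merged = dict(existing)
    for key, value in resolved.items():
        current = merged.get(key)
        # Only fill a field that is currently empty, and only with a real value.
        if current in _EMPTY and value not in _EMPTY:
            merged[key] = value
    return merged


if __name__ == "__main__":
    on_disk = {"artist": "Daft Punk", "title": "One More Time", "album": ""}
    from_playlist = {"artist": "DAFT PUNK", "album": "Discovery"}
    print(backfill_missing_tags(on_disk, from_playlist))
```

In this sketch the existing artist tag is left alone even though the playlist spells it differently; only the empty album field is filled.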

Retry only previously failed songs for an output folder:

musicdl --download-failed -o ./downloads

Export metadata for all downloaded songs in a folder:

musicdl --metadata-folder ./downloads
musicdl --metadata-folder ./downloads --metadata-output ./downloads/all-metadata.json

Write artist/title tags into existing MP3 files in a folder:

musicdl --tag-folder ./downloads
musicdl --tag-folder ./downloads --tag-metadata ./downloads/metadata.json

If --tag-metadata is omitted, musicdl will automatically try metadata.json and then downloaded.json inside the target folder. If neither exists, it will scan existing tags/filenames and then attempt online enrichment by artist/title to fill album, release date, artwork, source URL, and genres when a strong catalog match is found.
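
The lookup order for the metadata file can be pictured as follows. This is a minimal sketch assuming a hypothetical helper named pick_tag_metadata; only the two file names and their order come from the description above.

```python
from pathlib import Path
from typing import Optional

# Order taken from the description above: metadata.json first, then downloaded.json.
FALLBACK_NAMES = ("metadata.json", "downloaded.json")


def pick_tag_metadata(folder: Path, explicit: Optional[Path] = None) -> Optional[Path]:
    """Choose the metadata JSON to use for tagging, or None if there is none."""
    if explicit is not None:
        # An explicit --tag-metadata path always wins.
        return explicit
    for name in FALLBACK_NAMES:
        candidate = folder / name
        if candidate.is_file():
            return candidate
    # No metadata file: the caller would fall back to scanning tags and filenames.
    return None
```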

Process a limited number of songs while testing matcher changes:

musicdl "https://open.spotify.com/playlist/..." --max-songs 10 -o ./downloads

Arguments

  • playlist: Playlist URL, single song URL, or path to a .txt playlist file.
  • -o, --output: Destination folder for downloaded files and log files. Defaults to the current directory.
  • --max-songs: Optional cap for the current run. 0 means no cap.
  • --download-failed: Retry only songs currently listed in failed.json for the chosen output folder.
  • --metadata-folder: Scan all .mp3 files in a folder and export metadata as JSON.
  • --metadata-output: Optional output path for metadata JSON. Defaults to metadata.json inside --metadata-folder.
  • --tag-folder: Write ID3 artist/title tags for all .mp3 files in a folder.
  • --tag-metadata: Optional metadata JSON input used by --tag-folder to also apply album/track fields.
  • --artist: Artist name for single-song download, or metadata override when used with a URL. Requires --title when used without a URL.
  • --title: Track title for single-song download, or metadata override when used with a URL. Requires --artist when used without a URL.
  • --album: Optional album name. Can be used with --artist/--title or with a URL to set the album field.
  • --search N: Search for N candidates and interactively pick one to download. Requires at least --artist or --title. Cannot be combined with a playlist or --download-failed.
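
The constraints between these flags can be expressed with argparse plus a small validation step. The sketch below covers only a subset of the arguments and is not the parser shipped in cli.py; the function names and error messages are hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a reduced parser with a subset of the documented flags."""
    parser = argparse.ArgumentParser(prog="musicdl")
    parser.add_argument("playlist", nargs="?", help="playlist URL, song URL, or .txt path")
    parser.add_argument("-o", "--output", default=".", help="destination folder")
    parser.add_argument("--max-songs", type=int, default=0, help="0 means no cap")
    parser.add_argument("--download-failed", action="store_true")
    parser.add_argument("--artist")
    parser.add_argument("--title")
    parser.add_argument("--album")
    parser.add_argument("--search", type=int, metavar="N")
    return parser


def validate(args: argparse.Namespace) -> list[str]:
    """Return a list of problems; an empty list means the combination is allowed."""
    problems = []
    if args.search is not None:
        if not (args.artist or args.title):
            problems.append("--search requires at least --artist or --title")
        if args.playlist or args.download_failed:
            problems.append("--search cannot be combined with a playlist or --download-failed")
    elif args.playlist is None and (args.artist or args.title):
        # Without a URL, artist and title must be supplied together.
        if not (args.artist and args.title):
            problems.append("--artist and --title are both required when no URL is given")
    return problems


if __name__ == "__main__":
    args = build_parser().parse_args(["--search", "8", "--artist", "Radiohead"])
    print(validate(args))
```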

Supported Inputs

Single song by artist and title

A single song can be downloaded by specifying --artist and --title. The optional --album flag attaches album metadata to the resulting file. This bypasses playlist resolution entirely and feeds the song straight into the download engine.

Single song URL

A direct URL to a song (e.g. a YouTube video or SoundCloud track) can be passed as the positional argument. The URL is resolved through the same provider pipeline as playlists. When the provider returns a single track, it is downloaded directly from that URL rather than searching for it. The --artist, --title, and --album flags can be combined with a URL to override the metadata extracted by the provider, which is useful when the video title or uploader name doesn't match the actual song.

Spotify playlists

Spotify playlists are resolved through a native provider that uses Spotify's public web surfaces. The implementation follows pagination, so playlists longer than 100 tracks are not truncated at the first page.

Generic URLs

Other URLs (playlists or single tracks) are resolved through yt-dlp --flat-playlist --dump-single-json. This covers platforms that yt-dlp already knows how to inspect. When the URL points to a single track, the metadata is extracted from the top-level response and the song is downloaded directly from the source URL.
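
To illustrate the playlist versus single-track distinction, the sketch below normalizes an already-parsed yt-dlp JSON document into a flat list of tracks. The helper names and the output dictionary shape are assumptions made for this example; the JSON keys used (_type, entries, title, uploader, channel, url, webpage_url) are standard yt-dlp info fields, but which of them are present varies by extractor.

```python
from typing import Any


def dump_command(url: str) -> list[str]:
    """Command line that produces the JSON document consumed below."""
    return ["yt-dlp", "--flat-playlist", "--dump-single-json", url]


def entries_from_dump(info: dict[str, Any]) -> list[dict[str, Any]]:
    """Normalize a parsed --dump-single-json document into a list of tracks."""
    if info.get("_type") == "playlist" or "entries" in info:
        # Playlist: tracks live under "entries"; unavailable items may be None.
        raw = [entry for entry in (info.get("entries") or []) if entry]
    else:
        # Single track: the metadata sits at the top level of the document.
        raw = [info]
    tracks = []
    for entry in raw:
        tracks.append({
            "title": entry.get("title"),
            "uploader": entry.get("uploader") or entry.get("channel"),
            # Prefer the page URL; flat playlist entries usually carry only "url".
            "url": entry.get("webpage_url") or entry.get("url"),
        })
    return tracks
```

The command could be run with subprocess.run(dump_command(url), capture_output=True, text=True) and its stdout passed through json.loads before calling entries_from_dump.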

Text playlists

The .txt provider accepts one song per line in Artist - Title format:

Artist One - Song One
Artist Two - Song Two
# comments are ignored

Blank lines and comment lines are ignored.
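
A parser for this format fits in a few lines. The sketch below is not the code in providers/; TxtSong is a stand-in for the project's own Song record, and skipping lines without an " - " separator is an assumption about how malformed input is handled.

```python
from typing import Iterable, NamedTuple


class TxtSong(NamedTuple):
    """Stand-in record for one parsed playlist line."""
    artist: str
    title: str


def parse_txt_playlist(lines: Iterable[str]) -> list[TxtSong]:
    """Parse 'Artist - Title' lines, ignoring blanks and '#' comments."""
    songs = []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first " - " so that titles may contain further dashes.
        artist, sep, title = line.partition(" - ")
        if not sep or not artist.strip() or not title.strip():
            continue  # assumption: malformed lines are skipped silently
        songs.append(TxtSong(artist.strip(), title.strip()))
    return songs


if __name__ == "__main__":
    sample = ["Artist One - Song One", "", "# comments are ignored", "Artist Two - Song Two"]
    for song in parse_txt_playlist(sample):
        print(song.artist, "|", song.title)
```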

Download and Matching Strategy

After a playlist is resolved, each song is searched across multiple backends in this order:

  1. YouTube Music style query (ytsearch5 with a topic suffix)
  2. Standard YouTube search (ytsearch5)
  3. SoundCloud search (scsearch3)

Candidate scoring prefers track-like uploads and rejects obviously wrong variants. Cases it guards against include:

  • club mix, acoustic, remaster, remix, karaoke, nightcore, and similar variants
  • preview URLs and podcast-like results
  • music videos when a cleaner topic/audio source is available
  • wrong franchise/theme matches for generic titles such as Theme (From ...)
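
The first guard in the list, rejecting variants that were not asked for, could look roughly like this. The word list is taken from the bullet above, but the function name and the matching rule are assumptions for illustration; the real scorer in downloader/engine.py may weigh these signals differently.

```python
import re

# Variant markers named in the list above.
VARIANT_WORDS = ("club mix", "acoustic", "remaster", "remix", "karaoke", "nightcore")


def unwanted_variants(wanted_title: str, candidate_title: str) -> list[str]:
    """Variant markers found in the candidate but absent from the wanted title."""
    wanted = wanted_title.lower()
    candidate = candidate_title.lower()
    hits = []
    for word in VARIANT_WORDS:
        # Anchor at a word start so "remaster" also matches "remastered".
        pattern = r"\b" + re.escape(word)
        if re.search(pattern, candidate) and not re.search(pattern, wanted):
            hits.append(word)
    return hits


if __name__ == "__main__":
    print(unwanted_variants("One More Time", "Daft Punk - One More Time (Club Mix)"))
```

A candidate whose variant marker also appears in the requested title, such as a playlist entry that explicitly names a remix, yields no hits and is not penalized.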

The engine stops on the first backend that yields an accepted download.
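
The backend order and the stop-on-first-success behavior can be sketched as below. The ytsearch5: and scsearch3: prefixes are yt-dlp search prefixes; the exact wording of the topic suffix and both function names are assumptions, and try_backend is a hypothetical callback standing in for the search, score, and download steps.

```python
from typing import Callable, Iterable, Optional


def search_queries(artist: str, title: str) -> list[str]:
    """Build one yt-dlp search query per backend, in fallback order."""
    base = f"{artist} {title}"
    return [
        f"ytsearch5:{base} topic",  # assumption: exact suffix wording is illustrative
        f"ytsearch5:{base}",
        f"scsearch3:{base}",
    ]


def first_accepted(queries: Iterable[str],
                   try_backend: Callable[[str], Optional[str]]) -> Optional[str]:
    """Try each query in order and return the first accepted result."""
    for query in queries:
        result = try_backend(query)
        if result is not None:
            # Stop at the first backend that yields an accepted download.
            return result
    return None  # every backend failed; the song would be recorded as failed
```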

Detailed matcher regression history lives in docs/song-regression-playbook.md.

Output Files and Persistent State

Each output directory contains the audio files plus JSON state files:

  • downloaded.json: songs currently known as downloaded
  • failed.json: songs that failed all attempted sources
  • removed.json: songs that were once downloaded but no longer exist on disk

State behavior

  • If a song downloads successfully, it is upserted into downloaded.json.
  • Both downloaded.json and failed.json persist the song source URL when the provider exposes it.
  • Successful downloads write ID3 tags from provider metadata (artist, title, album, track number, date, genre, ISRC, and cover art when available).
  • If you rerun the same playlist into the same folder, songs that are already present are matched from disk and have only their missing tags backfilled from the playlist metadata.
  • If a song fails all search/download attempts, it is recorded in failed.json.
  • If a file exists in the output folder but is missing from the logs, it is reconstructed into downloaded.json when the filename matches Artist - Title.mp3.
  • If a file was previously tracked in downloaded.json but no longer exists on disk, it is moved into removed.json.

This design keeps the output folder as the source of truth rather than trusting stale logs.
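
The reconciliation rules above can be summarized in a short function. This is a sketch under stated assumptions: log entries are modeled as dicts with only artist and title keys, and sync_state is a hypothetical name; the real schema and logic live in fs.py.

```python
from pathlib import Path


def _stem(song: dict) -> str:
    """Expected file stem for a log entry, following Artist - Title.mp3."""
    return f"{song['artist']} - {song['title']}"


def sync_state(folder: Path, downloaded: list[dict],
               removed: list[dict]) -> tuple[list[dict], list[dict]]:
    """Reconcile log entries with the .mp3 files on disk; the folder wins."""
    on_disk = {path.stem for path in folder.glob("*.mp3")}

    # Tracked entries whose file still exists stay in the downloaded log.
    still_here = [song for song in downloaded if _stem(song) in on_disk]
    # Tracked entries whose file is gone move to the removed log.
    gone = [song for song in downloaded if _stem(song) not in on_disk]

    # Untracked files are reconstructed only when the name can be split.
    known = {_stem(song) for song in still_here}
    for stem in sorted(on_disk - known):
        artist, sep, title = stem.partition(" - ")
        if sep and artist and title:
            still_here.append({"artist": artist, "title": title})
    return still_here, removed + gone
```

A file such as untitled.mp3 is left out of the log in this sketch because its name cannot be mapped to an artist and a title.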

Operational Notes

  • No platform API keys are required.
  • Private, missing, geo-restricted, or paywalled playlists are surfaced as explicit errors when detectable.
  • Actual audio correctness still depends on public search quality and available metadata.
  • Manually added files are only mapped automatically when filenames follow Artist - Title.mp3.

Cron Automation

This repo includes music-cron.sh, a project-local shell script that activates the virtual environment and runs several curated playlist sync jobs. It is intentionally simple and relies on absolute paths so it can be called from cron without inheriting a shell session.

If you adapt it for another machine, update:

  • the repository path
  • the virtual environment path
  • the playlist URLs
  • the output directories

Project Layout

src/musicdl/
	cli.py                  CLI entrypoint and orchestration
	fs.py                   state file loading/saving/synchronization
	types.py                core dataclasses and normalization helpers
	downloader/engine.py    song search, candidate scoring, downloads
	providers/              playlist input resolvers
tests/
	test_engine_matching.py matcher regression coverage
	test_fs_logs.py         output-state and log synchronization tests
	test_txt_provider.py    text playlist parsing tests
docs/
	architecture.md         component and data-flow overview
	song-regression-playbook.md matcher regression history

Development

Run the full test suite:

pytest

Recommended workflow for matcher changes:

  1. Read docs/song-regression-playbook.md.
  2. Update or add focused matcher tests.
  3. Run pytest.
  4. Manually sanity-check one or two representative songs.

Known Limitations

  • Filename-based reconstruction is intentionally conservative and depends on a stable naming convention.
  • The downloader produces .mp3 output and does not currently expose format configuration through the CLI.
  • Provider coverage for non-Spotify URLs depends entirely on yt-dlp extractor support.

Changelog

See CHANGELOG.md for a full list of changes in each version.
