CLI music downloader with playlist provider support and source fallbacks
music-downloader
musicdl is a CLI for resolving playlist tracks and downloading matching audio without provider-specific API keys. It is built around three ideas: resolve playlists from multiple input types, score multiple public search results per song, and keep on-disk state consistent across repeated runs.
What It Does
- Resolves playlists from Spotify URLs, generic supported playlist URLs, or local .txt files.
- Downloads a single song by artist and title without needing a playlist.
- Downloads one .mp3 per discovered song into a chosen output folder.
- Skips songs already present in the output folder or already recorded in logs.
- Persists download, failure, and removal state across runs.
- Rebuilds log state from the real contents of the output directory on every run.
How It Works
The CLI flow is:
- Parse CLI arguments and create the output directory.
- Sync log files with the current .mp3 files on disk.
- Resolve the input into a normalized Playlist of Song records.
- For each song, try multiple public search backends until one candidate is accepted.
- Download the chosen source with yt-dlp, rename the file, and update logs.
For a deeper code-level overview, see docs/architecture.md.
Requirements
- Python 3.10+
- yt-dlp available in PATH
- Network access to the source platform being resolved
Installation
Install the project in editable mode:
pip install -e .
This exposes the musicdl command defined in pyproject.toml.
CLI Usage
Basic usage:
musicdl [url-or-path]
Typical examples:
musicdl "https://open.spotify.com/playlist/..." -o ~/Music/MyPlaylist
musicdl "https://music.youtube.com/playlist?..." -o ./downloads
musicdl ./my-songs.txt -o ./downloads
Download a single song by artist and title:
musicdl --artist 'Daft Punk' --title 'One More Time' -o ~/Music
musicdl --artist 'Daft Punk' --title 'One More Time' --album 'Discovery' -o ~/Music
Download a single song from a direct URL:
musicdl 'https://www.youtube.com/watch?v=FGBhQbmPwH8' -o ~/Music
musicdl 'https://soundcloud.com/artist/track' -o ~/Music
Provide metadata when downloading from a URL (overrides provider metadata for the supplied fields):
musicdl 'https://www.youtube.com/watch?v=FGBhQbmPwH8' --artist 'Daft Punk' --title 'One More Time' --album 'Discovery' -o ~/Music
If you rerun a playlist against an existing output folder, matching .mp3 files are
not redownloaded. Instead, musicdl uses the resolved playlist song metadata to
backfill only missing tags such as album, track number, release date, ISRC,
artwork, and genres.
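The backfill rule above (fill only what is missing, never overwrite) can be sketched as a small merge function. This is a minimal illustration, not the project's actual code: the function name `backfill_missing` and the dictionary-based tag representation are assumptions made for the example.

```python
def backfill_missing(existing: dict, resolved: dict) -> dict:
    """Return a copy of `existing` with empty fields filled from `resolved`.

    Hypothetical helper: fields that already hold a value are left alone,
    which mirrors the "backfill only missing tags" behaviour described above.
    """
    empty = (None, "", [])
    merged = dict(existing)
    for field, value in resolved.items():
        if value in empty:
            continue  # nothing useful to copy from the playlist metadata
        if merged.get(field) in empty:
            merged[field] = value  # only fill gaps, never overwrite
    return merged
```

With this sketch, an existing `artist` tag survives even if the playlist metadata spells it differently, while an empty `album` tag is filled in.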
Retry only previously failed songs for an output folder:
musicdl --download-failed -o ./downloads
Export metadata for all downloaded songs in a folder:
musicdl --metadata-folder ./downloads
musicdl --metadata-folder ./downloads --metadata-output ./downloads/all-metadata.json
Write artist/title tags into existing MP3 files in a folder:
musicdl --tag-folder ./downloads
musicdl --tag-folder ./downloads --tag-metadata ./downloads/metadata.json
If --tag-metadata is omitted, musicdl will automatically try metadata.json
and then downloaded.json inside the target folder. If neither exists, it will
scan existing tags/filenames and then attempt online enrichment by artist/title
to fill album, release date, artwork, source URL, and genres when a strong
catalog match is found.
Process a limited number of songs while testing matcher changes:
musicdl "https://open.spotify.com/playlist/..." --max-songs 10 -o ./downloads
Arguments
- playlist: Playlist URL, single song URL, or path to a .txt playlist file.
- -o, --output: Destination folder for downloaded files and log files. Defaults to the current directory.
- --max-songs: Optional cap for the current run. 0 means no cap.
- --download-failed: Retry only songs currently listed in failed.json for the chosen output folder.
- --metadata-folder: Scan all .mp3 files in a folder and export metadata as JSON.
- --metadata-output: Optional output path for metadata JSON. Defaults to metadata.json inside --metadata-folder.
- --tag-folder: Write ID3 artist/title tags for all .mp3 files in a folder.
- --tag-metadata: Optional metadata JSON input used by --tag-folder to also apply album/track fields.
- --artist: Artist name for single-song download, or metadata override when used with a URL. Requires --title when used without a URL.
- --title: Track title for single-song download, or metadata override when used with a URL. Requires --artist when used without a URL.
- --album: Optional album name. Can be used with --artist/--title or with a URL to set the album field.
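The pairing rule between --artist and --title is easy to misread, so here is a rough sketch of how it could be expressed with argparse. It covers only a subset of the flags, and it is an assumption that the real CLI uses argparse at all; `build_parser` and `validate` are hypothetical names.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser covering a subset of the documented flags."""
    parser = argparse.ArgumentParser(prog="musicdl")
    parser.add_argument("playlist", nargs="?", help="URL or .txt path")
    parser.add_argument("-o", "--output", default=".")
    parser.add_argument("--max-songs", type=int, default=0)  # 0 means no cap
    parser.add_argument("--artist")
    parser.add_argument("--title")
    parser.add_argument("--album")
    return parser


def validate(args: argparse.Namespace) -> None:
    """Without a URL, --artist and --title must be supplied together."""
    if args.playlist is None and bool(args.artist) != bool(args.title):
        raise ValueError(
            "--artist and --title must be given together when no URL is supplied"
        )
```

When a URL is present, either flag may appear on its own because it then acts as a metadata override rather than a search key.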
Supported Inputs
Single song by artist and title
A single song can be downloaded by specifying --artist and --title. The optional --album flag attaches album metadata to the resulting file. This bypasses playlist resolution entirely and feeds the song straight into the download engine.
Single song URL
A direct URL to a song (e.g. a YouTube video or SoundCloud track) can be passed as the positional argument. The URL is resolved through the same provider pipeline as playlists. When the provider returns a single track, it is downloaded directly from that URL rather than searching for it. The --artist, --title, and --album flags can be combined with a URL to override the metadata extracted by the provider — useful when the video title or uploader name doesn't match the actual song.
Spotify playlists
Spotify playlists are resolved through a native provider that uses Spotify's public web surfaces. The implementation supports pagination and avoids the common 100-track truncation issue.
Generic URLs
Other URLs (playlists or single tracks) are resolved through yt-dlp --flat-playlist --dump-single-json. This covers platforms that yt-dlp already knows how to inspect. When the URL points to a single track, the metadata is extracted from the top-level response and the song is downloaded directly from the source URL.
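A minimal sketch of this resolution step follows. The yt-dlp flags are the ones named above; the function names are hypothetical, and the JSON field names (`entries`, `title`, `url`, `webpage_url`) reflect how yt-dlp output is commonly shaped, so they are worth verifying against the yt-dlp version in use.

```python
import json
import subprocess


def dump_single_json(url: str) -> dict:
    """Ask yt-dlp for metadata only and return the parsed JSON document."""
    result = subprocess.run(
        ["yt-dlp", "--flat-playlist", "--dump-single-json", url],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if yt-dlp exits non-zero
    )
    return json.loads(result.stdout)


def extract_tracks(info: dict) -> list[dict]:
    """Normalize a playlist or single-track response into a list of tracks.

    Assumption: playlists carry an "entries" list, while a single track is
    described by the top-level object itself.
    """
    entries = info.get("entries")
    items = entries if entries is not None else [info]
    return [
        {
            "title": item.get("title"),
            # Prefer the page URL; flat playlist entries may only have "url".
            "url": item.get("webpage_url") or item.get("url"),
        }
        for item in items
        if item
    ]
```

Keeping the subprocess call separate from the parsing makes the normalization testable without network access.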
Text playlists
The .txt provider accepts one song per line in Artist - Title format:
Artist One - Song One
Artist Two - Song Two
# comments are ignored
Blank lines and comment lines are ignored.
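For illustration, a parser for this format might look like the sketch below. The `Song` dataclass and `parse_txt_playlist` are hypothetical stand-ins for the project's own types, and skipping malformed lines silently is an assumption rather than documented behaviour.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Song:
    artist: str
    title: str


def parse_txt_playlist(text: str) -> list[Song]:
    """Parse 'Artist - Title' lines, skipping blanks and '#' comments."""
    songs: list[Song] = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first " - " only, so titles may contain dashes.
        artist, sep, title = line.partition(" - ")
        if not sep or not artist.strip() or not title.strip():
            continue  # assumption: malformed lines are skipped, not fatal
        songs.append(Song(artist.strip(), title.strip()))
    return songs
```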
Download and Matching Strategy
After a playlist is resolved, each song is searched across multiple backends in this order:
- YouTube Music style query (ytsearch5 with a topic suffix)
- Standard YouTube search (ytsearch5)
- SoundCloud search (scsearch3)
Candidate scoring tries to prefer track-like uploads and reject obviously wrong variants. Cases the matcher guards against include:
- club mix, acoustic, remaster, remix, karaoke, nightcore, and similar variants
- preview URLs and podcast-like results
- music videos when a cleaner topic/audio source is available
- wrong franchise/theme matches for generic titles such as Theme (From ...)
The engine stops on the first backend that yields an accepted download.
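The ordered fallback can be sketched roughly as follows. This is a simplified illustration, not the real matcher: the exact query strings, the substring-based variant filter, and the names `build_queries`, `is_acceptable`, and `first_accepted` are all assumptions. Only the search prefixes and the backend order come from the description above.

```python
from typing import Callable, Iterable, Optional

# Hypothetical reject list based on the variants named above.
REJECTED_VARIANTS = ("club mix", "acoustic", "remaster", "remix", "karaoke", "nightcore")


def build_queries(artist: str, title: str) -> list[str]:
    """Build one yt-dlp search query per backend, in fallback order."""
    base = f"{artist} {title}"
    return [f"ytsearch5:{base} topic", f"ytsearch5:{base}", f"scsearch3:{base}"]


def is_acceptable(candidate_title: str, wanted_title: str) -> bool:
    """Accept a candidate that contains the title and adds no unwanted variant."""
    cand = candidate_title.lower()
    wanted = wanted_title.lower()
    if wanted not in cand:
        return False
    # A variant is fine only when the requested title itself asks for it.
    return not any(v in cand and v not in wanted for v in REJECTED_VARIANTS)


def first_accepted(
    wanted_title: str,
    backends: Iterable[tuple[str, Callable[[str], Iterable[str]]]],
) -> Optional[tuple[str, str]]:
    """Try backends in order and stop at the first acceptable candidate."""
    for name, search in backends:
        for candidate in search(wanted_title):
            if is_acceptable(candidate, wanted_title):
                return name, candidate
    return None
```

Each backend here is just a callable returning candidate titles, which keeps the fallback logic testable with fake search functions. The real engine scores candidates rather than applying a plain substring filter.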
Detailed matcher regression history lives in docs/song-regression-playbook.md.
Output Files and Persistent State
Each output directory contains the audio files plus JSON state files:
- downloaded.json: songs currently known as downloaded
- failed.json: songs that failed all attempted sources
- removed.json: songs that were once downloaded but no longer exist on disk
State behavior
- If a song downloads successfully, it is upserted into downloaded.json.
- Both downloaded.json and failed.json persist the song source URL when the provider exposes it.
- Successful downloads write ID3 tags from provider metadata (artist, title, album, tracknumber, date, genre, ISRC, and cover art when available).
- If you rerun the same playlist into the same folder, songs that are already present are matched from disk and have only their missing tags backfilled from the playlist metadata.
- If a song fails all search/download attempts, it is recorded in failed.json.
- If a file exists in the output folder but is missing from the logs, it is reconstructed into downloaded.json when the filename matches Artist - Title.mp3.
- If a file was previously tracked in downloaded.json but no longer exists on disk, it is moved into removed.json.
This design keeps the output folder as the source of truth rather than trusting stale logs.
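The disk-first reconciliation can be sketched as a pure function over the log entries and the set of .mp3 filenames found on disk. The entry schema (`artist`, `title`, `file` keys) and the name `reconcile` are assumptions for illustration, since the real JSON layout is not documented here.

```python
def reconcile(
    downloaded: list[dict],
    removed: list[dict],
    mp3_names: set[str],
) -> tuple[list[dict], list[dict]]:
    """Rebuild downloaded/removed state from the files actually on disk.

    `mp3_names` is the set of .mp3 filenames present in the output folder.
    """
    kept: list[dict] = []
    removed = list(removed)  # copy so the caller's list is not mutated
    for entry in downloaded:
        if entry["file"] in mp3_names:
            kept.append(entry)
        else:
            removed.append(entry)  # tracked earlier, but gone from disk

    tracked = {entry["file"] for entry in kept}
    for name in sorted(mp3_names - tracked):
        stem = name[: -len(".mp3")]
        artist, sep, title = stem.partition(" - ")
        if sep and artist and title:
            # Untracked file that follows the "Artist - Title.mp3" convention.
            kept.append({"artist": artist, "title": title, "file": name})
    return kept, removed
```

Files that do not follow the naming convention are left untracked in this sketch, in line with the conservative reconstruction described above.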
Operational Notes
- No platform API keys are required.
- Private, missing, geo-restricted, or paywalled playlists are surfaced as explicit errors when detectable.
- Actual audio correctness still depends on public search quality and available metadata.
- Manually added files are only mapped automatically when filenames follow Artist - Title.mp3.
Cron Automation
This repo includes music-cron.sh, a project-local shell script that activates the virtual environment and runs several curated playlist sync jobs. It is intentionally simple and relies on absolute paths so it can be called from cron without inheriting a shell session.
If you adapt it for another machine, update:
- the repository path
- the virtual environment path
- the playlist URLs
- the output directories
Project Layout
src/musicdl/
  cli.py                     CLI entrypoint and orchestration
  fs.py                      state file loading/saving/synchronization
  types.py                   core dataclasses and normalization helpers
  downloader/engine.py       song search, candidate scoring, downloads
  providers/                 playlist input resolvers
tests/
  test_engine_matching.py    matcher regression coverage
  test_fs_logs.py            output-state and log synchronization tests
  test_txt_provider.py       text playlist parsing tests
docs/
  architecture.md            component and data-flow overview
  song-regression-playbook.md
Development
Run the full test suite:
pytest
Recommended workflow for matcher changes:
- Read docs/song-regression-playbook.md.
- Update or add focused matcher tests.
- Run pytest.
- Manually sanity-check one or two representative songs.
Known Limitations
- Filename-based reconstruction is intentionally conservative and depends on a stable naming convention.
- The downloader produces .mp3 output and does not currently expose format configuration through the CLI.
- Provider coverage for non-Spotify URLs depends entirely on yt-dlp extractor support.