Python library for extracting data from Spotify without authentication
SpotifyScraper
Extract Spotify data without the official API. Access tracks, albums, artists, playlists, and podcasts - no authentication required.
Why SpotifyScraper?
- 🔓 No API Key Required - Start extracting data immediately
- 🚀 Fast & Lightweight - Optimized for speed and minimal dependencies
- 📊 Complete Metadata - Get all available track, album, artist details
- 🎙️ Podcast Support - Extract podcast episodes and show information
- 💿 Media Downloads - Download cover art and preview clips
- 🔄 Bulk Operations - Process multiple URLs efficiently
- 🛡️ Robust & Reliable - Comprehensive error handling and retries
Installation
# Basic installation
pip install spotifyscraper
# With Selenium support (includes automatic driver management)
pip install spotifyscraper[selenium]
# All features
pip install spotifyscraper[all]
Quick Start
Basic Usage
from spotify_scraper import SpotifyClient
# Initialize client with rate limiting (default 0.5s between requests)
client = SpotifyClient()
# Get track info with enhanced metadata
track = client.get_track_info("https://open.spotify.com/track/4iV5W9uYEdYUVa79Axb7Rh")
print(f"{track['name']} by {track['artists'][0]['name']}")
# Output: One More Time by Daft Punk
# Access new fields (when available)
print(f"Track #{track.get('track_number', 'N/A')} on disc {track.get('disc_number', 'N/A')}")
print(f"Popularity: {track.get('popularity', 'Not available')}")
# Download cover art
cover_path = client.download_cover("https://open.spotify.com/track/4iV5W9uYEdYUVa79Axb7Rh")
print(f"Cover saved to: {cover_path}")
client.close()
CLI Usage
# Get track info
spotify-scraper track https://open.spotify.com/track/4iV5W9uYEdYUVa79Axb7Rh
# Download album with covers
spotify-scraper download album https://open.spotify.com/album/0JGOiO34nwfUdDrD612dOp --with-covers
# Export playlist to JSON
spotify-scraper playlist https://open.spotify.com/playlist/37i9dQZF1DXcBWIGoYBM5M --output playlist.json
Important Notes
Field Availability
Not all fields shown in Spotify's API documentation are available via web scraping:
- ❌ NOT Available: popularity, followers, genres, detailed statistics
- ✅ Available: name, artists, album info, duration, preview URLs
- ⚠️ Authentication Required: lyrics (needs OAuth, not just cookies)
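Because some fields may be absent, it helps to read scraped dictionaries defensively. The following is a minimal sketch, assuming track data shaped like the examples in this README; `describe_track` and the sample dictionary are hypothetical illustrations, not part of SpotifyScraper.

```python
def describe_track(track):
    """Build a one-line summary that tolerates missing or empty fields."""
    artists = track.get("artists") or [{}]   # covers a missing key and an empty list
    artist_name = artists[0].get("name", "Unknown")
    popularity = track.get("popularity")     # frequently absent when scraping
    popularity_text = "n/a" if popularity is None else str(popularity)
    return f"{track.get('name', 'Unknown')} by {artist_name} (popularity: {popularity_text})"


# Hypothetical sample data, not real library output
sample = {"name": "One More Time", "artists": [{"name": "Daft Punk"}]}
print(describe_track(sample))
print(describe_track({}))
```

Using `or [{}]` rather than a `.get()` default also protects against an `artists` key that is present but empty.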
Core Features
🎵 Track Information
# Get complete track metadata
track = client.get_track_info(track_url)
# Available data:
# - name, id, uri, duration_ms
# - artists (with names, IDs, and verification status)
# - album (with name, ID, release date, images, total_tracks)
# - preview_url (30-second MP3)
# - is_explicit, is_playable
# - track_number, disc_number (when available)
# - popularity (when available)
# - external URLs
# Example: Access enhanced metadata
if 'artists' in track:
    for artist in track['artists']:
        print(f"Artist: {artist['name']}")
        if 'verified' in artist:
            print(f" Verified: {artist['verified']}")
        if 'url' in artist:
            print(f" URL: {artist['url']}")
if 'album' in track:
    album = track['album']
    print(f"Album: {album['name']} ({album.get('total_tracks', 'N/A')} tracks)")
# Note: Lyrics require OAuth authentication
# SpotifyScraper cannot access lyrics as Spotify requires Bearer tokens
💿 Album Information
# Get album with all tracks
album = client.get_album_info(album_url)
print(f"Album: {album.get('name', 'Unknown')}")
print(f"Artist: {(album.get('artists', [{}])[0].get('name', 'Unknown') if album.get('artists') else 'Unknown')}")
print(f"Released: {album.get('release_date', 'N/A')}")
print(f"Tracks: {album.get('total_tracks', 0)}")
# List all tracks
for track in album['tracks']:
    print(f" {track['track_number']}. {track.get('name', 'Unknown')}")
👤 Artist Information
# Get artist profile
artist = client.get_artist_info(artist_url)
print(f"Artist: {artist.get('name', 'Unknown')}")
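followers = artist.get('followers', {}).get('total')
# The ',' format spec only works on numbers, so print plain text when the count is missing
print(f"Followers: {followers:,}" if isinstance(followers, int) else "Followers: N/A")
print(f"Genres: {', '.join(artist.get('genres', []))}")
print(f"Popularity: {artist.get('popularity', 'N/A')}/100")
# Get top tracks
for track in artist.get('top_tracks', [])[:5]:
    print(f" - {track.get('name', 'Unknown')}")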
📋 Playlist Information
# Get playlist details
playlist = client.get_playlist_info(playlist_url)
print(f"Playlist: {playlist.get('name', 'Unknown')}")
print(f"Owner: {playlist.get('owner', {}).get('display_name', playlist.get('owner', {}).get('id', 'Unknown'))}")
print(f"Tracks: {playlist.get('track_count', 0)}")
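followers = playlist.get('followers', {}).get('total')
# The ',' format spec only works on numbers, so print plain text when the count is missing
print(f"Followers: {followers:,}" if isinstance(followers, int) else "Followers: N/A")
# Get all tracks
for track in playlist['tracks']:
    print(f" - {track.get('name', 'Unknown')} by {(track.get('artists', [{}])[0].get('name', 'Unknown') if track.get('artists') else 'Unknown')}")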
🎙️ Podcast Support (NEW!)
Episode Information
# Get episode details
episode = client.get_episode_info(episode_url)
print(f"Episode: {episode.get('name', 'Unknown')}")
print(f"Show: {episode.get('show', {}).get('name', 'Unknown')}")
print(f"Duration: {episode.get('duration_ms', 0) / 1000 / 60:.1f} minutes")
print(f"Release Date: {episode.get('release_date', 'N/A')}")
print(f"Has Video: {'Yes' if episode.get('has_video') else 'No'}")
# Download episode preview (1-2 minute clip)
preview_path = client.download_episode_preview(
    episode_url,
    path="podcast_previews/",
    filename="episode_preview"
)
print(f"Preview downloaded to: {preview_path}")
Show Information
# Get podcast show details
show = client.get_show_info(show_url)
print(f"Show: {show.get('name', 'Unknown')}")
print(f"Publisher: {show.get('publisher', 'Unknown')}")
print(f"Total Episodes: {show.get('total_episodes', 'N/A')}")
print(f"Categories: {', '.join(show.get('categories', []))}")
# Get recent episodes
for episode in show.get('episodes', [])[:5]:
    print(f" - {episode.get('name', 'Unknown')} ({episode.get('duration_ms', 0) / 1000 / 60:.1f} min)")
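The snippets above print durations as fractional minutes. Where a clock-style string reads better, a small helper can convert `duration_ms`. This is a sketch only; `format_duration` is a hypothetical name and not a SpotifyScraper function.

```python
def format_duration(duration_ms):
    """Convert milliseconds to an M:SS or H:MM:SS string."""
    total_seconds = int(duration_ms) // 1000
    minutes, seconds = divmod(total_seconds, 60)
    hours, minutes = divmod(minutes, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{seconds:02d}"
    return f"{minutes}:{seconds:02d}"


print(format_duration(320_357))    # 5:20
print(format_duration(3_725_000))  # 1:02:05
```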
CLI Commands for Podcasts
# Get episode info
spotify-scraper episode info https://open.spotify.com/episode/...
# Download episode preview
spotify-scraper episode download https://open.spotify.com/episode/... -o previews/
# Get show info with episodes
spotify-scraper show info https://open.spotify.com/show/...
# List show episodes
spotify-scraper show episodes https://open.spotify.com/show/... -o episodes.json
Note: Full episode downloads require Spotify Premium authentication. SpotifyScraper currently supports preview clips only.
📥 Media Downloads
# Download track preview (30-second MP3)
audio_path = client.download_preview_mp3(
    track_url,
    path="previews/",
    filename="custom_name.mp3"
)
# Download cover art
cover_path = client.download_cover(
    album_url,
    path="covers/",
    size_preference="large",  # small, medium, large
    format="jpeg"  # jpeg or png
)
# Download all playlist covers
from spotify_scraper.utils.common import SpotifyBulkOperations
bulk = SpotifyBulkOperations(client)
covers = bulk.download_playlist_covers(
    playlist_url,
    output_dir="playlist_covers/"
)
🔄 Bulk Operations
from spotify_scraper.utils.common import SpotifyBulkOperations
# Process multiple URLs
urls = [
    "https://open.spotify.com/track/...",
    "https://open.spotify.com/album/...",
    "https://open.spotify.com/artist/..."
]
bulk = SpotifyBulkOperations()
results = bulk.process_urls(urls, operation="all_info")
# Export results
bulk.export_to_json(results, "spotify_data.json")
bulk.export_to_csv(results, "spotify_data.csv")
# Batch download media
downloads = bulk.batch_download(
    urls,
    output_dir="downloads/",
    media_types=["audio", "cover"]
)
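Exporting nested results to CSV means flattening lists such as `artists` into single cells. How `export_to_csv` does this is not shown here, so the following is only a sketch of one possible approach using the standard `csv` module; `tracks_to_csv`, the column choice, and the sample data are assumptions.

```python
import csv
import io


def tracks_to_csv(tracks):
    """Flatten track dictionaries into CSV text with one row per track."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["name", "artists", "duration_ms"])
    writer.writeheader()
    for track in tracks:
        artist_names = [a.get("name", "Unknown") for a in track.get("artists") or []]
        writer.writerow({
            "name": track.get("name", "Unknown"),
            "artists": "; ".join(artist_names),  # several artists share one cell
            "duration_ms": track.get("duration_ms", ""),
        })
    return buffer.getvalue()


# Hypothetical sample data, not real library output
sample_tracks = [
    {"name": "Track A", "artists": [{"name": "X"}, {"name": "Y"}], "duration_ms": 1000},
    {"name": "Track B"},
]
print(tracks_to_csv(sample_tracks))
```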
📊 Data Analysis
from spotify_scraper.utils.common import SpotifyDataAnalyzer
analyzer = SpotifyDataAnalyzer()
# Analyze playlist
stats = analyzer.analyze_playlist(playlist_data)
print(f"Total duration: {stats['basic_stats']['total_duration_formatted']}")
print(f"Most common artist: {stats['artist_stats']['top_artists'][0]}")
print(f"Average popularity: {stats['basic_stats']['average_popularity']}")
# Compare playlists
comparison = analyzer.compare_playlists(playlist1, playlist2)
print(f"Common tracks: {comparison['track_comparison']['common_tracks']}")
print(f"Similarity: {comparison['track_comparison']['similarity_percentage']:.1f}%")
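This README does not say how `similarity_percentage` is computed. One common definition is the Jaccard overlap of track IDs, sketched below as an assumption rather than as the library's method; `playlist_similarity` and the sample lists are hypothetical.

```python
def playlist_similarity(tracks_a, tracks_b):
    """Return (common track IDs, Jaccard similarity in percent)."""
    ids_a = {t["id"] for t in tracks_a if "id" in t}
    ids_b = {t["id"] for t in tracks_b if "id" in t}
    common = ids_a & ids_b
    union = ids_a | ids_b
    percentage = 100.0 * len(common) / len(union) if union else 0.0
    return common, percentage


# Hypothetical sample data
a = [{"id": "t1"}, {"id": "t2"}, {"id": "t3"}]
b = [{"id": "t2"}, {"id": "t3"}, {"id": "t4"}]
common, pct = playlist_similarity(a, b)
print(sorted(common), f"{pct:.1f}%")
```

Two of the four distinct tracks are shared, so this prints a similarity of 50.0%.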
Advanced Configuration
Browser Selection
# Use requests (default, fast)
client = SpotifyClient(browser_type="requests")
# Use Selenium (for JavaScript content)
client = SpotifyClient(browser_type="selenium")
# Auto-detect (falls back to Selenium if needed)
client = SpotifyClient(browser_type="auto")
Authentication
# Using cookies file (exported from browser)
client = SpotifyClient(cookie_file="spotify_cookies.txt")
# Using cookie dictionary
client = SpotifyClient(cookies={"sp_t": "your_token"})
# Using headers
client = SpotifyClient(headers={
    "User-Agent": "Custom User Agent",
    "Accept-Language": "en-US,en;q=0.9"
})
Proxy Support
client = SpotifyClient(proxy={
    "http": "http://proxy.example.com:8080",
    "https": "https://proxy.example.com:8080"
})
Logging
# Set logging level
client = SpotifyClient(log_level="DEBUG")
# Or use standard logging
import logging
logging.basicConfig(level=logging.INFO)
API Reference
SpotifyClient
The main client for interacting with Spotify.
Methods:
- get_track_info(url) - Get track metadata
- get_track_lyrics(url) - Get track lyrics (requires auth)
- get_track_info_with_lyrics(url) - Get track with lyrics
- get_album_info(url) - Get album metadata
- get_artist_info(url) - Get artist metadata
- get_playlist_info(url) - Get playlist metadata
- download_preview_mp3(url, path, filename) - Download track preview
- download_cover(url, path, size_preference, format) - Download cover art
- close() - Close the client and clean up resources
SpotifyBulkOperations
Utilities for processing multiple URLs.
Methods:
- process_urls(urls, operation) - Process multiple URLs
- export_to_json(data, output_file) - Export to JSON
- export_to_csv(data, output_file) - Export to CSV
- batch_download(urls, output_dir, media_types) - Batch download media
- process_url_file(file_path, operation) - Process URLs from file
- extract_urls_from_text(text) - Extract Spotify URLs from text
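To illustrate the idea behind extracting Spotify URLs from free text, here is a regex-based sketch. It is not the implementation of `extract_urls_from_text`; `find_spotify_urls` is a hypothetical helper, and the pattern assumes 22-character alphanumeric IDs like those in the example URLs in this README.

```python
import re

# Matches open.spotify.com URLs for the entity types mentioned in this README.
# Assumption: IDs are 22 alphanumeric characters, as in the example URLs.
SPOTIFY_URL = re.compile(
    r"https://open\.spotify\.com/(track|album|artist|playlist|episode|show)/([A-Za-z0-9]{22})"
)


def find_spotify_urls(text):
    """Return (type, id) pairs for every Spotify URL found in text."""
    return SPOTIFY_URL.findall(text)


text = (
    "Listen: https://open.spotify.com/track/4iV5W9uYEdYUVa79Axb7Rh "
    "and https://open.spotify.com/album/0JGOiO34nwfUdDrD612dOp?si=abc"
)
print(find_spotify_urls(text))
```

Query strings such as `?si=abc` are ignored because the pattern stops after the ID.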
SpotifyDataAnalyzer
Tools for analyzing Spotify data.
Methods:
- analyze_playlist(playlist_data) - Get playlist statistics
- compare_playlists(playlist1, playlist2) - Compare two playlists
Examples
Download All Album Tracks
# Get album info
album = client.get_album_info(album_url)
# Download all track previews
for track in album['tracks']:
    track_url = f"https://open.spotify.com/track/{track['id']}"
    client.download_preview_mp3(track_url, path=f"album_{album.get('name', 'Unknown')}/")
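Album names can contain characters such as `/` or `:` that are unsafe in directory names, so building a path directly from the name may fail or create nested folders. A minimal sketch of a sanitizer follows; `safe_dirname` is a hypothetical helper, not part of SpotifyScraper.

```python
import re


def safe_dirname(name, fallback="Unknown"):
    """Replace characters that are unsafe in file and directory names."""
    cleaned = re.sub(r'[<>:"/\\|?*\x00-\x1f]', "_", name or "").strip(" .")
    return cleaned or fallback


print(safe_dirname("AC/DC: Live?"))
print(safe_dirname(""))
```

The result could then be used as `path=f"album_{safe_dirname(album.get('name'))}/"` in the loop above.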
Export Artist Discography
artist = client.get_artist_info(artist_url)
# Get all albums
albums_data = []
for album in artist['albums']['items']:
    album_url = f"https://open.spotify.com/album/{album['id']}"
    album_info = client.get_album_info(album_url)
    albums_data.append(album_info)
# Export to JSON
import json
with open(f"{artist.get('name', 'Unknown')}_discography.json", "w") as f:
    json.dump(albums_data, f, indent=2)
Create Playlist Report
from spotify_scraper.utils.common import SpotifyDataFormatter
formatter = SpotifyDataFormatter()
# Get playlist
playlist = client.get_playlist_info(playlist_url)
# Create markdown report
markdown = formatter.format_playlist_markdown(playlist)
with open("playlist_report.md", "w") as f:
    f.write(markdown)
# Create M3U file
tracks = [item['track'] for item in playlist['tracks']]
formatter.export_to_m3u(tracks, "playlist.m3u")
Error Handling
from spotify_scraper.core.exceptions import (
    SpotifyScraperError,
    URLError,
    ExtractionError,
    DownloadError
)
try:
    track = client.get_track_info(url)
except URLError:
    print("Invalid Spotify URL")
except ExtractionError as e:
    print(f"Failed to extract data: {e}")
except SpotifyScraperError as e:
    print(f"General error: {e}")
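When processing many URLs, it is often preferable to record failures and continue rather than stop at the first exception. The sketch below shows that pattern with a generic callable; `fetch_all` and `fake_fetch` are hypothetical, and `fake_fetch` merely stands in for a call such as `client.get_track_info` so the example runs without the library.

```python
def fetch_all(urls, fetch):
    """Call fetch(url) for each URL, collecting failures instead of stopping."""
    results, errors = {}, {}
    for url in urls:
        try:
            results[url] = fetch(url)
        except Exception as exc:  # with SpotifyScraper, catching SpotifyScraperError is narrower
            errors[url] = str(exc)
    return results, errors


def fake_fetch(url):
    """Stand-in for a real fetch call, used only to make the sketch runnable."""
    if "bad" in url:
        raise ValueError("Invalid Spotify URL")
    return {"url": url}


results, errors = fetch_all(["https://example.com/ok", "https://example.com/bad"], fake_fetch)
print(len(results), "succeeded;", len(errors), "failed")
```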
Command Line Interface
# General syntax
spotify-scraper [COMMAND] [URL] [OPTIONS]
# Commands:
#   track      Get track information
#   album      Get album information
#   artist     Get artist information
#   playlist   Get playlist information
#   episode    Get podcast episode information
#   show       Get podcast show information
#   download   Download media files
# Global options:
#   --output, -o   Output file path
#   --format, -f   Output format (json, csv, txt)
#   --pretty       Pretty print output
#   --log-level    Set logging level
#   --cookies      Cookie file path
# Examples:
spotify-scraper track $URL --pretty
spotify-scraper album $URL -o album.json
spotify-scraper playlist $URL -f csv -o playlist.csv
spotify-scraper download track $URL --with-cover --path downloads/
Environment Variables
Configure SpotifyScraper using environment variables:
export SPOTIFY_SCRAPER_LOG_LEVEL=DEBUG
export SPOTIFY_SCRAPER_BROWSER_TYPE=selenium
export SPOTIFY_SCRAPER_COOKIE_FILE=/path/to/cookies.txt
export SPOTIFY_SCRAPER_PROXY_HTTP=http://proxy:8080
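If you want to inspect or reuse these values in your own scripts, they can be read with `os.environ`. The sketch below is an assumption about how such settings might be collected, not a description of how SpotifyScraper reads them internally; `load_settings` is hypothetical, and the `INFO` default is assumed (the `requests` default comes from the Browser Selection section).

```python
import os

# Defaults are assumptions for this sketch, except "requests", which this README names as the default
DEFAULTS = {"LOG_LEVEL": "INFO", "BROWSER_TYPE": "requests", "COOKIE_FILE": None, "PROXY_HTTP": None}


def load_settings(environ=os.environ, prefix="SPOTIFY_SCRAPER_"):
    """Collect prefixed environment variables, falling back to defaults."""
    return {key.lower(): environ.get(prefix + key, default) for key, default in DEFAULTS.items()}


# A plain dict is passed here so the example does not depend on the real environment
settings = load_settings({"SPOTIFY_SCRAPER_LOG_LEVEL": "DEBUG"})
print(settings)
```

The resulting values map onto the constructor arguments shown earlier, such as `log_level`, `browser_type`, and `cookie_file`.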
Requirements
- Python 3.8 or higher
- Operating System: Windows, macOS, Linux, BSD
- Dependencies:
  - requests (for basic operations)
  - beautifulsoup4 (for HTML parsing)
  - selenium (optional, for JavaScript content)
Troubleshooting
Common Issues
1. SSL Certificate Errors
client = SpotifyClient(verify_ssl=False) # Not recommended for production
2. Rate Limiting
import time
for url in urls:
    track = client.get_track_info(url)
    time.sleep(1)  # Add delay between requests
3. Cloudflare Protection
# Use Selenium backend
client = SpotifyClient(browser_type="selenium")
4. Missing Data
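# Some fields might be missing, None, or empty
track = client.get_track_info(url)
# "or [{}]" also covers an empty artists list, which would otherwise raise IndexError
artist_name = (track.get('artists') or [{}])[0].get('name', 'Unknown')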
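For the rate limiting issue above, a fixed `time.sleep(1)` always waits the full second, even when the previous request was slow. A small throttle that only sleeps for the remaining interval is sketched below; `Throttle` is a hypothetical helper, separate from the client's built-in rate limiting.

```python
import time


class Throttle:
    """Enforce a minimum interval between successive calls to wait()."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, None before the first one

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


throttle = Throttle(min_interval=0.05)
start = time.monotonic()
for _ in range(3):
    throttle.wait()  # in real use, make the request right after each wait
print(f"3 calls took {time.monotonic() - start:.2f}s")
```

The `clock` and `sleep` parameters exist so the class can be tested without real delays.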
Contributing
We welcome contributions! Please see our Contributing Guide for details.
# Clone the repository
git clone https://github.com/AliAkhtari78/SpotifyScraper.git
# Install development dependencies
pip install -e ".[dev]"
# Run tests
pytest
# Run linting
black src/ tests/
flake8 src/ tests/
mypy src/
License
SpotifyScraper is released under the MIT License. See LICENSE for details.
Disclaimer
This library is for educational and personal use only. Always respect Spotify's Terms of Service and robots.txt. The authors are not responsible for any misuse of this library.
Support
- 📚 Documentation
- 🐛 Issue Tracker
- 💬 Discussions
SpotifyScraper - Extract Spotify data with ease 🎵
Made with ❤️ by Ali Akhtari
File details
Details for the file spotifyscraper-2.1.5.tar.gz.
File metadata
- Download URL: spotifyscraper-2.1.5.tar.gz
- Upload date:
- Size: 380.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.11.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 4286a945bf8bdbae4fdf3517e081c9c5ff64e1d941eb8e75a79688ea766f3ada |
| MD5 | b82bf535efb22e14a0c153effddb2c0b |
| BLAKE2b-256 | 3adbdb18751e6e6c932ddf81199325212cdd41ba841523d930b6393e2e4ffa36 |
File details
Details for the file spotifyscraper-2.1.5-py3-none-any.whl.
File metadata
- Download URL: spotifyscraper-2.1.5-py3-none-any.whl
- Upload date:
- Size: 127.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.11.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | b08a28e9a0421a6aed80c6dc5fe66c7ad04914237af7ae3346b00ccb8e2b184c |
| MD5 | 045f79f8d81d6ccf680ee531c29cfbbf |
| BLAKE2b-256 | 0c90a2ed9abb3e3ec297a0eacb932b055a90f06ed9a8ed21fed26b65a5b30bba |