
A Python package that takes the songs of a larger playlist as a starting point and recommends songs based on up to 5 specific songs within that playlist, using the K-Nearest-Neighbors technique


spotify-recommender

Use Case

  • This is the first section of this readme because, as you will see, this package can help, but nothing is perfect: it only helps as long as you fit in this very, very particular use case ;(
  • The perfect use case is that one playlist (or more) in which you put a bunch of songs at different times and in different moods. When you listen to it, you feel like listening to only a part of it; some days later that part is useless, but some other part is awesome. The big issue is that those "parts" are shuffled all across the playlist. So how would one find the songs they are craving today, tomorrow, and later? Speaking from experience, it is not worth it to manually map a 1000-song playlist and filter out 50 or 100.
  • This package comes to solve this issue, roughly, because it tries to find the K (a number) nearest songs, taking into consideration genres, artists and popularity, using the KNN supervised machine learning technique
  • One issue with this is that the Spotify API is not the best. E.g., A LOT of artists do not have any genre associated with them, and the artist is the main source of genre information used in the algorithm, as justified in the next topic
  • The other issue is that the Spotify API does not provide, at the time of publishing version 2.5.0, either song or album genres, which compromises a portion of the accuracy of the recommendations. Still, I recommend you give it a try
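
To make the "K nearest songs" idea above concrete, here is a minimal sketch of a nearest-neighbor ranking over genres, artists and popularity. It is not the package's actual implementation: the toy playlist, the Jaccard set distance, the equal weighting of the three features, and the names `jaccard_distance`, `distance` and `k_nearest` are all assumptions made for illustration.

```python
from math import sqrt

# Hypothetical toy playlist: genres and artists as sets, popularity on a 0-100 scale.
songs = [
    {"name": "A", "genres": {"rock", "indie"}, "artists": {"X"}, "popularity": 70},
    {"name": "B", "genres": {"rock"}, "artists": {"X"}, "popularity": 65},
    {"name": "C", "genres": {"jazz"}, "artists": {"Y"}, "popularity": 40},
    {"name": "D", "genres": {"indie", "pop"}, "artists": {"Z"}, "popularity": 80},
]


def jaccard_distance(a, b):
    """Return 0.0 for identical sets and 1.0 for sets with nothing in common."""
    if not a and not b:
        # Assumption: two songs with no genre information are not pushed apart.
        return 0.0
    return 1.0 - len(a & b) / len(a | b)


def distance(s, t):
    """Unitless distance combining genres, artists and popularity (equal weights assumed)."""
    genre = jaccard_distance(s["genres"], t["genres"])
    artist = jaccard_distance(s["artists"], t["artists"])
    popularity = abs(s["popularity"] - t["popularity"]) / 100
    return sqrt(genre ** 2 + artist ** 2 + popularity ** 2)


def k_nearest(base_name, playlist, k):
    """Return the names of the k songs closest to the base song, nearest first."""
    base = next(s for s in playlist if s["name"] == base_name)
    others = [s for s in playlist if s["name"] != base_name]
    ranked = sorted(others, key=lambda s: distance(base, s))
    return [s["name"] for s in ranked[:k]]


print(k_nearest("A", songs, 2))
```

In this sketch a song whose artists have no genres simply contributes nothing on the genre axis, which illustrates why missing genre data hurts the quality of the ranking.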

Setup

Requirements:

  • Python installed
    The ideal version to run the package is 3.8.x, the version the package was built on. Older versions may have some issues, as the package uses a handful of other packages and their versions may conflict

  • Network Connection
    So that a wide range of songs can be analyzed, it is imperative to have a network connection, at least the first time a script using this package is executed

  • A fitting playlist
    The perfect use case for this package is one big playlist (200+ songs) in which you feel like listening to some songs, then to others, but never, or rarely, to all of them, since they belong to different genres/styles

  • Patience

    It may sound like a joke, but the first mapping of the playlist to a local pandas DataFrame takes a good while: up to 2.5 to 3 seconds per song on a 20-40 Mbps Internet connection in Latin America. All these factors play a part in the loading time. To be clear, CPU and RAM will not help much; the bottleneck is the up to 5 different HTTP requests made per song

  • Jupyter Notebook
    Not exactly a requirement, but using a Jupyter notebook is advised (and the VS Code extension for Jupyter notebooks even more so), because it is important, or at least more comfortable, to keep the variable in memory and then decide how to use it, without having to run the script multiple times

  • Spotify access
    I mean, you knew that already, right?

  • Installing the package

pip install spotify-recommender-api
  • Importing the package

Firstly, it's necessary to import the method start_api from the package spotify_recommender_api.recommender:

from spotify_recommender_api.recommender import start_api

Starting the api

  • Gathering the initial information: (playlist_url, user_id)

--- Playlist URL: The playlist url is available by right clicking the playlist name, or by opening the three dots that represent the playlist options
--- Playlist ID: The playlist id is the hash string between the last '/' and the '?' in the playlist url

--- User ID: The user id is available on Spotify's website, by clicking the account and accessing its information
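
As a small illustration of the Playlist ID rule above (the string between the last '/' and the '?'), the following sketch extracts the id from a URL. The helper name `playlist_id_from_url` and the example id are hypothetical; the package itself accepts the URL directly, so this is only needed if you prefer passing `playlist_id`.

```python
def playlist_id_from_url(playlist_url: str) -> str:
    """Return the part of the URL between the last '/' and the '?' (if any)."""
    # Drop the query string, e.g. '?si=...'; a URL without '?' is left unchanged.
    without_query = playlist_url.split("?", 1)[0]
    # Take whatever follows the last '/', ignoring a trailing slash.
    return without_query.rstrip("/").rsplit("/", 1)[-1]


# Illustrative URL; the id below is a made-up placeholder.
url = "https://open.spotify.com/playlist/0123456789abcdefABCDEF?si=abc123"
print(playlist_id_from_url(url))
```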

  • Calling the function:
api = start_api(playlist_url='<PLAYLIST_URL>', user_id='<USER_ID>')

Or

api = start_api(playlist_id='<PLAYLIST_ID>', user_id='<USER_ID>')

Though, to be honest, it is easier and more convenient to use the playlist URL.

  • Getting the Auth Token: It is a hash token that expires 60 minutes after it is generated. First, answer whether you want to be redirected: press (y) to generate a new token or, if you already ran the script less than an hour ago, press (n) and paste the token you still have.
    If you were redirected, press "Get Token", select the 5 scope options, and request the token. Then hit ctrl+A / command+A to select it all and ctrl+C / command+C to copy it. Back in Python, paste it in the field requesting it and press enter.

Next, choose where the playlist is loaded from. If you already have the playlist in a previously generated CSV file, type csv and hit enter. If you do not, type web, but know that it will take a good while, as explained under Patience in the Requirements, so go get a cup of coffee, tea, or whatever you are into.
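
Since the token expires 60 minutes after it is generated, it can help to note when you requested it. The sketch below is not part of the package: `token_still_valid` and `redirect_answer` are hypothetical helpers that only decide, from a timestamp you recorded yourself, whether to answer (n) and reuse the token or (y) and request a new one.

```python
from datetime import datetime, timedelta

# Lifetime stated in this readme: the token expires 60 minutes after generation.
TOKEN_LIFETIME = timedelta(minutes=60)


def token_still_valid(generated_at, now=None):
    """Return True if less than 60 minutes have passed since the token was generated."""
    now = now or datetime.now()
    return now - generated_at < TOKEN_LIFETIME


def redirect_answer(generated_at, now=None):
    """Return 'n' when an earlier token can be pasted again, otherwise 'y'."""
    if generated_at is not None and token_still_valid(generated_at, now):
        return "n"
    return "y"


# Example with fixed timestamps: a token generated 45 minutes ago can be reused.
generated = datetime(2022, 1, 1, 12, 0)
print(redirect_answer(generated, now=datetime(2022, 1, 1, 12, 45)))
```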

Methods

  • get_playlist
# Method Use Example
api.get_playlist()
# Function that returns the pandas DataFrame representing the base playlist
  • playlist_to_csv
# Method Use Example
api.playlist_to_csv()
# Function that creates a CSV file containing the items in the playlist
# Especially useful when re-running the script without having changed the playlist
  • get_medium_term_favorites_playlist
# Parameters
get_medium_term_favorites_playlist(with_distance: bool, generate_csv: bool, 
                        generate_parquet: bool, build_playlist: bool)
# Method Use Example
api.get_medium_term_favorites_playlist(generate_csv=True, build_playlist=True)
# Function that returns the pandas DataFrame representing the 
# medium term top 5 recommendation playlist
# All parameters default to False
# The "distance" is a mathematical value with no explicit units that is 
# used by the algorithm to find the closest songs
# BUILD_PLAYLIST WILL CHANGE THE USER'S LIBRARY IF SET TO TRUE
  • get_short_term_favorites_playlist
# Parameters
get_short_term_favorites_playlist(with_distance: bool, generate_csv: bool, 
                        generate_parquet: bool, build_playlist: bool)
# Method Use Example
api.get_short_term_favorites_playlist(generate_csv=True, build_playlist=True)
# Function that returns the pandas DataFrame representing the 
# short term top 5 recommendation playlist
# All parameters default to False
# The "distance" is a mathematical value with no explicit units that is 
# used by the algorithm to find the closest songs
# BUILD_PLAYLIST WILL CHANGE THE USER'S LIBRARY IF SET TO TRUE
  • get_recommendations_for_song
# Parameters
get_recommendations_for_song(song: str, K: int, with_distance: bool, generate_csv: bool, 
                        generate_parquet: bool, build_playlist: bool, print_base_caracteristics: bool)
# Method Use Example
api.get_recommendations_for_song(song='<SONG_NAME>', K=50)
# Function that returns the pandas DataFrame representing the 
# recommendation playlist for the given song
# The 'song' and 'K' parameters are mandatory and the rest
# default to False
# The "distance" is a mathematical value with no explicit units that is 
# used by the algorithm to find the closest songs
# print_base_caracteristics will display the information of the song given as parameter
# Note that it can be used to update a playlist if the given song already
# has its playlist generated by this package
# BUILD_PLAYLIST WILL CHANGE THE USER'S LIBRARY IF SET TO TRUE
  • get_most_listened
# Parameters
get_most_listened(time_range: str = 'long', K: int = 50, build_playlist: bool = False)
# Method Use Example
api.get_most_listened(time_range='short', K=53)
# Function that returns the pandas DataFrame representing the 
# playlist of the most listened tracks in the given time range
# No parameters are mandatory but the default values should be noted 
# BUILD_PLAYLIST WILL CHANGE THE USER'S LIBRARY IF SET TO TRUE
  • update_all_generated_playlists

WILL CHANGE THE USER'S LIBRARY DRAMATICALLY

# Parameters
update_all_generated_playlists(K: int = 50)
# Method Use Example
api.update_all_generated_playlists()
# Function that updates, in batch, all the playlists previously generated by this package
# Note that if only a few updates are wanted, the methods above are a better fit
# No parameters are mandatory but the default values should be noted 
  • get_playlist_trending_genres
# Parameters
get_playlist_trending_genres(time_range: str = 'all_time', plot_top: int|bool = False)
# Method Use Example
api.get_playlist_trending_genres()
# Function that returns a pandas DataFrame with all genres within the playlist, with both 
# their number of appearances and the percentage of the entire playlist they appear in,
# for the given time_range
# Setting the plot_top parameter to one of [5, 10, 15] will draw a bar plot
# with that number of the most listened genres in the playlist
  • get_playlist_trending_artists
# Parameters
get_playlist_trending_artists(time_range: str = 'all_time', plot_top: int|bool = False)
# Method Use Example
api.get_playlist_trending_artists()
# Function that returns a pandas DataFrame with all artists within the playlist, with both
# their number of appearances and the percentage of the entire playlist they appear in,
# for the given time_range
# Setting the plot_top parameter to one of [5, 10, 15] will draw a bar plot
# with that number of the most listened artists in the playlist
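
To illustrate what the two "trending" methods report (how often each genre or artist appears, and the share of the playlist it appears in), here is a minimal standard-library sketch. It does not reproduce the package's DataFrame columns or its time_range filtering; the function name `appearance_summary`, the output keys, and the choice to count each genre at most once per song are assumptions.

```python
from collections import Counter


def appearance_summary(items_per_song):
    """Count in how many songs each item (genre or artist) appears, with its percentage.

    items_per_song is a list with one list of genres (or artists) per song.
    """
    total_songs = len(items_per_song)
    # set() makes a genre repeated within one song count only once for that song.
    counts = Counter(item for items in items_per_song for item in set(items))
    return [
        {"name": name, "count": count, "percentage": round(100 * count / total_songs, 1)}
        for name, count in counts.most_common()
    ]


# Hypothetical playlist of four songs, described only by their genres.
playlist_genres = [["rock", "indie"], ["rock"], ["jazz"], ["rock", "pop"]]
for row in appearance_summary(playlist_genres)[:2]:
    print(row)
```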

OG Scripts

###DEPRECATED###

Context

This script, kept in Jupyter notebook format for organization purposes, applies the K Nearest Neighbors technique to find the 50 songs closest to either one chosen song or one of the user's top 5 (short term). All candidates come from a specific Spotify playlist, in order to stay as consistent as possible with the chosen style. The script uses the songs' genres, artists and overall popularity as metrics of comparison between songs, and creates a new playlist with the resulting songs in the user's library

Variations

There are also 2 variations of that: a top 100 based on the medium term favorites and a top 50 based on the "short term top 5". They differ from the OG model in that the base song(s) is (are) not chosen by hand but statistically

DISCLAIMER

Not fit for direct use, since some information, such as the client id and client secret, has to be provided for the Spotify Web API to work properly; both are now kept in a hidden script listed in .gitignore so that they are not made public. Also, these scripts are deprecated, so they will not receive any maintenance, improvements or new features

Packages used

  • Pandas
pip install pandas
  • Requests
pip install requests
  • Seaborn
pip install seaborn
  • Datetime (datetime)
  • Dateutil (dateutil)
  • Webbrowser (webbrowser)
  • Json (json)
  • Operator (operator)
  • Functools (functools)
  • Os (os)
  • Re (re)

