Tool for autodownloading recommendation systems datasets

Welcome to rs_datasets

This tool lets you download, unpack, and read recommender systems datasets into a pandas.DataFrame as easily as data = Dataset().

Installation

pip install git+https://github.com/Darel13712/rs_datasets.git

Available datasets

The following datasets are available for automatic download and can be retrieved with this package.

Note: Check each dataset's license to learn its permitted use cases. The authors of this package are not affiliated with the dataset contents in any way.

Dataset               Users  Items  Interactions
Movielens             162k   62k    25m
Million Song Dataset  1m     385k   48m
Netflix               480k   17.7k  100m
Goodreads             800k   1.5m   225m
Last.fm               360k   290k   17.5m
Epinions              49k    140k   660k
Book Crossing         279k   271k   1.1m
Dating Agency         135k   169k   17.3m
Jester                73k    100    4.1m
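
The sizes above also give a rough idea of how sparse each dataset is. The sketch below is not part of rs_datasets. It is a small illustration that computes the density of the user-item matrix (interactions divided by users times items) for three rows of the table. The helper name density is hypothetical, and because the table values are rounded, the results are approximate.

```python
# Sizes copied from the table above (k = thousand, m = million).
# The table values are rounded, so the densities are approximate.
datasets = {
    "Movielens": (162_000, 62_000, 25_000_000),
    "Netflix": (480_000, 17_700, 100_000_000),
    "Jester": (73_000, 100, 4_100_000),
}


def density(users, items, interactions):
    """Fraction of the user-item matrix that has an observed interaction.

    Hypothetical helper for illustration; not provided by rs_datasets.
    """
    return interactions / (users * items)


for name, (users, items, interactions) in datasets.items():
    print(f"{name}: {density(users, items, interactions):.4%}")
```

Jester, with only 100 items, comes out far denser than the others, which is worth keeping in mind when comparing algorithms across datasets.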

Example of use

from rs_datasets import MovieLens
ml = MovieLens()
ml.info()
ratings
   user_id  item_id  rating  timestamp
0        1        1     4.0  964982703
1        1        3     4.0  964981247
2        1        6     4.0  964982224
items
   item_id  ...                                       genres
0        1  ...  Adventure|Animation|Children|Comedy|Fantasy
1        2  ...                   Adventure|Children|Fantasy
2        3  ...                               Comedy|Romance
[3 rows x 3 columns]
tags
   user_id  item_id              tag   timestamp
0        2    60756            funny  1445714994
1        2    60756  Highly quotable  1445714996
2        2    60756     will ferrell  1445714992
links
   item_id  imdb_id  tmdb_id
0        1   114709    862.0
1        2   113497   8844.0
2        3   113228  15602.0

Loaded DataFrames are available as class attributes.
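
Because each attribute is an ordinary pandas.DataFrame, the usual pandas operations apply to it directly. The sketch below assumes the attribute names match the section labels printed by ml.info() (ratings, items, tags, links). So that it runs without downloading anything, a stand-in object holds the three sample rating rows shown above. With the real package, ml = MovieLens() would take its place.

```python
from types import SimpleNamespace

import pandas as pd

# Stand-in for `ml = MovieLens()`, holding the three sample rows printed above.
# Assumption: the real object exposes the same table as `ml.ratings`.
ml = SimpleNamespace(
    ratings=pd.DataFrame(
        {
            "user_id": [1, 1, 1],
            "item_id": [1, 3, 6],
            "rating": [4.0, 4.0, 4.0],
            "timestamp": [964982703, 964981247, 964982224],
        }
    )
)

ratings = ml.ratings  # a plain pandas.DataFrame

# Convert Unix seconds to datetimes and sort chronologically.
ratings = ratings.assign(
    timestamp=pd.to_datetime(ratings["timestamp"], unit="s")
).sort_values("timestamp")

# Mean rating per user.
mean_rating = ratings.groupby("user_id")["rating"].mean()

print(ratings)
print(mean_rating)
```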

Affiliation

This package was developed with the help of my colleagues during my work at Sberbank AILab.

