
Downloads and converts .sql files from data.ppy.sh into .csv

Project description

Data PPY CSV Retrieval

Retrieve data from the data.ppy.sh dump as CSV files.

Important

I have been given permission to upload the script, but not the data.

Thus, if you want to upload the data elsewhere, please contact ppy through contact@ppy.sh.

All data provided here is done so with the intention of it being used for statistical analysis
and testing osu! subsystems.

Permission is NOT implicitly granted to deploy this in production use of any kind.
Should you wish to publicly use/expose the data provided here, please contact me first at contact@ppy.sh.

Please see https://github.com/ppy/osu-performance for more information.

Thanks,
ppy

Downloading & Converting

  1. pip install osu-data-csv

  2. Run osu-data-csv in the terminal:

    osu-data-csv
    

    A series of prompts should show up. See Arguments below for more information and examples.

  3. Alternatively, pass every option in a single command:

    osu-data-csv \
      -y "2022_12" \
      -d "mania" \
      -s "1000" \
      -l "data/" \
      -c "N" \
      -q "Y" \
      -i "path/to/ignore_mapping.yaml"
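
When the conversion is one step in a larger Python workflow, the single command above can be assembled and launched from a script. The sketch below uses only the flags documented in this README. `build_command` and `run` are hypothetical helper names, not part of osu-data-csv, and the sketch assumes the `osu-data-csv` executable is on the PATH.

```python
import subprocess


def build_command(year_month, mode, dataset, dl_dir,
                  cleanup="N", bypass_confirm="Y", ignore_path=None):
    """Assemble the osu-data-csv argument list (hypothetical helper)."""
    cmd = [
        "osu-data-csv",
        "-y", year_month,       # e.g. "2022_12"
        "-d", mode,             # catch, mania, osu or taiko
        "-s", str(dataset),     # "1000" or "10000"
        "-l", dl_dir,           # download directory
        "-c", cleanup,          # "Y" or "N"
        "-q", bypass_confirm,   # "Y" or "N"
    ]
    if ignore_path is not None:
        cmd += ["-i", ignore_path]
    return cmd


def run(cmd):
    """Launch the tool; raises CalledProcessError on a non-zero exit code."""
    subprocess.run(cmd, check=True)
```

Calling `run(build_command("2022_12", "mania", 1000, "data/", ignore_path="path/to/ignore_mapping.yaml"))` would be the scripted counterpart of the command shown in step 3. Passing the arguments as a list avoids shell quoting issues.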
    

Arguments

| Option | Shorthand | Description | Example |
| --- | --- | --- | --- |
| --year_month | -y | Dataset year and month. Fails if that dataset is no longer available. | 2022_10 |
| --mode | -d | Game mode. One of ['catch', 'mania', 'osu', 'taiko']. | mania |
| --set | -s | Dataset of the top 1K or 10K players. One of ['1000', '10000']. | 1000 |
| --dl_dir | -l | Directory to download to. Preferably empty; it does not need to exist yet. | data/ |
| --cleanup | -c | Whether to delete unused files after conversion. One of ['Y', 'N']. | N |
| --bypass_confirm | -q | Whether to bypass confirmation of downloaded and new files. One of ['Y', 'N']. | N |
| --ignore_path | -i | Path to the YAML ignore specification (see Selecting Columns). | path/to/ignore_mapping.yaml |
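
Since a download can take a while, it may be worth checking option values before launching the tool. The following is a minimal sketch that checks values against the choices listed above. `check_options` is a hypothetical helper, and the `YYYY_MM` pattern is an assumption inferred from the `2022_10` example rather than a documented format.

```python
import re

MODES = ("catch", "mania", "osu", "taiko")
SETS = ("1000", "10000")
YES_NO = ("Y", "N")


def check_options(year_month, mode, dataset, cleanup="N", bypass_confirm="N"):
    """Return a list of problems; an empty list means the values look valid."""
    problems = []
    # Assumption: YYYY_MM with a month between 01 and 12, as in "2022_10".
    if not re.fullmatch(r"\d{4}_(0[1-9]|1[0-2])", year_month):
        problems.append(f"year_month {year_month!r} is not of the form YYYY_MM")
    if mode not in MODES:
        problems.append(f"mode {mode!r} is not one of {MODES}")
    if str(dataset) not in SETS:
        problems.append(f"set {dataset!r} is not one of {SETS}")
    for name, value in (("cleanup", cleanup), ("bypass_confirm", bypass_confirm)):
        if value not in YES_NO:
            problems.append(f"{name} {value!r} is not one of {YES_NO}")
    return problems
```

A well-formed `year_month` can still fail at download time if that dataset is no longer hosted, so this check only catches typos.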

The tool retrieves the following files:

osu_user_stats_<MODE>.sql
osu_scores_<MODE>_high.sql
osu_beatmap_difficulty.sql
osu_beatmaps.sql
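
To check a download against this list, the file names can be expanded for a given mode. This is a rough sketch with one loud assumption: it substitutes the mode name literally for `<MODE>`. This README does not say how `<MODE>` expands for each mode, so verify the names against the actual dump before relying on them. `expected_sql_files` is a hypothetical helper.

```python
MODES = ("catch", "mania", "osu", "taiko")


def expected_sql_files(mode):
    """List the .sql files named in the README, with <MODE> filled in literally."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    return [
        f"osu_user_stats_{mode}.sql",    # assumed literal substitution
        f"osu_scores_{mode}_high.sql",   # assumed literal substitution
        "osu_beatmap_difficulty.sql",
        "osu_beatmaps.sql",
    ]
```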

Selecting Columns

By default every column is converted, which is slow. To speed up conversion and reduce disk usage, pass --ignore_path with a YAML file that lists the columns to skip.

  1. Download the template ignore_mapping.yaml here
  2. Comment out the fields that you want to include. Every field left uncommented is ignored.
  3. Call osu-data-csv -i path/to/ignore_mapping.yaml [other options]

For example

osu_beatmap_difficulty.sql:
#  - beatmap_id
  - mode
  - mods
  - diff_unified
  - last_update
osu_beatmaps.sql:
  - beatmap_id
  - beatmapset_id
#  - user_id
#  - filename
...

Here, we'll retrieve beatmap_id from osu_beatmap_difficulty.sql, and user_id and filename from osu_beatmaps.sql.
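
Because the file lists columns to ignore rather than columns to keep, it can be easier to start from the columns you want and derive the ignore list. The sketch below produces text in the same shape as the example above. `make_ignore_mapping` is a hypothetical helper, and the full column list per file must come from the template, which is not reproduced here.

```python
def make_ignore_mapping(columns_by_file, keep_by_file):
    """Return YAML text listing every column that is NOT in keep_by_file."""
    lines = []
    for sql_file, columns in columns_by_file.items():
        keep = set(keep_by_file.get(sql_file, ()))
        ignored = [column for column in columns if column not in keep]
        if ignored:
            lines.append(f"{sql_file}:")
            lines.extend(f"  - {column}" for column in ignored)
        else:
            # Assumption: an empty list means "ignore nothing"; how the tool
            # treats this case is not documented here.
            lines.append(f"{sql_file}: []")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    columns = {
        "osu_beatmap_difficulty.sql": [
            "beatmap_id", "mode", "mods", "diff_unified", "last_update",
        ],
    }
    keep = {"osu_beatmap_difficulty.sql": ["beatmap_id"]}
    print(make_ignore_mapping(columns, keep), end="")
```

The output can be written to a file and passed to the tool with -i.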

Output

This generates several files. The ones you want are the .csv files in the csv/ directory.

- main.py 
- <dl_dir>/
  - 202X_XX_01_performance_<MODE>_top_<SET>.tar.bz2 (*)
  - 202X_XX_01_performance_<MODE>_top_<SET>/
    - csv/
      - osu_user_stats_<MODE>.csv
      - _.csv
      - ...
    - osu_user_stats_<MODE>.sql (*)
    - _.sql (*)
    - ...
(*) These files are deleted if cleanup is enabled.
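
Once conversion finishes, the CSV files can be located and previewed from Python. This is a minimal sketch under two assumptions: that the directory name follows the `202X_XX_01_performance_<MODE>_top_<SET>` pattern shown above with the `--year_month` value as its prefix, and that each CSV starts with a header row. `csv_dir` and `read_rows` are hypothetical helpers.

```python
import csv
import itertools
from pathlib import Path


def csv_dir(dl_dir, year_month, mode, dataset):
    """Build the csv/ directory path from the layout above (assumed naming)."""
    name = f"{year_month}_01_performance_{mode}_top_{dataset}"
    return Path(dl_dir) / name / "csv"


def read_rows(csv_path, limit=5):
    """Return up to `limit` rows as dicts, assuming the first row is a header."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        return list(itertools.islice(reader, limit))
```

For example, `sorted(csv_dir("data/", "2022_12", "mania", 1000).glob("*.csv"))` would list the converted files, and `read_rows` previews the first few rows of any one of them without loading the whole file.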
