Downloads and Converts .sqls from data.ppy.sh into .csv
Data PPY CSV Retrieval
Retrieve data from the data ppy dump as CSV files.
Important
I have been given permission to upload the script, but not the data.
If you want to upload the data elsewhere, please contact ppy at contact@ppy.sh.
All data provided here is done so with the intention of it being used for statistical analysis
and testing osu! subsystems.
Permission is NOT implicitly granted to deploy this in production use of any kind.
Should you wish to publicly use/expose the data provided here, please contact me first at contact@ppy.sh.
Please see https://github.com/ppy/osu-performance for more information.
Thanks,
ppy
Downloading & Converting
- Install the package:
  pip install osu-data-csv
- Run osu-data-csv in the terminal. A series of prompts should show up. See Arguments below for more info and examples.
- (Alternatively) run it as a single command:
  osu-data-csv \
    -y "2022_12" \
    -d "mania" \
    -s "1000" \
    -l "data/" \
    -c "N" \
    -q "Y" \
    -i "path/to/ignore_mapping.yaml"
Arguments
Option | Option (Shorthand) | Desc. | Example |
---|---|---|---|
--year_month | -y | Dataset year and month. Fails if that dataset is no longer available. | 2022_10 |
--mode | -d | Gamemode. ['catch', 'mania', 'osu', 'taiko'] | mania |
--set | -s | Dataset of Top 1K or 10K players. ['1000', '10000'] | 1000 |
--dl_dir | -l | Directory to download to. Best if empty. It does not need to exist beforehand. | data/ |
--cleanup | -c | Whether to delete unused files after conversion. ['Y', 'N'] | N |
--bypass_confirm | -q | Whether to bypass confirmation of downloaded and new files. ['Y', 'N'] | N |
--ignore_path | -i | Path to YAML file ignore specification (see next section) | path/to/ignore_mapping.yaml |
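For scripted runs, the options in the table can be assembled programmatically. The following is a minimal sketch, assuming the osu-data-csv command is on PATH after installation. build_command and run are hypothetical helpers, not part of the package, and only the short flags documented above are used.

```python
import subprocess

# Allowed values, copied from the Arguments table above.
MODES = ("catch", "mania", "osu", "taiko")
SETS = ("1000", "10000")


def build_command(year_month, mode, dataset, dl_dir,
                  cleanup="N", bypass_confirm="N", ignore_path=None):
    """Build the osu-data-csv argument list (hypothetical helper)."""
    if mode not in MODES:
        raise ValueError(f"mode must be one of {MODES}, got {mode!r}")
    if dataset not in SETS:
        raise ValueError(f"set must be one of {SETS}, got {dataset!r}")
    cmd = [
        "osu-data-csv",
        "-y", year_month,
        "-d", mode,
        "-s", dataset,
        "-l", dl_dir,
        "-c", cleanup,
        "-q", bypass_confirm,
    ]
    # The ignore mapping is optional, so the flag is only added when given.
    if ignore_path is not None:
        cmd += ["-i", ignore_path]
    return cmd


def run(cmd):
    """Run the command; raises CalledProcessError on a non-zero exit."""
    subprocess.run(cmd, check=True)
```

Calling run(build_command("2022_12", "mania", "1000", "data/", bypass_confirm="Y")) would start the download without the confirmation prompt, so it is worth checking the argument list first.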
It is set to retrieve the following files:
- osu_user_stats_<MODE>.sql
- osu_scores_<MODE>_high.sql
- osu_beatmap_difficulty.sql
- osu_beatmaps.sql
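As a quick check after a download, the expected file names can be generated from the mode. This is a sketch that assumes the <MODE> placeholder is substituted literally; the actual dump may name files differently for some modes, so treat the result as a guide. expected_sql_files is a hypothetical helper.

```python
def expected_sql_files(mode):
    """List the .sql file names named above for one game mode.

    Assumption: <MODE> is replaced verbatim by the mode string.
    """
    return [
        f"osu_user_stats_{mode}.sql",
        f"osu_scores_{mode}_high.sql",
        "osu_beatmap_difficulty.sql",
        "osu_beatmaps.sql",
    ]


print(expected_sql_files("mania"))
```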
Selecting Columns
By default every column is converted, which is slow. To speed up the conversion and reduce the space taken, pass --ignore_path
with a YAML file.
- Download the template ignore_mapping.yaml.
- Comment out the fields that you want to include.
- Call osu-data-csv -i path/to/ignore_mapping.yaml [other options]
For example:
osu_beatmap_difficulty.sql:
  # - beatmap_id
  - mode
  - mods
  - diff_unified
  - last_update
osu_beatmaps.sql:
  - beatmap_id
  - beatmapset_id
  # - user_id
  # - filename
  ...
This retrieves beatmap_id from osu_beatmap_difficulty.sql, and user_id and filename from osu_beatmaps.sql.
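The mapping is an ignore list, so a field is kept only when its entry is commented out. The sketch below illustrates that rule with plain Python lists for the osu_beatmap_difficulty.sql entry from the example; it does not call the package, and kept_columns is a hypothetical helper.

```python
# Columns of osu_beatmap_difficulty.sql as listed in the example mapping.
ALL_COLUMNS = ["beatmap_id", "mode", "mods", "diff_unified", "last_update"]

# Entries left uncommented in the YAML are ignored. "# - beatmap_id" is
# commented out in the example, so it does not appear in this list.
IGNORED = ["mode", "mods", "diff_unified", "last_update"]


def kept_columns(all_columns, ignored):
    """Return the columns that are not ignored, preserving their order."""
    ignored_set = set(ignored)
    return [col for col in all_columns if col not in ignored_set]


print(kept_columns(ALL_COLUMNS, IGNORED))  # ['beatmap_id']
```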
Output
This generates a few files. The .csv files are the ones you want to retrieve.
- main.py
- <dl_dir>/
  - 202X_XX_01_performance_<MODE>_top_<SET>.tar.bz2 (*)
  - 202X_XX_01_performance_<MODE>_top_<SET>/
    - csv/
      - osu_user_stats_<MODE>.csv
      - _.csv
      - ...
    - osu_user_stats_<MODE>.sql (*)
    - _.sql (*)
    - ...
(*) files are deleted if cleanup is enabled.
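Once the conversion finishes, the CSV directory can be located from the same arguments that were passed to the tool. This is a minimal sketch, assuming the directory name follows the layout above with a fixed _01 day suffix and that each CSV starts with a header row; csv_dir and read_rows are hypothetical helpers.

```python
import csv
from pathlib import Path


def csv_dir(dl_dir, year_month, mode, dataset):
    """Directory holding the converted CSVs, following the layout above.

    Assumption: the dump directory is named
    <year_month>_01_performance_<mode>_top_<dataset>.
    """
    name = f"{year_month}_01_performance_{mode}_top_{dataset}"
    return Path(dl_dir) / name / "csv"


def read_rows(path, limit=5):
    """Read up to `limit` rows as dictionaries keyed by the header row."""
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        # zip stops once `limit` rows have been taken.
        return [row for _, row in zip(range(limit), reader)]
```

For example, read_rows(csv_dir("data/", "2022_12", "mania", "1000") / "osu_user_stats_mania.csv") would return the first five rows of the user stats table, provided that file exists.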