Music benchmark library
Holistic Music Benchmark System: HY-MAST
Official code and tools for the paper: HY-MAST: A Holistic Benchmark for Evaluating Multimodal Language Models on Music Understanding
HY-MAST is a holistic music benchmark system that automatically generates music understanding tasks and multimodal musical content, enabling comprehensive evaluation of the music understanding performance of large language models (LLMs). Its web-based application can also be used for fundamental music assessment of human musicians.
Question Types
HY-MAST can automatically generate a rich variety of question types, thereby comprehensively evaluating the multidimensional music understanding capabilities of LLMs. The automatically generated question types include:
Pitch Identification, Interval Identification, Chord Identification, Rhythm Transcription, Tempo Estimation, Tonality Identification, Meter Identification, Harmonic Analysis, Instrument Identification, Genre Identification, Lyric Transcription, Melody Transcription, Work Identification
Music Content and Sources
For its LLM-oriented evaluation framework, HY-MAST provides audio music, symbolic sheet music in ABC notation, and music theory articles. Audio music is derived from two sources: automatically synthesized standard music samples and authentic human-composed works.
Online Automatic Synthesis and AI Melody Generation
HY-MAST includes a complete set of tools for automatically synthesizing standard samples. Following the norms of music theory, it synthesizes accurate standard music samples (without AI) according to specified instructions.
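To illustrate what rule-based (non-AI) synthesis of a standard sample can look like, here is a minimal sketch that derives pitches from equal-temperament theory and writes an interval as a sine-tone WAV file. It is not the project's implementation: the function names (`note_to_midi`, `midi_to_freq`, `write_interval_wav`) and the sine-wave rendering are assumptions chosen for illustration, using only the Python standard library.

```python
import math
import struct
import wave

# Semitone offsets of the natural note names within an octave.
_NOTE_OFFSETS = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def note_to_midi(name: str) -> int:
    """Convert a note name such as 'C4' or 'F#3' to a MIDI number (C4 = 60)."""
    letter, rest = name[0].upper(), name[1:]
    accidental = 0
    while rest and rest[0] in "#b":
        accidental += 1 if rest[0] == "#" else -1
        rest = rest[1:]
    octave = int(rest)
    return 12 * (octave + 1) + _NOTE_OFFSETS[letter] + accidental


def midi_to_freq(midi: int, a4: float = 440.0) -> float:
    """Equal-temperament frequency in Hz for a MIDI note number."""
    return a4 * 2.0 ** ((midi - 69) / 12.0)


def write_interval_wav(path, low, high, seconds=1.0, rate=44100):
    """Write two notes, one after the other, as 16-bit mono sine tones.

    Returns the interval size in semitones, which serves as the label.
    """
    frames = bytearray()
    for name in (low, high):
        freq = midi_to_freq(note_to_midi(name))
        for i in range(int(seconds * rate)):
            sample = 0.5 * math.sin(2.0 * math.pi * freq * i / rate)
            frames += struct.pack("<h", int(sample * 32767))
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(bytes(frames))
    return note_to_midi(high) - note_to_midi(low)
```

Because the pitches come from a formula rather than a model, the ground-truth answer (here, the semitone distance) is known exactly at generation time, which is the property that makes synthesized samples suitable as benchmark questions.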
- Automatic synthesis
  Currently, this project supports the automatic synthesis of the following standard samples:
  single note, interval notes, chord, scale, rhythm
  Automatic synthesis tools: Auto_Benchmark_Test_for_LLM_main.py
  Main function: randomly_make_questions_with_online_synthesis
- AI Melody Generation and Automatic Audio Rendering
  For the Melody Transcription and Time Signature Recognition tasks, the target melody to be recognized is generated by a language model (LM). The LM first produces content in ABC notation, which is then automatically converted to MIDI and rendered to WAV before being supplied to the question system.
  The LM and inference framework are stored in the folder: melody_generation
- Sheet music images
  For the web application designed for human assessment, HY-MAST adds staff notation recognition tasks and automatically generates the corresponding sheet music images based on the questions.
  Tool: music21_save_picture.py
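The ABC-to-MIDI-to-WAV chain described for melody generation can be approximated with music21 and the FluidSynth command-line tool. The following is a hedged sketch, not the code in lm_gen_abc_to_audio.py: the function names, the example tune, and the output file names are hypothetical, and a SoundFont file must be supplied by the user.

```python
import subprocess
from pathlib import Path

# Hypothetical example melody in ABC notation (not taken from the benchmark).
EXAMPLE_ABC = """X:1
T:Example
M:4/4
L:1/4
K:C
C D E F | G A B c |
"""


def abc_to_midi(abc_text: str, midi_path: str) -> str:
    """Parse ABC text with music21 and write it out as a MIDI file."""
    from music21 import converter  # imported lazily; requires music21

    score = converter.parse(abc_text, format="abc")
    score.write("midi", fp=midi_path)
    return midi_path


def fluidsynth_command(midi_path: str, wav_path: str, soundfont: str,
                       sample_rate: int = 44100) -> list:
    """Build the FluidSynth CLI call that renders a MIDI file to WAV."""
    return [
        "fluidsynth",
        "-ni",                   # no MIDI input device, no interactive shell
        "-F", wav_path,          # fast-render to this audio file
        "-r", str(sample_rate),  # output sample rate
        soundfont,
        midi_path,
    ]


def render_abc(abc_text: str, out_dir: str, soundfont: str) -> Path:
    """ABC -> MIDI -> WAV; returns the path of the rendered WAV file."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    midi_path = abc_to_midi(abc_text, str(out / "melody.mid"))
    wav_path = out / "melody.wav"
    subprocess.run(
        fluidsynth_command(midi_path, str(wav_path), soundfont), check=True
    )
    return wav_path
```

Keeping the command construction in its own function makes it possible to inspect or log the exact FluidSynth call before any audio is rendered.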
Real Music
We have carefully selected a large number of human-composed musical works. The labels and information for these pieces are stored in the file: real_audio/metadata.jsonl
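Since the metadata is stored as JSON Lines, it can be inspected with a few lines of standard-library Python. The sketch below makes no assumption about the field names in metadata.jsonl, which are not documented here; `load_metadata` and `field_names` are hypothetical helpers that simply read each line as one JSON object and report which keys occur.

```python
import json
from pathlib import Path


def load_metadata(path="real_audio/metadata.jsonl"):
    """Read a JSON Lines file into a list, one parsed object per non-empty line."""
    records = []
    with Path(path).open(encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            try:
                records.append(json.loads(line))
            except json.JSONDecodeError as err:
                raise ValueError(f"{path}:{line_number}: invalid JSON") from err
    return records


def field_names(records):
    """Collect the keys that actually occur, since the schema is not assumed."""
    names = set()
    for record in records:
        names.update(record)
    return sorted(names)
```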
Real-music resources are freely available for download. Their categories and corresponding online sources are listed as follows:
1. Pop Music
   https://www.billboard.com/charts/greatest-hot-100-singles
   save folder: real_audio/pop/audio/
2. Classical Music
   https://www.kaggle.com/datasets/imsparsh/musicnet-dataset
   save folder: real_audio/classical/audio/
3. World Music
   3.1. Turkey
   https://zenodo.org/records/1283349
   save folder: real_audio/folk/turkey/audio/
   3.2. Arab
   https://zenodo.org/records/1291776
   save folder: real_audio/folk/arab/audio/
   3.3. China
   https://zenodo.org/records/344932
   save folder: real_audio/folk/china/audio/
   3.4. India
   https://zenodo.org/records/4301737
   save folder: real_audio/folk/india/audio/
Please download the music files from the above websites and save them to the designated folders.
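Before running the benchmark, it can help to confirm that every designated save folder exists and is not empty. The following sketch is an optional convenience, not part of the project: `count_files` and `missing_or_empty` are hypothetical helpers, and the folder list is copied from the save folders named above.

```python
from pathlib import Path

# Save folders listed above, relative to the dataset root.
EXPECTED_FOLDERS = [
    "pop/audio",
    "classical/audio",
    "folk/turkey/audio",
    "folk/arab/audio",
    "folk/china/audio",
    "folk/india/audio",
]


def count_files(root="real_audio"):
    """Return {folder: number of files}; -1 marks a folder that does not exist."""
    root_path = Path(root)
    counts = {}
    for rel in EXPECTED_FOLDERS:
        folder = root_path / rel
        if folder.is_dir():
            counts[rel] = sum(1 for p in folder.iterdir() if p.is_file())
        else:
            counts[rel] = -1
    return counts


def missing_or_empty(root="real_audio"):
    """List the folders that are absent or contain no files."""
    return [rel for rel, n in count_files(root).items() if n <= 0]
```

An empty list from `missing_or_empty()` indicates that each folder holds at least one file; it does not verify that the files are the expected recordings.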
Installation and Usage
Requirements
- Python library
  music21
  pydub
  cairosvg
  torch==2.6.0
  transformers==4.57.6
- Conda library
  fluidsynth
  Recommended installation method:
  conda install -c conda-forge fluidsynth
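A quick way to confirm that the requirements above are visible to the current environment is sketched below. It only checks that each Python module can be located and that a `fluidsynth` executable is on the PATH (an assumption about how the conda package is used); it does not verify the pinned versions. `check_environment` is a hypothetical helper, not part of the project.

```python
import importlib.util
import shutil

# Import names of the Python libraries listed in the requirements.
PYTHON_MODULES = ["music21", "pydub", "cairosvg", "torch", "transformers"]


def check_environment(modules=PYTHON_MODULES, executables=("fluidsynth",)):
    """Return a dict mapping each requirement to True (found) or False."""
    status = {}
    for name in modules:
        status[name] = importlib.util.find_spec(name) is not None
    for exe in executables:
        status[exe] = shutil.which(exe) is not None
    return status


if __name__ == "__main__":
    for requirement, found in check_environment().items():
        print(f"{requirement}: {'ok' if found else 'MISSING'}")
```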
Usage
- Main program
  Run the main program directly to generate the complete benchmark:
  python Auto_MultiModal_Music_Benchmark_Main.py --seed 0 --number_of_processes 2 --each_question_number 4
  each_question_number: Number of questions per category
  number_of_processes: Number of processes
  Optional parameters:
  --music_data_dir, type=str, default="real_audio", help="Path of the real music dataset folder. Please provide the absolute path."
  Attention! each_question_number must be divisible by number_of_processes.
- Tools
  Automatic fundamental perception question generation tools:
  Auto_Benchmark_Test_for_LLM_main.py
  Automatic synthesis tools:
  Algorithm_for_Music_Synthesize.py
  Batch ABC-to-WAV generation tools:
  lm_gen_abc_to_audio.py
  Pseudorandom tools:
  pseudorandom_tools.py
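The divisibility rule for the main program's arguments can be checked before launching any worker. The sketch below is an illustration under stated assumptions, not the project's code: the parser only mirrors the flags shown in the usage line, the defaults other than `real_audio` are invented, and deriving each worker's seed as the base seed plus the worker index is a hypothetical scheme.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the flags shown in the usage line."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--seed", type=int, default=0)
    parser.add_argument("--number_of_processes", type=int, default=1)
    parser.add_argument("--each_question_number", type=int, default=1)
    parser.add_argument("--music_data_dir", type=str, default="real_audio")
    return parser


def questions_per_process(each_question_number, number_of_processes):
    """Split the per-category question count evenly, or raise if impossible."""
    if number_of_processes < 1:
        raise ValueError("number_of_processes must be at least 1")
    if each_question_number % number_of_processes != 0:
        raise ValueError(
            f"each_question_number ({each_question_number}) must be divisible "
            f"by number_of_processes ({number_of_processes})"
        )
    return each_question_number // number_of_processes


def plan(argv):
    """Return one (worker_index, seed, question_count) tuple per process."""
    args = build_parser().parse_args(argv)
    share = questions_per_process(
        args.each_question_number, args.number_of_processes
    )
    # Assumption: each worker gets its own seed so that runs are reproducible.
    return [(i, args.seed + i, share) for i in range(args.number_of_processes)]
```

With the arguments from the usage line (`--seed 0 --number_of_processes 2 --each_question_number 4`), each of the two workers would be assigned two questions per category under this scheme.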
File details
Details for the file music_benchmark-0.0.2.tar.gz.
File metadata
- Download URL: music_benchmark-0.0.2.tar.gz
- Upload date:
- Size: 92.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | c1fc4f20d600aae72bc07a30e30847c382ccede8061245b3f55212c584041130 |
| MD5 | 53120b0b9962802c535aa2fdabc274fd |
| BLAKE2b-256 | 393872a648f14e887a725ca46b5a5f4849195fd9d2c81fd5b1f3aeb3ed19a7e9 |
File details
Details for the file music_benchmark-0.0.2-py3-none-any.whl.
File metadata
- Download URL: music_benchmark-0.0.2-py3-none-any.whl
- Upload date:
- Size: 98.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 00800b9adb2fd244ddf0b1e2aa23df76efdba8e2b1df09c9eae674834c1233bc |
| MD5 | f0f51df64835cd9076aa5b8bcaaeca35 |
| BLAKE2b-256 | c58a6876e978e86dbdd4990e1242ec333c9c8cdb9576c9f15922ce773d4c7bc1 |