The clean and modern way of accessing IMSLP data and scores programmatically.

imslp

Installation

The package is available on PyPI and can be installed using your favorite package manager:

pip install imslp

Data Sources

Where possible, this project uses robust sources of data that do not require web scraping:

  • MediaWiki API. IMSLP is one of tens of thousands of websites built on top of MediaWiki, the framework created for Wikipedia.org. As such, it can be accessed through the MediaWiki API for which, fortunately, there exists a fantastic Python wrapper library called mwclient.

  • IMSLP API. For convenience, IMSLP built some ad-hoc scripts that can be used to get a list of people and a list of works in a variety of formats, including JSON.

It also uses scraping to collect additional information (such as the number of pages in a score, the number of times a score was downloaded, or the user-provided ratings).
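As a rough sketch of what querying the two non-scraping sources looks like at the URL level, the helpers below build request URLs without sending anything. The endpoint paths (`api.php` at the site root and `imslpscripts/API.ISCR.php`), the slash-separated parameter layout, and the meaning of `type` (1 for people, 2 for works) are assumptions based on how these services are commonly called, not something this README specifies. The function names are hypothetical and are not part of the imslp package.

```python
from urllib.parse import urlencode

# Assumed endpoints; verify them against the live site before relying on them.
MEDIAWIKI_API = "https://imslp.org/api.php"
IMSLP_API = "https://imslp.org/imslpscripts/API.ISCR.php"


def category_members_url(category: str, limit: int = 50) -> str:
    """Build a standard MediaWiki `categorymembers` query URL."""
    params = {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": category,
        "cmlimit": limit,
        "format": "json",
    }
    return f"{MEDIAWIKI_API}?{urlencode(params)}"


def imslp_list_url(list_type: int, start: int = 0) -> str:
    """Build a URL for IMSLP's list script (assumed: 1 = people, 2 = works)."""
    # The script is assumed to take slash-separated key=value pairs.
    query = "/".join([
        "account=worklist",
        "disclaimer=accepted",
        "sort=id",
        f"type={list_type}",
        f"start={start}",
        "retformat=json",
    ])
    return f"{IMSLP_API}?{query}"
```

In practice the MediaWiki side is better handled through mwclient, which takes care of pagination and continuation tokens; the URL builder is only meant to show what travels over the wire.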

Some quirks of IMSLP

Although IMSLP fortunately runs on MediaWiki, a widely used open-source wiki platform, it has a handful of quirks, such as:

  • Composers are stored as categories, for instance Category:Scarlatti, Domenico. For each composer, there are usually three tabs: "Compositions", "Collaborations" and "Collections"; these are stored as separate categories whose names concatenate the composer and the subtype, such as Category:Scarlatti, Domenico/Collections.

  • PDF files for sheet music are stored as "images"; unfortunately, for the time being, the URL scheme does not appear in the URLs computed for the files, so these need to be patched manually.

  • The imslpdisclaimeraccepted cookie must be set to "yes" for files to download properly (otherwise, downloading any file will result in the disclaimer page). With mwclient, this can be specified on login.

    cookies = {
        "imslp_wikiLanguageSelectorLanguage": "en",
        "imslpdisclaimeraccepted": "yes",
    }
    
  • Much of the metadata associated with images, such as the internal ID or the download counter, is stored separately from the MediaWiki metadata. This makes scraping the rendered HTML page a necessary endeavour.
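The first two quirks in the list above come down to simple string handling. The following is a minimal sketch under two assumptions: that subcategory names are formed exactly as in the Category:Scarlatti, Domenico/Collections example, and that the "missing scheme" shows up as protocol-relative URLs starting with `//`. Both helper names and the example file path are hypothetical and are not taken from the imslp package.

```python
from typing import Optional


def composer_category(name: str, subtype: Optional[str] = None) -> str:
    """Return the category title for a composer, optionally with a subtype.

    `name` uses the "Last, First" form, e.g. "Scarlatti, Domenico".
    """
    title = f"Category:{name}"
    return f"{title}/{subtype}" if subtype else title


def patch_file_url(url: str, scheme: str = "https") -> str:
    """Prepend a scheme to a protocol-relative URL; leave other URLs alone."""
    if url.startswith("//"):
        return f"{scheme}:{url}"
    return url


# Example with a placeholder file path (not a real IMSLP file):
print(composer_category("Scarlatti, Domenico", "Collections"))
print(patch_file_url("//imslp.org/images/example.pdf"))
```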

Fortunately, all these quirks are handled by this package!
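For code that fetches files without this package or mwclient, the disclaimer cookie can be attached by hand. The sketch below uses only the standard library and builds a request without sending it; the cookie names and values are the ones listed above, while the helper names and the example URL are hypothetical.

```python
from urllib.request import Request

# Cookie names and values as listed in the quirks above.
IMSLP_COOKIES = {
    "imslp_wikiLanguageSelectorLanguage": "en",
    "imslpdisclaimeraccepted": "yes",
}


def cookie_header(cookies: dict) -> str:
    """Serialize a dict of cookies into a single Cookie header value."""
    return "; ".join(f"{name}={value}" for name, value in cookies.items())


def build_download_request(url: str) -> Request:
    """Create a request that carries the disclaimer cookie (not yet sent)."""
    return Request(url, headers={"Cookie": cookie_header(IMSLP_COOKIES)})
```

Passing the resulting object to `urllib.request.urlopen` would perform the download. Whether IMSLP applies further checks beyond this cookie is not covered here.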

Related Projects

Here are a handful of other related projects available on GitHub to access the IMSLP data programmatically:

  • jjjake/imslp-scrape: Last commit in May 2012 (32 commits), mix of Python and shell, scraping the website for data (people, score links) with HTML parsing.

  • FrankTheCodeMonkey/IMSLP-Scraper: Last commit in June 2020 (6 commits), Python, scraping the website for data and scores, with HTML parsing and Selenium.

  • josefleventon/imslp-api: Last commit in May 2020 (17 commits), JavaScript, uses IMSLP's custom API to get the list of people and list of works programmatically through a web API query.

More recently, and in other languages:

Acknowledgements

Let's be clear that all the heavy lifting is done by mwclient—and the volunteers who uploaded and/or scanned and/or typeset the scores on IMSLP.

License

This project is licensed under the LGPLv3 license, with the understanding that importing a Python module is similar in spirit to dynamically linking against a library.

  • You can use the library imslp in any project, for any purpose, as long as you provide some acknowledgement to this original project for use of the library.

  • If you make improvements to imslp, you are required to make those changes publicly available.
