Package for converting music metadata to XML

musicscan - Music File Scanner

The musicscan package is a software library that extracts metadata from a digital music collection and builds a set of XML files adhering to the vtmedia schema.

It includes a tool that recursively scans a directory for audio files containing ID3 tags and uses that data as the basis for building XML files.

How It Works

The tool assumes that the digital library was created by importing CDs into a music ecosystem like iTunes or Windows Media Player, so it uses the nomenclature of physical CDs. That is, every audio file scanned represents a single track that was imported from a physical Compact Disc, and the tool uses that file's metadata to build the XML files. If the metadata lacks information such as the track number or disc number, the import will probably not work.
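The requirement above can be checked before running a scan. The following is a minimal sketch, not the check id3scan itself performs: the list of required fields and the `missing_fields` helper are assumptions made for illustration, and the tag objects are simple stand-ins for whatever a tag reader such as TinyTag returns.

```python
from types import SimpleNamespace

# Hypothetical list of fields a track needs before it can be placed on a disc.
# This is an illustrative assumption, not the checks id3scan actually performs.
REQUIRED_FIELDS = ("album", "title", "track", "disc")


def missing_fields(tag):
    """Return the required field names that are absent or empty on `tag`."""
    return [
        name for name in REQUIRED_FIELDS
        if getattr(tag, name, None) in (None, "")
    ]


# Stand-ins for tag objects; a tag reader exposes similar attributes.
complete = SimpleNamespace(album="No Fences", title="Wolves", track=10, disc=1)
partial = SimpleNamespace(album="No Fences", title="Wolves", track=None, disc=None)

print(missing_fields(complete))
print(missing_fields(partial))
```

A file that reports missing track or disc fields is a likely candidate for fixing in the music player before it is scanned.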

The benefits of extracting the metadata are:

  1. Maintaining a separate copy of the metadata away from the music library.
  2. Extending the metadata with extra fields that may not be supported with ID3 tags.
  3. Searching the metadata with tools that may not be available to your music player.
  4. Sharing the metadata with other users.

XML Schema

The XML generated by the tool adheres to the VTMedia schema, which is maintained in the repository below.

Repository      Purpose
vtmedia-schema  Schema and XML validation for media data

The schema can be loaded into command-line tools, IDEs, or custom applications to check the validity of the metadata files. The repository also contains example music data that was generated using the id3tool code and then edited for accuracy.

Scanning Example

Suppose you have a directory like this:

$ ls -1 "$HOME/Music/iTunes/iTunes Media/Music/Garth Brooks/No Fences/"
01 The Thunder Rolls.m4a
02 New Way To Fly.m4a
03 Two Of A Kind, Workin' On A Full House.m4a
04 Victim Of The Game.m4a
05 Friends In Low Places.m4a
06 Wild Horses.m4a
07 Unanswered Prayers.m4a
08 Same Old Story.m4a
09 Mr. Blue.m4a
10 Wolves.m4a

The id3scan tool will search that directory and create three files. Make sure the directory where id3scan is installed is on your command path.

$ id3scan --musicpath "$HOME/Music/iTunes/iTunes Media/Music/Garth Brooks/No Fences" --write --outdir ~/tmp --split-xml
$ ls -1 ~/tmp/
garth_brooks_no_fences-1990-album.xml
garth_brooks_no_fences-1990-audiocd.xml
garth_brooks_no_fences-1990-cd01-index.xml

Each file contains different aspects of the album data.

File                                        Data
garth_brooks_no_fences-1990-audiocd.xml     Information on the physical media
garth_brooks_no_fences-1990-album.xml       Information about each song
garth_brooks_no_fences-1990-cd01-index.xml  Track order information for each song
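The file names in the example appear to follow an artist, album, year, and role pattern. The sketch below reproduces those three names under that assumption. The `slug` and `split_file_names` helpers are hypothetical, the per-disc index naming for multi-disc albums is a guess, and none of this is taken from the id3scan source.

```python
import re


def slug(text):
    """Lowercase `text` and collapse runs of non-alphanumerics into underscores."""
    return re.sub(r"[^a-z0-9]+", "_", text.lower()).strip("_")


def split_file_names(artist, album, year, disc_count=1):
    """Build the album, audiocd, and per-disc index names for one release.

    Assumed naming pattern, inferred only from the example listing.
    """
    base = f"{slug(artist)}_{slug(album)}-{year}"
    names = [f"{base}-album.xml", f"{base}-audiocd.xml"]
    names += [f"{base}-cd{n:02d}-index.xml" for n in range(1, disc_count + 1)]
    return names


for name in split_file_names("Garth Brooks", "No Fences", 1990):
    print(name)
```

A helper like this can be handy for locating the output files of a batch scan, but the actual names should always be read from the output directory.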

The XML files are tied together with XInclude directives. If the --split-xml option is omitted, only one output file is generated.
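Because the split files reference each other through XInclude, a reader has to expand the includes to see the whole document. The sketch below does this with the standard library's `xml.etree.ElementInclude`. The `load_with_includes` helper and the two tiny stand-in files are assumptions for illustration; real id3scan output has more structure, and the exact placement of its `xi:include` elements is not shown here.

```python
import tempfile
from pathlib import Path
from xml.etree import ElementInclude, ElementTree


def load_with_includes(path):
    """Parse `path` and expand xi:include elements relative to its directory."""
    path = Path(path)
    base_dir = path.parent

    def loader(href, parse, encoding=None):
        # Resolve every href against the directory of the top-level file.
        target = base_dir / href
        if parse == "xml":
            return ElementTree.parse(target).getroot()
        return target.read_text(encoding=encoding or "utf-8")

    root = ElementTree.parse(path).getroot()
    ElementInclude.include(root, loader=loader)
    return root


# Tiny stand-in files; the real id3scan output has more structure than this.
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    (tmp / "album.xml").write_text("<album><title>No Fences</title></album>")
    (tmp / "audiocd.xml").write_text(
        "<medialist xmlns:xi='http://www.w3.org/2001/XInclude'>"
        "<xi:include href='album.xml'/>"
        "</medialist>"
    )
    merged = load_with_includes(tmp / "audiocd.xml")
    included_title = merged.findtext("album/title")

print(included_title)
```

The custom loader keeps relative `href` values working no matter which directory the script is run from.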

The Audio CD File

The audio CD file is the main file of the data structure. Most of its content describes the structure of the physical CD.

<medialist xmlns='http://vectortron.com/xml/media/media' xmlns:xi='http://www.w3.org/2001/XInclude'>
 <!-- created by id3scan (2024-04-06 15:33:06.931311) -->
 <media>
  <title>
   <main>No Fences</main>
  </title>
  <medium>
   <release>
    <type><audiocd/></type>
   </release>
   <productSpecs>
    <inventory>
     <case>
      <cd id='cd01'/>
     </case>
    </inventory>
   </productSpecs>
  </medium>
...

The Index File

The index file contains information about track order, with references to the songs on the album.

<cdIndex ref='cd01' xmlns='http://vectortron.com/xml/media/media'>
 <track no='1'>
  <index no='01'>
   <content ref='ttr01'/>
  </index>
 </track>
...
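As a rough illustration of how the index can be consumed, the sketch below reads the track order out of the excerpt with `xml.etree.ElementTree`. The excerpt is closed off so that it parses on its own, and the `track_order` helper is a hypothetical name, not part of musicscan.

```python
from xml.etree import ElementTree

MEDIA_NS = {"m": "http://vectortron.com/xml/media/media"}

# The index excerpt, closed off so that it parses on its own.
INDEX_XML = """
<cdIndex ref='cd01' xmlns='http://vectortron.com/xml/media/media'>
 <track no='1'>
  <index no='01'>
   <content ref='ttr01'/>
  </index>
 </track>
</cdIndex>
"""


def track_order(xml_text):
    """Return (track number, content ref) pairs in document order."""
    root = ElementTree.fromstring(xml_text)
    pairs = []
    for track in root.findall("m:track", MEDIA_NS):
        for content in track.findall("m:index/m:content", MEDIA_NS):
            pairs.append((int(track.get("no")), content.get("ref")))
    return pairs


print(track_order(INDEX_XML))
```

Each `ref` value can then be matched against the `id` attribute of a song in the album file.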

The Album File

The album file contains all of the information about the songs.

<album xmlns='http://vectortron.com/xml/media/audio'>
 <title>No Fences</title>
 <catalog>
  <artists>
   <artist><unkn>Garth Brooks</unkn></artist>
  </artists>
 </catalog>
 <classification>
  <genres>
   <primary>Country</primary>
  </genres>
 </classification>
 <elements>
  <song id='w01'>
   <title>
    <main>Wolves</main>
   </title>
   <catalog>
    <composers>
     <composer><unkn>Stephanie Davis</unkn></composer>
    </composers>
   </catalog>
   <technical>
    <studioRecording/>
    <runtime>
     <overall>PT4M8.89S</overall>
    </runtime>
   </technical>
  </song>
...
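One benefit of having the metadata in XML is that ordinary tools can query it. The sketch below lists song titles and runtimes from an abbreviated copy of the excerpt, converting the ISO 8601 style duration to seconds. The `duration_seconds` and `song_runtimes` helpers are hypothetical, and the regular expression only covers the hours, minutes, and seconds forms, which is an assumption about what the tool emits.

```python
import re
from xml.etree import ElementTree

AUDIO_NS = {"a": "http://vectortron.com/xml/media/audio"}

# Abbreviated copy of the album excerpt, closed off so that it parses on its own.
ALBUM_XML = """
<album xmlns='http://vectortron.com/xml/media/audio'>
 <title>No Fences</title>
 <elements>
  <song id='w01'>
   <title>
    <main>Wolves</main>
   </title>
   <technical>
    <runtime>
     <overall>PT4M8.89S</overall>
    </runtime>
   </technical>
  </song>
 </elements>
</album>
"""

# Matches durations such as PT4M8.89S or PT1H2M3S (time components only).
DURATION = re.compile(r"PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+(?:\.\d+)?)S)?$")


def duration_seconds(value):
    """Convert a PT..H..M..S duration string to seconds."""
    match = DURATION.match(value)
    if match is None:
        raise ValueError(f"unsupported duration: {value!r}")
    hours, minutes, seconds = match.groups()
    return int(hours or 0) * 3600 + int(minutes or 0) * 60 + float(seconds or 0)


def song_runtimes(xml_text):
    """Map each song's main title to its overall runtime in seconds."""
    root = ElementTree.fromstring(xml_text)
    runtimes = {}
    for song in root.findall("a:elements/a:song", AUDIO_NS):
        title = song.findtext("a:title/a:main", namespaces=AUDIO_NS)
        overall = song.findtext("a:technical/a:runtime/a:overall", namespaces=AUDIO_NS)
        runtimes[title] = duration_seconds(overall)
    return runtimes


for title, seconds in song_runtimes(ALBUM_XML).items():
    print(f"{title}: {seconds:.2f} s")
```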

Documentation

The id3scan tool is documented in id3scan.rst in the doc directory.

The EDITING.md file explains how to edit the XML generated by id3scan.

Building And Installing From Source Code

Assuming a normal Python 3 environment with the setuptools and build modules installed, run the build module in the top-level directory of the repository.

$ cd musicscan
$ python -m build 

This code depends on the TinyTag module, which should be installed automatically during the build process.

Package Distribution

The package can be installed from PyPI with pip.

$ python -m pip install --user musicscan

