Music file hasher

# mp3hash

Hashes music files ignoring meta-data.

Useful for detecting the same song in differently tagged files.

# Use

Like `sha1sum` or `md5sum`, it takes one or more files and prints their hashes:

    $ ./mp3hash *.mp3
    6611bc5b01a2fc6a6386a871e8c51f86e1f12b33 13_Hotel-California-(Gipsy-Kings).mp3
    6611bc5b01a2fc6a6386a871e8c51f86e1f12b33 14_Hotel-California-(Gipsy-Kings).mp3

It returns the same hash for both files even though their tags differ, which is why their regular hashes do not match:

    $ sha1sum *.mp3
    6a1d5f8317add10e205ae30174630b47645fb5b4 13_Hotel-California-(Gipsy-Kings).mp3
    c28d6976114d31df3366d9935eb0bedd36cf1f0b 14_Hotel-California-(Gipsy-Kings).mp3

The hash is computed strictly from the music data in the file, by calculating the sizes of the tags
and omitting them.
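
As a rough illustration of that idea (not the program's actual code), the sketch below hashes only a byte range of a stream, assuming the tag sizes at the start and end are already known. The helper name `hash_region` and the fake in-memory files are hypothetical.

```python
import hashlib
import io


def hash_region(stream, start, end, chunk_size=64 * 1024):
    """Return the SHA-1 hex digest of the bytes in [start, end), read in chunks."""
    hasher = hashlib.sha1()
    stream.seek(start)
    remaining = end - start
    while remaining > 0:
        chunk = stream.read(min(chunk_size, remaining))
        if not chunk:  # stream ended earlier than expected
            break
        hasher.update(chunk)
        remaining -= len(chunk)
    return hasher.hexdigest()


# Two fake "files": the same audio payload wrapped in different tag bytes.
audio = b"\xff\xfb" + b"audio-frames" * 100
file_a = io.BytesIO(b"A" * 10 + audio + b"B" * 128)
file_b = io.BytesIO(b"C" * 30 + audio + b"D" * 128)

# Skipping the (assumed) leading and trailing tag bytes leaves only the audio.
digest_a = hash_region(file_a, 10, 10 + len(audio))
digest_b = hash_region(file_b, 30, 30 + len(audio))
```

Both digests match because only the shared payload is fed to the hasher, while hashing the whole streams would give different results.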

The default hashing algorithm is `sha-1`, but any algorithm can be used as long as it is supported by
Python's `hashlib` module. A complete list of the available hashing algorithms can be obtained
by calling the program with the `--list-algorithms` option.

    $ ./mp3hash --list-algorithms
    md5
    sha1
    sha224
    sha256
    sha384
    sha512

    $ ./mp3hash --algorithm md5 *.mp3
    ac0fdd89454528d3fbdb19942a2e6653 13_Hotel-California-(Gipsy-Kings).mp3
    ac0fdd89454528d3fbdb19942a2e6653 14_Hotel-California-(Gipsy-Kings).mp3
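
For reference, this is roughly how an algorithm can be selected by name with `hashlib` in Python 3. It is a minimal sketch, not the script's own option handling, and the function names `list_algorithms` and `digest` are made up for the example. `hashlib.algorithms_guaranteed` is not available on Python 2.6.

```python
import hashlib


def list_algorithms():
    """Return the algorithm names hashlib guarantees on every platform, sorted."""
    return sorted(hashlib.algorithms_guaranteed)


def digest(data, algorithm="sha1"):
    """Hash a bytes object with any algorithm name that hashlib accepts."""
    return hashlib.new(algorithm, data).hexdigest()


md5_hex = digest(b"music data", "md5")
sha1_hex = digest(b"music data")  # defaults to SHA-1
```

`hashlib.new` raises `ValueError` for an unknown name, which is a natural place to report an unsupported `--algorithm` value.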

# Install

It has no dependencies besides `python2.6+`, so you should be able to run the script
directly.

# Technical details

Supported (and ignored) meta-data tags are id3v1 and id3v2, both in their simple and extended forms.

## About id3v1

- id3v1 is 128 bytes at the end of the file starting with 'TAG'
- id3v1 extended is 227 bytes before regular id3v1 tag starting with 'TAG+'

total size: 128 + (227 if extended)
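
The rule above could be sketched as follows. This is an illustrative reading of the two bullet points, not the script's real code. The function name `id3v1_size` and the constants are hypothetical.

```python
import io

ID3V1_SIZE = 128           # regular tag, starts with b"TAG"
ID3V1_EXTENDED_SIZE = 227  # extended tag, starts with b"TAG+"


def id3v1_size(stream):
    """Return how many trailing bytes are taken by id3v1 tags (0 if none)."""
    stream.seek(0, io.SEEK_END)
    file_size = stream.tell()
    if file_size < ID3V1_SIZE:
        return 0
    stream.seek(file_size - ID3V1_SIZE)
    if stream.read(3) != b"TAG":
        return 0
    size = ID3V1_SIZE
    # The extended tag, when present, sits right before the regular one.
    extended_start = file_size - ID3V1_SIZE - ID3V1_EXTENDED_SIZE
    if extended_start >= 0:
        stream.seek(extended_start)
        if stream.read(4) == b"TAG+":
            size += ID3V1_EXTENDED_SIZE
    return size


plain = io.BytesIO(b"x" * 500 + b"TAG" + b"\0" * 125)
extended = io.BytesIO(b"x" * 500 + b"TAG+" + b"\0" * 223 + b"TAG" + b"\0" * 125)
```

Here `id3v1_size(plain)` gives 128 and `id3v1_size(extended)` gives 128 + 227 = 355.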

## About id3v2

- The id3v2 header occupies the first 10 bytes of the file (offsets below are 0-based):
  - byte 5 holds the flags; bit 6 (`0x40`) indicates an extended header
  - bytes 6-9 are the tag size (not counting the header)

- The id3v2 extended header is 10 bytes long and follows the regular id3v2 header (offsets below are 0-based):
  - bytes 0-3 are the tag size (not counting header or padding)
  - bytes 4-5 hold some flags; the leftmost bit indicates CRC presence
  - bytes 6-9 are the tag padding size (extra blank space within the tag)

total size: 10 + tagsize + (10 + etagsize + padding if extended)
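
A minimal sketch of decoding the regular id3v2 header, following the field layout listed above and the id3v2.3.0 convention that the tag size is a "synchsafe" integer (7 significant bits per byte). The names `synchsafe_to_int` and `read_id3v2_header` are hypothetical, and the extended header accounting from the formula above is deliberately left out.

```python
import io

ID3V2_HEADER_SIZE = 10
EXTENDED_FLAG = 0x40  # bit 6 of the flags byte


def synchsafe_to_int(raw):
    """Decode a big-endian synchsafe integer (7 significant bits per byte)."""
    value = 0
    for byte in raw:
        value = (value << 7) | (byte & 0x7F)
    return value


def read_id3v2_header(stream):
    """Return (tag_size, has_extended_header), or None when there is no id3v2 tag."""
    stream.seek(0)
    header = stream.read(ID3V2_HEADER_SIZE)
    if len(header) < ID3V2_HEADER_SIZE or header[:3] != b"ID3":
        return None
    flags = header[5]                         # 0-based offset 5
    tag_size = synchsafe_to_int(header[6:10])  # 0-based offsets 6-9
    return tag_size, bool(flags & EXTENDED_FLAG)


# Hand-built header: "ID3", version 2.3.0, extended flag set, size 0x00 0x00 0x02 0x01.
sample = io.BytesIO(b"ID3" + bytes([3, 0, 0x40, 0, 0, 2, 1]) + b"\0" * 257)
parsed = read_id3v2_header(sample)
```

For the sample above the size bytes decode to (2 << 7) | 1 = 257, so `parsed` is `(257, True)`.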

Based on id3v1 wikipedia docs: http://en.wikipedia.org/wiki/ID3
Based on id3v2 docs: http://www.id3.org/id3v2.3.0
