acidcat

Audio metadata explorer and analysis tool -- like exiftool, but for audio.

Reads BPM, key, duration, tags, and format info from WAV, AIFF, MP3, FLAC, OGG, Opus, M4A, MIDI, and Serum presets. Also structurally decodes Bitwig (.bwpreset/.bwclip), Native Instruments (Massive/Absynth/Kontakt/NKS/KORE), Vital, NCW, and MP4 containers via inspect. Zero dependencies for core metadata. Optional librosa analysis for BPM/key detection and ML feature extraction.

Also ships per-library SQLite indexes (acidcat index) tracked in a small global registry, plus an MCP server (acidcat-mcp) so an LLM can query your whole collection across libraries by BPM, key, tags, or full-text search.

Install

git clone https://github.com/hed0rah/acidcat.git
cd acidcat
pip install -e .                # core + mutagen (WAV/AIFF/MIDI/Serum/MP3/FLAC/OGG/Opus/M4A)
pip install -e ".[analysis]"    # + librosa BPM/key detection
pip install -e ".[ml]"          # + sklearn similarity/clustering
pip install -e ".[mcp]"         # + MCP server (acidcat-mcp, stdio)
pip install -e ".[mcp-http]"    # + MCP streamable-HTTP transport (acidcat-mcp --transport http)
pip install -e ".[all]"         # everything

Quick Start

# single file -- instant metadata
acidcat kick_808.wav
acidcat loop.mp3
acidcat pad.flac

# pipe from stdin
cat file.wav | acidcat
curl https://example.com/loop.mp3 | acidcat -

# JSON output for piping
acidcat kick_808.wav -f json | jq .BPM

# deep analysis with librosa
acidcat kick_808.wav --deep

# scan a mixed-format directory
acidcat scan ~/Samples/Breaks -n 200
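
The JSON output can also be consumed from a script rather than jq. The following is a minimal sketch, assuming that a single file yields one JSON object with a BPM key (as the jq example above suggests); the helper names bpm_from_json and read_bpm are hypothetical and not part of acidcat.

```python
import json
import subprocess


def bpm_from_json(text):
    """Return the BPM from acidcat-style JSON text, or None if it is absent."""
    data = json.loads(text)
    # Assumption: one JSON object per file with a top-level "BPM" key.
    value = data.get("BPM")
    return float(value) if value is not None else None


def read_bpm(path):
    """Run `acidcat PATH -f json` and return the BPM (requires acidcat on PATH)."""
    result = subprocess.run(
        ["acidcat", path, "-f", "json"],
        capture_output=True,
        text=True,
        check=True,
    )
    return bpm_from_json(result.stdout)
```

read_bpm("kick_808.wav") would return a float, or None for a file with no BPM metadata.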

Supported Formats

Format              Extension                         What acidcat reads
WAV                 .wav                              BPM, key, loop points, beats, ACID/SMPL chunks, LIST/INFO
AIFF                .aif                              Duration, format, name, author, copyright, markers
MP3                 .mp3                              BPM, key, title, artist, album, genre, comment (ID3v2)
FLAC                .flac                             BPM, key, title, artist, album, genre (Vorbis Comment)
OGG                 .ogg                              BPM, key, title, artist, album, genre (Vorbis Comment)
Opus                .opus                             BPM, key, title, artist (Vorbis Comment)
M4A                 .m4a                              BPM, key, title, artist, album, genre (iTunes atoms)
MIDI                .mid                              BPM, key sig, time sig, tracks, note count/range
Serum               .SerumPreset                      Preset name, author, tags, description
Bitwig              .bwpreset, .bwclip                Device tree, parameters, clip notes (inspect + index)
Native Instruments  .nmsv, .nabs, .ksd, .nksf, .nki   Preset metadata, NKS tags, FastLZ subtree (inspect + index)
Vital               .vital                            Patch name, author, tags, modulation matrix (inspect + index)
NCW                 .ncw                              NI Compressed Wave header, channel/block info (inspect)
MP4                 .mp4                              Box tree, codec info, iTunes tags (inspect)

Commands

Command                                 Description
acidcat FILE                            Show metadata for a single file (auto-detected)
acidcat DIR                             Batch-scan a directory (auto-detected)
acidcat -                               Read from stdin
acidcat info FILE                       Explicit single-file metadata dump
acidcat scan DIR                        Batch-scan with CSV output
acidcat chunks FILE                     Walk RIFF chunks -- offsets, sizes, parsed fields
acidcat survey DIR                      Count chunk types across a directory tree
acidcat detect FILE|DIR                 Estimate BPM/key using librosa
acidcat features DIR                    Extract 50+ audio features for ML
acidcat similar CSV find TARGET         Find similar samples by features
acidcat similar CSV cluster             Cluster samples by audio characteristics
acidcat search CSV query TEXT           Text-based sample search (legacy CSV)
acidcat dump FILE CHUNK [...]           Hex-dump specific RIFF chunks
acidcat inspect FILE... [--hex] [--frames] [--only/--exclude IDS] [--full] [--pretty] [--color]
                                        readelf-style structural dump with lint warnings (WAV, RF64, AIFF, MIDI, Serum, MP3, FLAC, OGG, MP4/M4A, Bitwig, Vital, NCW, Native Instruments (Massive/Absynth/Kontakt/NKS/KORE)).
                                        Takes multiple files (each under a File: banner; JSON becomes NDJSON).
                                        --frames          per-frame/event dump
                                        --only/--exclude  select chunks
                                        --hex             raw bytes
                                        --full            self-contained JSON dump for build_explorer.py
                                        --pretty          human-friendly metadata view
                                        --verbose         deep deconstruction (Bitwig device tree/parameters/notes, Vital modulation matrix, ...)
                                        --color           syntax-highlight the output
acidcat index DIR                       Upsert DIR into the global SQLite index
acidcat query [flags]                   Filter the global index by bpm/key/tag/text
acidcat convert clip.bwclip -o out.mid  Export a DAW clip's notes to a Standard MIDI File
acidcat write FILE --set field=value    Edit metadata in place (exiftool-style: _original backup, -o copy, --dry-run; custom frames via txxx:NAME=value)
acidcat cover FILE [-o art.jpg] [--set img] [--remove]
                                        Extract, embed, or remove embedded cover art (MP3/FLAC/MP4/Ogg)
acidcat explore FILE [-o out.html]      Build a standalone interactive HTML byte-explorer (hex grid + tinted fields + LSB heat-map)

Global Flags

-f, --format FMT                Output format (default varies by command)
-o, --output FILE               Write output to file
-q, --quiet                     Suppress progress output
-v, --verbose                   Extra detail
-n, --num N                     Max files to scan (default: 500)
--has CHUNKS                    Filter by chunk IDs (comma-separated)
--deep                          Include librosa analysis

Most commands accept table, json, and csv (default table, but scan and features default to csv). Two differ: inspect is table/json, and dump is hex/json.

Dependency Groups

Group       What it adds                      Commands enabled
(none)      mutagen (base)                    info, scan, chunks, survey, dump for WAV/AIFF/MIDI/Serum/MP3/FLAC/OGG/Opus/M4A
[analysis]  librosa, numpy, scipy, soundfile  detect, info --deep
[ml]        + pandas, scikit-learn            features, similar, search
[viz]       + matplotlib, seaborn             optional plotting
[notebook]  + jupyter, ipykernel              optional notebook env
[mcp]       mcp SDK                           acidcat-mcp stdio server
[mcp-http]  starlette + uvicorn               acidcat-mcp --transport http (streamable-HTTP transport)
[all]       everything (includes [mcp-http])  all commands, all formats

Examples

Metadata Exploration

# what chunks exist in your sample library?
acidcat survey ~/Samples/Loops -n 5000

# walk all chunks in a specific file
acidcat chunks ~/Samples/Loops/breakbeat.wav

# hex-dump the ACID and SMPL chunks
acidcat dump ~/Samples/Loops/breakbeat.wav acid smpl

# scan only files with ACID metadata
acidcat scan ~/Samples/Loops --has acid -n 200

# scan a directory with mixed formats (WAV, MP3, FLAC, etc.)
acidcat scan ~/Samples -n 500
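
For readers curious what a chunk walk involves, here is a minimal sketch of walking the top-level chunks of a RIFF/WAVE file using only the standard library. It illustrates the RIFF layout (4-byte ID, 4-byte little-endian size, payload padded to an even length) and is not acidcat's own implementation; the function names are hypothetical, and RF64 and nested LIST chunks are not handled.

```python
import struct


def walk_riff(data: bytes):
    """Yield (chunk_id, offset, size) for each top-level chunk in RIFF bytes."""
    if len(data) < 12 or data[:4] != b"RIFF":
        raise ValueError("not a RIFF file")
    pos = 12  # skip 'RIFF', the 4-byte file size, and the form type ('WAVE')
    while pos + 8 <= len(data):
        chunk_id, size = struct.unpack_from("<4sI", data, pos)
        yield chunk_id.decode("ascii", "replace"), pos, size
        pos += 8 + size + (size & 1)  # payloads are padded to an even length


def make_chunk(chunk_id: bytes, payload: bytes) -> bytes:
    """Build one chunk, adding the pad byte when the payload length is odd."""
    pad = b"\x00" if len(payload) % 2 else b""
    return chunk_id + struct.pack("<I", len(payload)) + payload + pad


def build_demo_wav() -> bytes:
    """Assemble a tiny synthetic RIFF/WAVE blob with placeholder payloads."""
    body = (
        b"WAVE"
        + make_chunk(b"fmt ", b"\x00" * 16)
        + make_chunk(b"data", b"\x01\x02\x03")  # odd length, so one pad byte
        + make_chunk(b"acid", b"\x00" * 24)
    )
    return b"RIFF" + struct.pack("<I", len(body)) + body


for chunk_id, offset, size in walk_riff(build_demo_wav()):
    print(f"{chunk_id!r:8} offset={offset:<4} size={size}")
```

The demo builds its input in memory, so it runs without any sample files; pass the bytes of a real .wav file to walk_riff to try it on your own data.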

BPM / Key Detection

# estimate BPM/key with librosa (for files without metadata)
acidcat detect ~/Samples/OneShots

# scan with librosa fallback for missing metadata
acidcat scan ~/Samples/Loops --fallback -n 100

ML Feature Extraction

# extract 50+ audio features to CSV
acidcat features ~/Samples/Loops -n 500

# generate normalized (StandardScaler) ML-ready dataset
acidcat features ~/Samples/Loops --ml-ready -n 500
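
As a rough illustration of what StandardScaler-style normalization does to a feature table, the sketch below rescales each column to zero mean and unit variance with the standard library. It mirrors scikit-learn's convention (population standard deviation, constant columns left at zero) and is not the code acidcat runs; standardize is a hypothetical name.

```python
from statistics import fmean, pstdev


def standardize(rows):
    """Scale each column of a list-of-lists to zero mean and unit variance."""
    columns = list(zip(*rows))
    means = [fmean(col) for col in columns]
    # A constant column has zero spread; divide by 1.0 instead to avoid /0.
    stds = [pstdev(col) or 1.0 for col in columns]
    return [
        [(value - mean) / std for value, mean, std in zip(row, means, stds)]
        for row in rows
    ]


# Two made-up features per sample: a BPM value and a constant column.
scaled = standardize([[120.0, 1.0], [130.0, 1.0], [140.0, 1.0]])
```

Standardizing matters for distance-based steps such as similarity search and k-means, where a feature with a large numeric range would otherwise dominate.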

Similarity & Clustering

# find 5 samples similar to index 0
acidcat similar features.csv find 0 -n 5

# k-means clustering
acidcat similar features.csv cluster -k 10 -o clustered.csv
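
Conceptually, "find similar" ranks every other row of the feature table by its closeness to the target row. The sketch below assumes cosine similarity as the metric, which is what the MCP section mentions for find_similar; the CLI may use a different measure, and the function names here are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def most_similar(features, target, n=5):
    """Return the indexes of the n rows most similar to features[target]."""
    scores = [
        (cosine_similarity(features[target], vec), index)
        for index, vec in enumerate(features)
        if index != target
    ]
    # Highest similarity first; ties are broken by the lower row index.
    scores.sort(key=lambda item: (-item[0], item[1]))
    return [index for _, index in scores[:n]]
```

With rows already standardized, most_similar(features, 0, n=5) plays the same role as the "find 0 -n 5" call above.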

Libraries (per-directory indexes)

acidcat scan writes a one-off CSV. acidcat index is the persistent path: each directory you index becomes a library with its own SQLite file, and a small global registry at ~/.acidcat/registry.db lets reads fan out across every library you have registered.

By default the per-library DB lives centrally at ~/.acidcat/libraries/<label>_<hash>.db. Pass --in-tree if you'd rather have the DB travel with the data at <library>/.acidcat/index.db.

# register and index a library (label defaults to basename of DIR)
acidcat index ~/Samples/Loops --label loops
acidcat index ~/Samples/OneShots --label oneshots

# show every registered library
acidcat index --list

# per-library stats
acidcat index --stats loops

# extract librosa features during indexing (slower, enables similarity)
acidcat index ~/Samples/Loops --label loops --features

# rebuild a library's DB from scratch
acidcat index ~/Samples/Loops --label loops --rebuild

# forget a library (registry only) vs remove it (deletes the DB file)
acidcat index --forget loops
acidcat index --remove loops

# list registered libraries whose DB file is missing on disk
acidcat index --orphans

# import a legacy <name>_tags.json into a library
acidcat index ~/Samples --label samples --import-tags old_tags.json

Nested libraries are rejected at registration time: if you've registered ~/Samples, you can't also register ~/Samples/Loops until you forget the parent.
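
The nesting rule can be pictured as a simple ancestor check on paths. This is a sketch only: it assumes paths are already absolute and normalized, and it also rejects registering a parent of an existing library, which is an assumption the text above does not confirm. nesting_conflict is a hypothetical helper name.

```python
from pathlib import PurePosixPath


def nesting_conflict(new_root, registered_roots):
    """Return the registered root that nests with new_root, or None."""
    new = PurePosixPath(new_root)
    for root in registered_roots:
        existing = PurePosixPath(root)
        # Conflict if the new path sits inside an existing library...
        if existing in new.parents:
            return root
        # ...or (assumed) if an existing library sits inside the new path.
        if new in existing.parents:
            return root
    return None
```

nesting_conflict("/home/u/Samples/Loops", ["/home/u/Samples"]) returns "/home/u/Samples", while an unrelated directory returns None.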

Discovery

For users with many scattered packs, --discover walks a tree and registers every qualifying subdirectory as its own library in one pass.

# preview what would get registered (no writes)
acidcat index --discover ~/Samples --dry-run

# actually register them
acidcat index --discover ~/Samples

# tighter threshold and namespacing for a subset of your collection
acidcat index --discover /mnt/external/old_drives \
              --min-samples 50 --label-prefix "ext_"

A directory qualifies if its subtree (within --max-depth, default 3) contains at least --min-samples audio files (default 20). Non-qualifying parents are recursed into so packs nested inside catch-all folders still surface. Already-registered roots are skipped. The home directory is refused as a discover root to prevent runaway registration.
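
The qualification rule can be sketched as a recursive walk. Everything below is an illustration under stated assumptions, not acidcat's code: the extension list is made up, depth is measured from each candidate directory, only subdirectories of the root are candidates, and the function names are hypothetical.

```python
import os
import tempfile

# Assumption: illustrative extension list, not acidcat's actual set.
AUDIO_EXTS = {".wav", ".aif", ".aiff", ".mp3", ".flac", ".ogg", ".opus", ".m4a"}


def count_audio(path, max_depth):
    """Count audio files under path, descending at most max_depth levels."""
    total = 0
    with os.scandir(path) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                if max_depth > 0:
                    total += count_audio(entry.path, max_depth - 1)
            elif entry.is_file():
                if os.path.splitext(entry.name)[1].lower() in AUDIO_EXTS:
                    total += 1
    return total


def _walk(root, min_samples, max_depth, registered):
    found = []
    with os.scandir(root) as entries:
        subdirs = sorted(e.path for e in entries if e.is_dir(follow_symlinks=False))
    for sub in subdirs:
        if sub in registered:
            continue  # already-registered roots are skipped
        if count_audio(sub, max_depth) >= min_samples:
            found.append(sub)  # qualifies as its own library
        else:
            # Non-qualifying parent: look for packs nested inside it.
            found.extend(_walk(sub, min_samples, max_depth, registered))
    return found


def discover_libraries(root, min_samples=20, max_depth=3, registered=()):
    """Return the directories under root that would be registered."""
    home = os.path.realpath(os.path.expanduser("~"))
    if os.path.realpath(root) == home:
        raise ValueError("refusing to discover from the home directory")
    return _walk(root, min_samples, max_depth, set(registered))


def demo():
    """Build a throwaway tree and report which directories qualify."""
    with tempfile.TemporaryDirectory() as base:
        for rel, n in (("packA", 3), ("misc/packB", 3), ("misc/tiny", 1)):
            os.makedirs(os.path.join(base, rel))
            for i in range(n):
                open(os.path.join(base, rel, f"hit{i}.wav"), "wb").close()
        found = discover_libraries(base, min_samples=3, max_depth=0)
        return [os.path.relpath(p, base).replace(os.sep, "/") for p in found]
```

In the demo, misc holds no audio files of its own at depth 0, so it does not qualify and the walk recurses into it; packB then surfaces on its own while tiny falls below the threshold.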

Querying

By default acidcat query fans out across every registered library and merges the results.

acidcat query --bpm 120:130 --key Am
acidcat query --tag drums --tag punchy --duration :1
acidcat query --text "dusty lofi" --limit 20
acidcat query --format mp3 --root loops
acidcat query --root loops,oneshots --bpm 128
acidcat query --bpm 128 --paths-only | xargs -I {} cp {} out/

--root accepts a label, an absolute path, or a comma-separated list. Override the registry on any command with --registry PATH or the ACIDCAT_REGISTRY environment variable.
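
The range syntax used by --bpm and --duration above (120:130, :1, 128) can be read as "low:high" with either side optional. The sketch below shows one plausible interpretation; inclusive bounds and exact matching for a bare number are assumptions, and parse_range / in_range are hypothetical names rather than acidcat functions.

```python
def parse_range(spec):
    """Parse 'LO:HI', 'LO:', ':HI', or 'N' into (low, high); None = unbounded."""
    if ":" in spec:
        low, high = spec.split(":", 1)
        return (float(low) if low else None, float(high) if high else None)
    value = float(spec)
    return (value, value)  # assumption: a bare number means an exact match


def in_range(value, bounds):
    """True if value falls inside the (low, high) bounds, inclusive."""
    low, high = bounds
    return (low is None or value >= low) and (high is None or value <= high)
```

Under this reading, --duration :1 keeps samples of at most one second, which matches the one-shot style query shown above.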

MCP Server

acidcat-mcp is a stdio MCP server that exposes the registered libraries as structured tools. An LLM can ask "what libraries do I have?", search across them by metadata, find compatible keys via Camelot, or (with [analysis] installed) find similar samples by cosine similarity over librosa features.

pip install -e ".[mcp]"            # minimum for discovery + writes
pip install -e ".[analysis,mcp]"   # unlock find_similar / analyze_*

Claude Desktop / Claude Code config:

{
  "mcpServers": {
    "acidcat": {
      "command": "acidcat-mcp"
    }
  }
}

Optional: pass --registry PATH on the server process or set ACIDCAT_REGISTRY if your registry lives outside the default location.
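
If you generate the client config from a script, the --registry option can be passed through the entry's "args" list, which is the field Claude Desktop style configs typically use for command-line arguments. A minimal sketch; the mcp_config helper and the example path are hypothetical.

```python
import json


def mcp_config(registry_path=None):
    """Build an mcpServers config dict for acidcat-mcp."""
    server = {"command": "acidcat-mcp"}
    if registry_path:
        # Assumption: the client forwards "args" to the server process.
        server["args"] = ["--registry", registry_path]
    return {"mcpServers": {"acidcat": server}}


# Hypothetical registry location, for illustration only.
print(json.dumps(mcp_config("/data/acidcat/registry.db"), indent=2))
```

Calling mcp_config() with no argument reproduces the config shown above.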

Tool tiers (each tool description starts with Fast., SLOW., or VERY SLOW. so the model self-selects):

  • Fast (SQLite only): search_samples, get_sample, locate_sample, list_libraries, list_tags, list_keys, list_formats, index_stats, find_compatible
  • Slow analysis (needs [analysis]): find_similar, analyze_sample, detect_bpm_key
  • Index management: reindex, reindex_features, discover_libraries
  • Write (marked destructive): register_library, forget_library, tag_sample, set_sample_description
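
For context on find_compatible, Camelot matching usually treats a key as compatible with itself, its two neighbours on the wheel, and its relative major or minor. The sketch below implements that common rule with one spelling per key; the exact rules and enharmonic handling inside acidcat may differ, and compatible_keys is a hypothetical name.

```python
# Standard Camelot wheel, one spelling per key (e.g. Abm rather than G#m).
CAMELOT = {
    "Abm": "1A", "B": "1B", "Ebm": "2A", "F#": "2B",
    "Bbm": "3A", "Db": "3B", "Fm": "4A", "Ab": "4B",
    "Cm": "5A", "Eb": "5B", "Gm": "6A", "Bb": "6B",
    "Dm": "7A", "F": "7B", "Am": "8A", "C": "8B",
    "Em": "9A", "G": "9B", "Bm": "10A", "D": "10B",
    "F#m": "11A", "A": "11B", "C#m": "12A", "E": "12B",
}
BY_CODE = {code: key for key, code in CAMELOT.items()}


def compatible_keys(key):
    """Return the key itself, its wheel neighbours, and its relative key."""
    code = CAMELOT[key]
    number, letter = int(code[:-1]), code[-1]
    down = (number - 2) % 12 + 1  # one step back, wrapping 1 -> 12
    up = number % 12 + 1          # one step forward, wrapping 12 -> 1
    other = "B" if letter == "A" else "A"
    codes = [code, f"{down}{letter}", f"{up}{letter}", f"{number}{other}"]
    return [BY_CODE[c] for c in codes]
```

For example, compatible_keys("Am") gives Am, Dm, Em, and C, the keys a query like --key Am could be widened to when looking for harmonically safe matches.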

License

MIT
