Fréchet Audio Distance evaluation in PyTorch

fad_pytorch

Original FAD paper (PDF)

Install

pip install fad_pytorch

Features:

  • runs in parallel on multiple processors and multiple GPUs (via accelerate)
  • supports multiple embedding methods:
    • VGGish and PANN, both mono @ 16kHz
    • OpenL3 and (LAION-)CLAP, stereo @ 48kHz
  • uses publicly available pretrained checkpoints trained on music (and other sources) for those models. (If you want speech checkpoints, submit a PR or an Issue; I don’t do speech.)
  • favors ops in PyTorch rather than numpy (or tensorflow)
  • fad_gen supports local data read or WebDataset (audio data stored in S3 buckets)
  • runs on CPU, CUDA, or MPS

Instructions:

This is designed to be run as 3 command-line scripts in succession. The latter two (fad_embed and fad_score) are probably what most people will want:

  1. fad_gen: produces directories of real & fake audio (given real data). See fad_gen documentation for calling sequence.
  2. fad_embed [options] <real_audio_dir> <fake_audio_dir>: produces directories of embeddings of real & fake audio
  3. fad_score [options] <real_emb_dir> <fake_emb_dir>: reads the embeddings & generates FAD score, for real (“$r$”) and fake (“$f$”):

$$ \mathrm{FAD} = \lVert \mu_r - \mu_f \rVert^2 + \mathrm{tr}\!\left(\Sigma_r + \Sigma_f - 2\sqrt{\Sigma_r \Sigma_f}\right) $$
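For reference, the score in step 3 is just the Fréchet distance between two Gaussians fit to the embeddings. A minimal NumPy sketch of that formula (illustrative only; the package itself favors PyTorch ops, and this `fad` function is not part of its API):

```python
import numpy as np

def fad(mu_r, sigma_r, mu_f, sigma_f):
    """Frechet distance between Gaussians N(mu_r, sigma_r) and N(mu_f, sigma_f)."""
    diff = mu_r - mu_f
    # tr(sqrt(Sigma_r @ Sigma_f)) equals the sum of the square roots
    # of the eigenvalues of Sigma_r @ Sigma_f
    eigvals = np.linalg.eigvals(sigma_r @ sigma_f)
    tr_sqrt = np.sqrt(np.abs(eigvals.real)).sum()  # abs() guards tiny numerical negatives
    return float(diff @ diff + np.trace(sigma_r) + np.trace(sigma_f) - 2.0 * tr_sqrt)
```

Identical moments give a score of zero; shifting the fake mean by one unit per dimension adds one unit per dimension to the score.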

Documentation

See the Documentation Website.

Comments / FAQ / Troubleshooting

  • “RuntimeError: CUDA error: invalid device ordinal”: This happens when you have a “bad node” on an AWS cluster. Haven’t yet figured out what causes it or how to fix it. Workaround: Just add the current node to your SLURM --exclude list, exit, and retry. Note: it may take as many as 5 to 7 retries before you get a “good node”.
  • “FAD scores obtained from different embedding methods are wildly different!” …Yea. It’s not obvious that scores from different embedding methods should be comparable. Rather, compare different groups of audio files using the same embedding method, and/or check that FAD scores go down as similarity improves.
  • “FAD score for the same dataset repeated (twice) is not exactly zero!” …Yea. There seems to be an uncertainty of around ±0.008. I’d say, don’t quote any numbers past the first decimal point.
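That nonzero repeat-score is expected: the two runs estimate the mean and covariance from finite samples of embeddings, so the fitted moments never match exactly. A self-contained sketch of the effect (illustrative, not the package’s code):

```python
import numpy as np

rng = np.random.default_rng(0)

def gaussian_stats(x):
    """Mean and covariance of an (n_samples, dim) embedding matrix."""
    return x.mean(axis=0), np.cov(x, rowvar=False)

def frechet(mu1, s1, mu2, s2):
    """Frechet distance between two Gaussians, via eigenvalues of s1 @ s2."""
    diff = mu1 - mu2
    eigvals = np.linalg.eigvals(s1 @ s2)
    return float(diff @ diff + np.trace(s1) + np.trace(s2)
                 - 2.0 * np.sqrt(np.abs(eigvals.real)).sum())

# Two independent draws from the *same* distribution:
a = rng.standard_normal((2000, 8))
b = rng.standard_normal((2000, 8))
print(frechet(*gaussian_stats(a), *gaussian_stats(b)))  # small but nonzero
```

The score shrinks as the sample count grows, which is why only the leading digits of a FAD score are meaningful.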

Contributing

This repo is still fairly “bare bones” and will benefit from more documentation and features as time goes on. Note that it is written using nbdev, so the things to do are:

  1. Fork this repo
  2. Clone your fork to your (local) machine
  3. Install nbdev: python3 -m pip install -U nbdev
  4. Make changes by editing the notebooks in nbs/, not the .py files in fad_pytorch/.
  5. Run nbdev_export to export notebook changes to .py files
  6. For good measure, run nbdev_install_hooks and nbdev_clean, especially if you’ve added any notebooks.
  7. Do a git status to see all the .ipynb and .py files that need to be added & committed
  8. git add those files and then git commit, and then git push
  9. Take a look in your fork’s GitHub Actions tab, and see if the “test” and “deploy” CI runs finish properly (green light) or fail (red light)
  10. Once you get green lights, send in a Pull Request!

Feel free to ask me for tips with nbdev; it has quite a learning curve. You can also ask on the fast.ai forums and/or the fast.ai Discord.

Citations / Blame / Disclaimer

This repo is 2 weeks old. I’m not ready for this to be cited in your papers. I’d hate for there to be some mistake I haven’t found yet. Perhaps a later version will have citation info. For now, instead, there’s:

Disclaimer: Results from this repo are still a work in progress. While every effort has been made to test model outputs, the author takes no responsibility for mistakes. If you want to double-check via another source, see “Related Repos” below.

Related Repos

There are several others, but this one is mine. These repos didn’t have all the features I wanted, but I used them for inspiration:
