
This package aims at simplifying the download of the AudioCaps dataset.


AudioCaps Download

DISCLAIMER: This repository is a modified version of the AudioSet Download repository.

This repository contains code for downloading the AudioCaps dataset. The repository is not officially affiliated with the AudioCaps dataset.

Requirements

  • Python 3.9 (other versions may work, but they have not been tested)

Installation

# Install ffmpeg
sudo apt install ffmpeg
# Install audiocaps-download
pip install audiocaps-download
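
Before starting a long download, it can help to confirm that ffmpeg is actually reachable from Python. The following is a minimal sketch using only the standard library; the helper name missing_requirements is hypothetical and is not part of audiocaps-download.

```python
import shutil
import sys


def missing_requirements(tools=("ffmpeg",)):
    """Return the names of command-line tools that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    # The package was tested with Python 3.9; print the running version for reference.
    print("Python version:", ".".join(map(str, sys.version_info[:3])))
    missing = missing_requirements()
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All required tools found.")
```

If ffmpeg is reported as missing, install it with the command above (or your platform's package manager) before running the downloader.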

Usage

The following code snippet downloads the complete dataset in WAV format and stores it in the audiocaps/ directory.

from audiocaps_download import Downloader
d = Downloader(root_path='audiocaps/', n_jobs=16)
d.download(format='wav')  # cross-checks the files against the CSV files in the original repository

Download updated Dec 2023

As of December 2023, the repository has been used to download the dataset in WAV format. The following table shows the number of files downloaded.

Split       Files (using this repo)   Files (original)   Difference
Training    45187                     49838              4651
Validation  2220                      2475               255
Test        4415                      4875               460

The missing files may be due to YouTube videos that are no longer available. Please open an issue in the repository if you find a bug.
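
To put the counts above in perspective, the short sketch below recomputes the number of missing files per split and the share of the original dataset that was retrieved. The numbers are copied from the table; the function name summarize is only illustrative.

```python
# File counts copied from the table: (downloaded with this repo, original).
counts = {
    "Training": (45187, 49838),
    "Validation": (2220, 2475),
    "Test": (4415, 4875),
}


def summarize(counts):
    """Return {split: (missing_files, percent_retained)} for each split."""
    summary = {}
    for split, (downloaded, original) in counts.items():
        missing = original - downloaded
        retained = 100.0 * downloaded / original
        summary[split] = (missing, retained)
    return summary


for split, (missing, retained) in summarize(counts).items():
    print(f"{split}: {missing} missing, {retained:.1f}% retained")
```

Each split retains roughly 90% of its original files.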

Implementation

The main class is audiocaps_download.Downloader. It is initialized using the following parameters:

  • root_path: the path to the directory where the dataset will be downloaded.
  • n_jobs: the number of parallel downloads. Default is 1.
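
To illustrate what a parallel-download setting such as n_jobs controls, here is a minimal sketch that runs a per-clip function over several IDs with a bounded number of workers. It uses the standard-library thread pool as an assumption; it does not show how Downloader dispatches its jobs internally, and fetch_clip and fetch_all are hypothetical stand-ins.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_clip(youtube_id):
    """Hypothetical stand-in for downloading one clip; returns a status string."""
    return f"{youtube_id}: done"


def fetch_all(youtube_ids, n_jobs=1):
    """Run fetch_clip over all IDs with at most n_jobs concurrent workers."""
    with ThreadPoolExecutor(max_workers=n_jobs) as pool:
        # map() returns results in input order, regardless of completion order.
        return list(pool.map(fetch_clip, youtube_ids))


print(fetch_all(["id_a", "id_b", "id_c"], n_jobs=2))
```

Since each download is dominated by network and ffmpeg time rather than Python computation, raising the worker count generally shortens the total run until bandwidth or rate limits become the bottleneck.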

The methods of the class are:

  • download(format='vorbis', quality=5): downloads the dataset.
    • format: the audio format, one of those supported by the yt-dlp --audio-format parameter:
      • vorbis: downloads the dataset in Ogg Vorbis format. This is the default.
      • wav: downloads the dataset in WAV format.
      • mp3: downloads the dataset in MP3 format.
      • m4a: downloads the dataset in M4A format.
      • flac: downloads the dataset in FLAC format.
      • opus: downloads the dataset in Opus format.
      • webm: downloads the dataset in WebM format.
      • ... and many more.
    • quality: an integer between 0 and 10. Default is 5.
  • load_dataset(): reads the csv files from the original repository. It is not used externally.
  • download_file(...): downloads a single file. It is not used externally.
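
To make the format and quality parameters more concrete, the sketch below builds a yt-dlp command line that extracts a single audio segment. This is an assumption about how such a download could be expressed, not the package's actual download_file implementation: the function name, the output file naming, and the 10-second clip length are all hypothetical choices for illustration. The yt-dlp options used (-x, --audio-format, --audio-quality, --download-sections, -o) are real.

```python
def build_ytdlp_command(youtube_id, start_time, out_dir,
                        audio_format="vorbis", quality=5, clip_seconds=10):
    """Build (but do not run) a yt-dlp command for one audio segment.

    Hypothetical helper: the naming scheme and clip length are assumptions.
    """
    end_time = start_time + clip_seconds
    return [
        "yt-dlp",
        "-x",                                   # extract audio only
        "--audio-format", audio_format,         # e.g. vorbis, wav, mp3, flac
        "--audio-quality", str(quality),        # integer between 0 and 10
        "--download-sections", f"*{start_time}-{end_time}",  # time range in seconds
        "-o", f"{out_dir}/{youtube_id}_{start_time}.%(ext)s",
        f"https://www.youtube.com/watch?v={youtube_id}",
    ]


# VIDEO_ID is a placeholder, not a real clip.
print(" ".join(build_ytdlp_command("VIDEO_ID", 30, "audiocaps", audio_format="wav")))
```

The resulting list can be passed to subprocess.run once yt-dlp and ffmpeg are installed; building the command separately makes it easy to inspect before anything is downloaded.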


