Pandas Audio Methods

Audio methods for pandas dataframes using soundfile.

Features:

  • Use sf.SoundFile objects in pandas dataframes
  • Call sf.SoundFile methods on a column, for example:
    • .read()
    • .truncate()
    • .seek()
  • Save dataframes with sf.SoundFile objects to Parquet
  • Process audio files in parallel with Dask
  • Manipulate audio datasets from Hugging Face

Installation

pip install pandas-audio-methods

Usage

You can open audio files as sf.SoundFile objects using the .open() method.

Once the files are open, you can call any sf.SoundFile method on the column:

import pandas as pd
from pandas_audio_methods import SFMethods

pd.api.extensions.register_series_accessor("sf")(SFMethods)

df = pd.DataFrame({"file_path": ["path/to/audio.wav"]})
df["audio"] = df["file_path"].sf.open()
# 0    SoundFile('path/to/audio.wav', mode='r', sampl...
# Name: audio, dtype: object, soundfile methods enabled
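
To illustrate the general idea of calling an object's methods on a whole column, here is a minimal sketch of a pandas Series accessor that forwards method calls to each element. It is not the implementation of SFMethods: the ForwardingAccessor class and the "demo" accessor name are hypothetical, and io.BytesIO objects stand in for sf.SoundFile objects so the sketch runs without any audio file.

```python
import io

import pandas as pd


class ForwardingAccessor:
    """Hypothetical accessor: forwards a method call to every element of the Series."""

    def __init__(self, series):
        self._series = series

    def __getattr__(self, name):
        # Only forward public names; private lookups fail normally
        if name.startswith("_"):
            raise AttributeError(name)

        def call_on_each(*args, **kwargs):
            # Call the method `name` on each object and collect the results
            return self._series.apply(lambda obj: getattr(obj, name)(*args, **kwargs))

        return call_on_each


# "demo" is an accessor name made up for this sketch
pd.api.extensions.register_series_accessor("demo")(ForwardingAccessor)

# BytesIO objects stand in for opened audio files
streams = pd.Series([io.BytesIO(b"abc"), io.BytesIO(b"defg")], dtype=object)

first_two = streams.demo.read(2)  # calls .read(2) on each stream
positions = streams.demo.tell()   # calls .tell() on each stream
print(first_two.tolist(), positions.tolist())
```

With an accessor like this, streams.demo.seek(0) would rewind every stream in the column, much as .seek() is listed among the column-level methods above.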

Use with librosa:

import librosa
df["audio"] = [librosa.load(audio, sr=16_000) for audio in df["audio"]]
df["audio"] = df["audio"].sf.write()
# 0    SoundFile(<_io.BytesIO object at 0x11b747ba0>,...
# Name: audio, dtype: object, soundfile methods enabled

Here is how to enable sf methods for sf.SoundFiles created manually:

import soundfile as sf

df = pd.DataFrame({"audio": [sf.SoundFile("path/to/audio.wav")]})
df["audio"] = df["audio"].sf.enable()
# 0    SoundFile('path/to/audio.wav', mode='r', sampl...
# Name: audio, dtype: object, soundfile methods enabled

Save

You can save a dataset of sf.SoundFiles to Parquet:

# Save
df = pd.DataFrame({"file_path": ["path/to/audio.wav"]})
df["audio"] = df["file_path"].sf.open()
df.to_parquet("data.parquet")

# Later
df = pd.read_parquet("data.parquet")
df["audio"] = df["audio"].sf.enable()

This doesn't just save the paths to the audio files, but the actual audio data itself!

Under the hood, it saves dictionaries of {"bytes": <bytes of the audio file>, "path": <path or name of the audio file>}. The audio is saved as bytes in its original audio encoding when one is available, and in a default format otherwise. Anyone can load the Parquet data even without pandas-audio-methods, since it doesn't rely on extension types.

Note: if you created the sf.SoundFiles manually, remember to call .sf.enable() on the column first; saving to Parquet requires it.
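
As a rough illustration of reading such a record without pandas-audio-methods, the sketch below builds one {"bytes", "path"} dictionary by hand and decodes it with Python's standard wave module. It assumes the bytes are WAV-encoded; the tone, the file name and the helper functions are made up for the example, and other encodings would need a decoder such as soundfile instead.

```python
import io
import math
import struct
import wave


def make_wav_bytes(sample_rate=8000, n_frames=800):
    """Synthesize a short mono 16-bit sine tone and return it as WAV bytes."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 16-bit samples
        wav.setframerate(sample_rate)
        samples = (
            int(10_000 * math.sin(2 * math.pi * 440 * i / sample_rate))
            for i in range(n_frames)
        )
        wav.writeframes(b"".join(struct.pack("<h", s) for s in samples))
    return buffer.getvalue()


def describe(record):
    """Decode a {"bytes", "path"} record, assuming the bytes are WAV-encoded."""
    with wave.open(io.BytesIO(record["bytes"]), "rb") as wav:
        return {
            "path": record["path"],
            "sample_rate": wav.getframerate(),
            "channels": wav.getnchannels(),
            "duration_s": wav.getnframes() / wav.getframerate(),
        }


# Same shape as the dictionaries described above; the values are made up
record = {"bytes": make_wav_bytes(), "path": "tone.wav"}

info = describe(record)
print(info)
```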

Run in parallel

Dask DataFrame parallelizes pandas to handle large datasets. It enables faster local processing with multiprocessing as well as distributed, large-scale processing. Dask mimics the pandas API:

import dask.dataframe as dd
from distributed import Client
from pandas_audio_methods import SFMethods

dd.extensions.register_series_accessor("sf")(SFMethods)

if __name__ == "__main__":
    client = Client()
    df = dd.read_csv("path/to/large/dataset.csv")
    df = df.repartition(npartitions=1000)  # divide the processing into 1000 jobs
    df["audio"] = df["file_path"].sf.open()
    df["audio"].head(1)
    # 0    SoundFile('path/to/audio.wav', mode='r', sampl...
    # Name: audio, dtype: object, soundfile methods enabled
    df.to_parquet("data_folder")
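
To make the idea of partitioned processing concrete, here is a small standard-library sketch that splits a list of rows into partitions and runs one job per partition. It uses a thread pool rather than Dask, and the helper functions, file paths and the placeholder job are all hypothetical; it only shows the shape of the computation, not how Dask schedules it.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_partitions(rows, npartitions):
    """Split a list of rows into at most `npartitions` contiguous chunks."""
    size = max(1, -(-len(rows) // npartitions))  # ceiling division
    return [rows[i:i + size] for i in range(0, len(rows), size)]


def process_partition(paths):
    """Placeholder per-partition job: here it only measures path lengths."""
    return [len(path) for path in paths]


# Hypothetical file paths standing in for the "file_path" column
file_paths = [f"audio_{i}.wav" for i in range(10)]
partitions = split_into_partitions(file_paths, npartitions=4)

with ThreadPoolExecutor(max_workers=4) as pool:
    # map() returns results in the same order as the partitions
    results = list(pool.map(process_partition, partitions))

# Flatten the per-partition results back into one list
lengths = [n for part in results for n in part]
print(len(partitions), lengths)
```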

Hugging Face support

Most audio datasets in Parquet format on Hugging Face are compatible with pandas-audio-methods. For example, you can load the microset of the People's Speech dataset:

df = pd.read_parquet("hf://datasets/MLCommons/peoples_speech/microset/train-00000-of-00001.parquet")
df["audio"] = df["audio"].sf.enable()

You can also use the datasets library. Here is an example with the jlvdoorn/atco2-asr dataset for automatic speech recognition:

from datasets import load_dataset

df = load_dataset("jlvdoorn/atco2-asr", split="train").to_pandas()
df["audio"] = df["audio"].sf.enable()

Datasets created with pandas-audio-methods and saved to Parquet are also compatible with the Dataset Viewer on Hugging Face and the datasets library:

df.to_parquet("hf://datasets/username/dataset_name/train.parquet")
