End-to-end audio with fastai

fastproaudio

Idea behind this repo

fastaudio focuses on spectrograms. fastai use cases tend to focus on classification. We need to go beyond those. Instead we'll focus on two things:

  1. Autoregressive prediction in the time domain. We'll use an LSTM, essentially adapting the language model lessons.

  2. Audio-to-audio processing/translation (e.g. audio effects). We'll use stacked 1D convolutions, like a U-Net.

(You probably noticed already that task #1 is a special case of task #2: translating audio to a copy of itself shifted ahead by one sample.)
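To make the "shifted ahead by one sample" idea concrete, here is a minimal sketch of how training pairs for task #1 could be built from a mono signal. It is not code from this package: the function name make_next_sample_pairs and the chunk_len parameter are made up for illustration, and it uses plain NumPy rather than fastai.

```python
import numpy as np

def make_next_sample_pairs(signal, chunk_len):
    """Split a 1D (mono) signal into uniform-length input chunks, plus
    targets that are the same audio shifted ahead by one sample.

    Hypothetical helper for illustration only; not part of fastproaudio.
    """
    signal = np.asarray(signal, dtype=np.float32)
    # The targets need one extra sample at the end, hence len(signal) - 1.
    n_chunks = max(0, (len(signal) - 1) // chunk_len)
    usable = n_chunks * chunk_len
    inputs = signal[:usable].reshape(n_chunks, chunk_len)
    targets = signal[1:usable + 1].reshape(n_chunks, chunk_len)
    return inputs, targets

# Toy "audio": ten samples, chunked into groups of three.
inputs, targets = make_next_sample_pairs(np.arange(10), chunk_len=3)
print(inputs)   # rows start at samples 0, 3, 6
print(targets)  # same rows, each value one sample later
```

An audio-effect dataset for task #2 would have the same shape of (input, target) pairs, only with the targets taken from the processed recording instead of the shifted input.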

"How many channels of audio are we going to use?"

That's up to the dataset! We'll try our best to assume that it's just mono.

"What other fastai datatypes/projects are relevant?"

There are three packages that are relevant for sequence modeling:

  1. fastaudio, as we mentioned, is only for spectrogram classification. The AudioBlock makes batches using an entire audio file, which then gets converted to spectrograms. Instead, we want to progressively grab sequences of audio samples as (uniform-length) chunks.

  2. The Time Series Prediction package is relevant, but the only time series output it seems to support is "univariate forecasting". Nope.

  3. Language Modeling, e.g. Chapters 10 and 12 of fastbook. Yeah, that's the closest. We can treat the audio samples as if they were word vectors/embeddings: just make the tokenizer and numericalize methods no-ops (or we could use mu-law encoding). The nice thing is that the dimensionality of the embeddings is just equal to how many channels of audio you have.
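For the mu-law option mentioned in item 3, here is a rough sketch of what a "numericalize" step could look like if we wanted integer tokens instead of raw floats. It assumes samples are scaled to [-1, 1] and uses the common choice mu = 255 (256 levels). The function names are made up for illustration and are not part of fastproaudio or fastai.

```python
import numpy as np

def mu_law_encode(x, mu=255):
    """Compand samples in [-1, 1] and quantize them to integers in [0, mu].

    Illustrative helper only; the name and defaults are assumptions.
    """
    x = np.clip(np.asarray(x, dtype=np.float64), -1.0, 1.0)
    # Logarithmic companding: small amplitudes get finer resolution.
    compressed = np.sign(x) * np.log1p(mu * np.abs(x)) / np.log1p(mu)
    # Map [-1, 1] onto the integer range [0, mu].
    return np.round((compressed + 1) / 2 * mu).astype(np.int64)

def mu_law_decode(q, mu=255):
    """Approximate inverse of mu_law_encode: integers back to floats."""
    y = 2 * (np.asarray(q, dtype=np.float64) / mu) - 1
    return np.sign(y) * np.expm1(np.abs(y) * np.log1p(mu)) / mu

samples = np.linspace(-1.0, 1.0, 11)
tokens = mu_law_encode(samples)
restored = mu_law_decode(tokens)
print(tokens)
print(np.max(np.abs(restored - samples)))  # quantization error
```

With the no-op route, the model would see the float samples directly; with mu-law, each sample becomes one of 256 "words", which is closer to the usual language model setup.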

We'll use some of fastaudio but we'll also liberally rewrite/overwrite whatever we want.

...And this may not be useful to more than a few core people. We'll see. ;-)

Install

pip install fastproaudio

How to use

Workin' on it!
