End-to-end audio with fastai
Project description
fastproaudio
Idea behind this repo
fastaudio focuses on spectrograms. fastai use cases tend to focus on classification. We need to go beyond those. Instead we'll focus on two things:
- autoregressive prediction in the time domain. We'll use an LSTM -- essentially adapting the language model lessons
- audio-to-audio processing/translation (e.g. audio effects). We'll use stacked 1D convolutions like a U-Net
(You may have noticed already that task #1 is a special case of task #2: translating the input to the same audio shifted ahead by one sample.)
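To make that parenthetical concrete, here is a minimal sketch of how one chunk of samples could be turned into an input/target pair for next-sample prediction. The helper name `shift_by_one` is hypothetical and not part of fastproaudio, and plain Python lists stand in for the tensors you would use in practice.

```python
def shift_by_one(chunk):
    """Split a chunk into (input, target), where the target is the
    input shifted ahead by one sample. Hypothetical helper."""
    if len(chunk) < 2:
        raise ValueError("need at least two samples to form a pair")
    x = chunk[:-1]  # samples 0 .. n-2: what the model sees
    y = chunk[1:]   # samples 1 .. n-1: what the model should predict
    return x, y


chunk = [0.0, 0.1, 0.3, 0.2, -0.1]
x, y = shift_by_one(chunk)
# Each y[i] is the sample that follows x[i] in the original chunk.
```

Seen this way, an autoregressive model is just an audio-to-audio model whose "effect" is a one-sample time advance.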
"How many channels of audio are we going to use?"
That's up to the dataset! We'll try our best to assume that it's just mono.
"What other fastai datatypes/projects are relevant?"
There are three packages that are relevant for sequence modeling:
- fastaudio, as we mentioned, is only for spectrogram classification. The AudioBlock makes batches using an entire audio file, which then gets converted to spectrograms. Instead, we want to progressively grab sequences of audio samples as (uniform-length) chunks.
- The Time Series Prediction package is relevant, but the only time series output it seems to support is "univariate forecasting". Nope.
- Language Modeling, e.g. Chapters 10 and 12 from fastbook. Yea, that's the closest. We can treat the audio samples as if they were word vectors/embeddings: just make the tokenizer and numericalize methods no-ops (or we could use mu-law encoding). The nice thing is that the dimensionality of the embeddings is just equal to how many channels of audio you have.
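The chunking and optional mu-law ideas from the list above can be sketched roughly as follows. This is an illustration under stated assumptions, not fastproaudio's actual code: `chunk_samples`, `mu_law_encode`, and their parameters are hypothetical names, audio is assumed to be a list of frames (one list of per-channel floats in [-1, 1] per time step), and plain Python is used only to keep the example dependency-free.

```python
import math


def chunk_samples(frames, chunk_len):
    """Yield consecutive, non-overlapping, uniform-length chunks.

    `frames` is a list of time steps; each time step is a list with one
    float per channel. A trailing partial chunk is dropped so that every
    chunk has exactly `chunk_len` frames. Hypothetical helper.
    """
    if chunk_len <= 0:
        raise ValueError("chunk_len must be positive")
    n_full = len(frames) // chunk_len
    for i in range(n_full):
        yield frames[i * chunk_len:(i + 1) * chunk_len]


def mu_law_encode(sample, mu=255):
    """Map a float in [-1, 1] to an integer class in [0, mu] using
    mu-law companding. Out-of-range input is clamped. Hypothetical helper."""
    sample = max(-1.0, min(1.0, sample))
    # Logarithmic compression; the result is still in [-1, 1].
    compressed = math.copysign(
        math.log1p(mu * abs(sample)) / math.log1p(mu), sample
    )
    # Rescale to [0, mu] and round to the nearest integer class.
    return int((compressed + 1) / 2 * mu + 0.5)


# Stereo example: 7 frames with 2 channels each.
frames = [[0.01 * t, -0.01 * t] for t in range(7)]
chunks = list(chunk_samples(frames, chunk_len=3))

# "No-op numericalize": each frame is used directly as a vector whose
# length equals the number of audio channels.
embedding_dim = len(chunks[0][0])
```

With the no-op route the model sees raw float frames, so the "embedding" size is simply the channel count; with mu-law, each sample becomes one of `mu + 1` integer classes, which is closer to the token ids a language model expects.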
We'll use some of fastaudio but we'll also liberally rewrite/overwrite whatever we want.
...And this may not be useful to more than a few core people. We'll see. ;-)
Install
pip install fastproaudio
How to use
Workin' on it!
File details
Details for the file fastproaudio-0.0.1.tar.gz.
File metadata
- Download URL: fastproaudio-0.0.1.tar.gz
- Upload date:
- Size: 11.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.2 importlib_metadata/4.8.1 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.9.5
File hashes
Algorithm | Hash digest
---|---
SHA256 | 2a5cbb150eca01b42054544c20f42e52dcea0ffa85c21954994bdc392a9b55eb
MD5 | fad2055ac11c27d5fa205f370beeec4b
BLAKE2b-256 | fd737a28b80117f78d3ce893e915058111b97d2795facdfad762846b891dda48
File details
Details for the file fastproaudio-0.0.1-py3-none-any.whl.
File metadata
- Download URL: fastproaudio-0.0.1-py3-none-any.whl
- Upload date:
- Size: 9.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.2 importlib_metadata/4.8.1 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.9.5
File hashes
Algorithm | Hash digest
---|---
SHA256 | fca2901ad814780dc0b320abf3d35369c6bc6d36b2fd38a4f38a270f5b54375c
MD5 | a33566e0c31dcf9954a7da4d1caf5f29
BLAKE2b-256 | 7fe901507c40ec60f7abc471f16338885b10bd623ff5ddeb1b18f20abb9379fb