DreamSound Class for CNN Activation Layer Sonification

DreamSound

DreamSound is a Python package for sonic deep dream generation.

Description

Inspired by the DeepDream project, DreamSound plays a sound file to Yamnet, a pre-trained neural network, and Yamnet returns a dreamed sound.

Internally, DreamSound takes the gradients of a class from the pre-trained Yamnet model and combines them with the original sound through a filtering technique.
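
As a rough illustration of that idea, the sketch below replaces Yamnet with a toy linear "class score" and uses a plain additive update as the combination step. Both choices are assumptions made for illustration only: toy_score, toy_gradient, and dream_step are hypothetical names, and the combination technique DreamSound actually applies may differ.

```python
# Toy illustration only: a linear "class score" stands in for Yamnet, and an
# additive update stands in for DreamSound's combination step (an assumption).

def toy_score(signal, weights):
    """Score of a hypothetical class: dot product of signal and weights."""
    return sum(s * w for s, w in zip(signal, weights))


def toy_gradient(signal, weights):
    """Gradient of toy_score with respect to the signal (just the weights)."""
    return list(weights)


def dream_step(signal, weights, step_size=0.95):
    """Add the peak-normalized class gradient to the original signal."""
    grad = toy_gradient(signal, weights)
    peak = max(abs(g) for g in grad) or 1.0  # avoid dividing by zero
    return [s + step_size * g / peak for s, g in zip(signal, grad)]


signal = [0.0, 0.5, -0.5, 0.25]
weights = [1.0, -2.0, 0.5, 0.0]

before = toy_score(signal, weights)
dreamed = dream_step(signal, weights)
after = toy_score(dreamed, weights)
print(before, after)  # the score of the chosen class goes up after one step
```

Repeating dream_step on its own output is the toy counterpart of running several steps on the same audio.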

Example

Head to this Google Colab for a quick example of how to get started with the module. An older version is available in this other Google Colab. A still older version is here; it accompanies an early paper we wrote on Convolutional Neural Network Activation Layer Sonification, or what we called DreamSound.

Install

DreamSound depends on the following packages, all of which can be installed with pip:

requests
numpy
matplotlib
IPython
librosa
tensorflow
soundfile

First, install the dependencies:

python3 -m pip install -r requirements.txt
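
If you want to confirm that the dependencies are visible to your interpreter, a small check along these lines may help. It is only a sketch: missing_packages is a hypothetical helper, and the list assumes that each package's import name matches the name shown above.

```python
import importlib.util

# Import names for the dependencies listed above (assumed to match pip names).
DEPENDENCIES = ["requests", "numpy", "matplotlib", "IPython",
                "librosa", "tensorflow", "soundfile"]


def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(DEPENDENCIES)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All dependencies found.")
```

The check only looks the packages up; it does not import them, so it stays fast even for large packages such as tensorflow.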

Install the dreamsound package using pip:

python3 -m pip install dreamsound

The pip project is hosted on PyPI: https://pypi.org/project/dreamsound/

NOTE: you need to download the Yamnet model before importing the dreamsound module. Read on for instructions.

Prepare

Create a directory for your project and change into it.

mkdir dream_test
cd dream_test

Run the Yamnet Downloader.

The Yamnet Downloader file does not come with the pip distribution, but it is distributed in this repository. If you do not want to clone the repository, run curl -O https://raw.githubusercontent.com/fdch/dreamsound/main/yamnet_downloader.py in this same directory, and then run:

python3 yamnet_downloader.py

Alternatively, you can get Yamnet yourself, crudely like this:

git clone https://github.com/fdch/models.git models
mv models/research/audioset/yamnet/* .
rm -rf models
curl -O https://storage.googleapis.com/audioset/yamnet.h5
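
Whichever route you take, you may want to check that the files landed in the project directory before importing dreamsound. The following is a minimal sketch: missing_yamnet_files is a hypothetical helper, "yamnet.h5" comes from the curl command above, and the other two file names are assumptions about what the yamnet source directory provides.

```python
from pathlib import Path

# "yamnet.h5" comes from the curl command above; the other two names are
# assumptions about what the yamnet source directory provides.
EXPECTED_FILES = ("yamnet.h5", "yamnet.py", "yamnet_class_map.csv")


def missing_yamnet_files(directory=".", expected=EXPECTED_FILES):
    """Return the expected Yamnet files that are absent from directory."""
    root = Path(directory)
    return [name for name in expected if not (root / name).is_file()]


missing = missing_yamnet_files()
if missing:
    print("Missing Yamnet files:", ", ".join(missing))
else:
    print("Yamnet files found.")
```

Adjust EXPECTED_FILES to match what the downloader actually places in your directory.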

Usage example

You must have the Yamnet model in the same directory. You can then import the dreamsound module and use its DreamSound class. The following session loads two audio files from disk and passes them to the class:

>>> import dreamsound
INFO:tensorflow:Enabling eager execution
INFO:tensorflow:Enabling v2 tensorshape
INFO:tensorflow:Enabling resource variables
INFO:tensorflow:Enabling tensor equality
INFO:tensorflow:Enabling control flow v2
>>> ds = dreamsound.DreamSound(["../audio/original.wav", "../audio/cat.wav"])
Loading audio files...
Done.
I have now 2 audio files in memory.
Using last layer.
Yamnet loaded, using layer:activation_1
Dreamer started.

Filtering

There are two types of filtering, auto or targeted:

Auto Filtering:

Filter the first audio with its dreamed self:

>>> ds(audio_index=0)
Running step 0, class: Whistling...
...
Writing ./audio/Whistle-9-orig.wav...
Writing ./audio/Whistle-9-diff.wav...
Writing ./audio/Whistle-9-filt.wav...
Writing ./audio/Whistle-9-hard.wav...
Writing ./audio/Whistle-9-grad.wav...

Targeted Filtering:

Filter the first audio with a dreamed target:

>>> ds(audio_index=0, tgt=1)
Target class: Animal...
Running step 0, class: Whistling...
Running step 1, class: Whistle...
Running step 2, class: Whistle...
Running step 3, class: Whistle...
Running step 4, class: Whistle...
Running step 5, class: Whistle...
Running step 6, class: Whistle...
Running step 7, class: Whistle...
Running step 8, class: Flute...
Running step 9, class: Whistle...
Writing ./audio/Whistle-9-orig-tgt-Animal.wav...
Writing ./audio/Whistle-9-diff-tgt-Animal.wav...
Writing ./audio/Whistle-9-filt-tgt-Animal.wav...
Writing ./audio/Whistle-9-hard-tgt-Animal.wav...
Writing ./audio/Whistle-9-grad-tgt-Animal.wav...
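
The output file names in these logs appear to follow a class-step-kind pattern, with an optional -tgt- suffix for targeted runs. Assuming that pattern holds, a helper like the hypothetical parse_output_name below can split a name into its parts, which is handy when sorting many rendered files. The pattern is inferred from the log lines shown here, not taken from the package's code.

```python
import re
from pathlib import Path

# Pattern inferred from the log lines above; the meaning of each suffix
# (orig, diff, filt, hard, grad) is not documented here.
OUTPUT_NAME = re.compile(
    r"^(?P<label>.+)-(?P<step>\d+)-(?P<kind>orig|diff|filt|hard|grad)"
    r"(?:-tgt-(?P<target>.+))?\.wav$"
)


def parse_output_name(path):
    """Split an output file name into label, step, kind, and target."""
    match = OUTPUT_NAME.match(Path(path).name)
    if match is None:
        return None  # not a name in the inferred format
    info = match.groupdict()
    info["step"] = int(info["step"])
    return info


print(parse_output_name("./audio/Whistle-9-filt-tgt-Animal.wav"))
print(parse_output_name("./audio/Whistle-9-grad.wav"))
```

For files written by auto filtering, the target entry is None.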

Recurse

Finally, you can pass no arguments to continue filtering recursively:

>>> ds()
Target class: Animal...
Running step 10, class: Whistle...
Running step 11, class: Whistle...
Running step 12, class: Wind instrument, woodwind instrument...
Running step 13, class: Wind instrument, woodwind instrument...
Running step 14, class: Flute...
Running step 15, class: Flute...
Running step 16, class: Wind instrument, woodwind instrument...
Running step 17, class: Whistle...
Running step 18, class: Whistle...
Running step 19, class: Music...
Writing ./audio/Music-19-orig-tgt-Animal.wav...
Writing ./audio/Music-19-diff-tgt-Animal.wav...
Writing ./audio/Music-19-filt-tgt-Animal.wav...
Writing ./audio/Music-19-hard-tgt-Animal.wav...
Writing ./audio/Music-19-grad-tgt-Animal.wav...
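
To follow how the detected class drifts across steps, you can parse console output like the above. This sketch assumes you have the printed text available as a string (for example, pasted from the terminal) and that the "Running step" line format stays as shown; class_trajectory is a hypothetical helper, not part of dreamsound.

```python
import re

# Matches lines such as "Running step 12, class: Flute..."
STEP_LINE = re.compile(r"^Running step (\d+), class: (.+?)\.\.\.$")


def class_trajectory(log_text):
    """Return (step, class name) pairs found in the console output."""
    pairs = []
    for line in log_text.splitlines():
        match = STEP_LINE.match(line.strip())
        if match:
            pairs.append((int(match.group(1)), match.group(2)))
    return pairs


log = """Target class: Animal...
Running step 10, class: Whistle...
Running step 12, class: Wind instrument, woodwind instrument...
Running step 19, class: Music...
Writing ./audio/Music-19-orig-tgt-Animal.wav...
"""
trajectory = class_trajectory(log)
print(trajectory)
```

Class names that contain commas, such as "Wind instrument, woodwind instrument", are kept whole.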

Class Variables

You can change any of the following before or after calling the class:

sr          = 22050
max_dur     = 10
patch_hop   = 0.1
win_length  = 2048
hop_length  = 128
pad_end     = False
loss_power  = 0.001
plot_every  = 10
figsize     = (10,8)
top_db      = 80.0
step_size   = 0.95
output_type = 3
steps       = 10
threshold   = 1e-07
classid     = None
maxloss     = True
elapsed     = 0
recurse     = False
target      = None
power       = 1.0
audio_dir   = "./audio/"
image_dir   = "./image/"
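
To get a feel for what some of these values imply, the arithmetic below derives a few quantities from them. It assumes that max_dur is in seconds and that win_length and hop_length are in samples, which the listing does not state; the frame count uses the unpadded framing convention that pad_end = False suggests.

```python
# Values copied from the class variables listed above.
sr = 22050          # sample rate in Hz
max_dur = 10        # assumed to be seconds
win_length = 2048   # assumed to be samples
hop_length = 128    # assumed to be samples

max_samples = sr * max_dur            # longest signal, in samples
window_ms = 1000 * win_length / sr    # analysis window length in ms
hop_ms = 1000 * hop_length / sr       # time between frames in ms
# Frame count for an unpadded, non-centered framing (one common convention).
n_frames = 1 + (max_samples - win_length) // hop_length

print(max_samples, round(window_ms, 2), round(hop_ms, 2), n_frames)
```

Under these assumptions, a smaller hop_length gives finer time resolution at the cost of more frames to process per step.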

Authors

Fede Camara Halac (https://github.com/fdch)
Matias Delgadino (https://github.com/zaytam)

Acknowledgements

YamNet
AudioSet
