Generate soundscapes based on images.

Soundscape Generation

Table of Contents

  1. Installation
  2. Usage
  3. References

Installation

Scaper Installation

The sound generation module was developed using Scaper. Given a collection of isolated sound events, Scaper acts as a high-level sequencer that can generate multiple soundscapes from a single probabilistically defined specification.
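To make the idea of a "probabilistically defined specification" concrete, here is a minimal toy sketch in plain Python. It is not Scaper's API and not this project's code: the `sample` and `instantiate` helpers, the event fields, and all labels and value ranges are hypothetical. The tuple style is only loosely modeled on the distribution tuples Scaper accepts. The point is that one specification, sampled several times, gives several different concrete soundscapes.

```python
import random

# Toy illustration only: this is NOT Scaper's API. The tuple style
# ("const", v), ("uniform", lo, hi), ("choose", options) is loosely
# modeled on the way Scaper describes distributions.


def sample(dist, rng):
    """Draw one concrete value from a distribution tuple."""
    kind = dist[0]
    if kind == "const":
        return dist[1]
    if kind == "uniform":
        return rng.uniform(dist[1], dist[2])
    if kind == "choose":
        return rng.choice(dist[1])
    raise ValueError(f"unknown distribution: {kind!r}")


def instantiate(spec, rng):
    """Turn a probabilistic specification into one concrete soundscape."""
    return [
        {field: sample(dist, rng) for field, dist in event.items()}
        for event in spec
    ]


# Hypothetical specification: labels and value ranges are made up.
SPEC = [
    {
        "label": ("choose", ["car", "bus"]),
        "event_time": ("uniform", 0.0, 8.0),
        "snr": ("uniform", 6.0, 20.0),
    },
    {
        "label": ("const", "person"),
        "event_time": ("uniform", 0.0, 8.0),
        "snr": ("const", 10.0),
    },
]

rng = random.Random(0)  # fixed seed so the run is repeatable
soundscapes = [instantiate(SPEC, rng) for _ in range(3)]
for index, events in enumerate(soundscapes):
    print(index, events)
```

Each call to `instantiate` draws fresh values, so the three results share the same structure but differ in timing, loudness, and (for the first event) label.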

Follow the instructions given in the following link:

Download Dependencies

pip install -r requirements.txt

Download Cityscapes Dataset

To download the dataset, a Cityscapes account is required for authentication. An account can be created on www.cityscapes-dataset.com. After registering, run the download_data.sh script. During the download, it will ask you to provide your email and password for authentication.

./download_data.sh

Usage

For the object detection module, a pre-trained ERFNet is used, which is then fine-tuned on the Cityscapes dataset.

Train Object Segmentation Network

To train the network, run the following command:

python train.py --num_epochs 70 --batch_size 8 --evaluate_every 1 --save_weights_every 1

By default, training resumes from the latest saved checkpoint. If the checkpoints/ directory is missing, the training starts from scratch.
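The resume behavior described above can be sketched as follows. This is an illustration under stated assumptions, not the logic in train.py: the `find_latest_checkpoint` helper and the `epoch_<N>.ckpt` file naming are hypothetical, and the project's real checkpoint format may differ.

```python
import re
import tempfile
from pathlib import Path

# Hypothetical naming scheme; the project's real checkpoint names may differ.
CHECKPOINT_PATTERN = re.compile(r"epoch_(\d+)\.ckpt")


def find_latest_checkpoint(checkpoint_dir):
    """Return (epoch, path) of the newest checkpoint, or None.

    None means there is nothing to resume from, so training would
    start from scratch.
    """
    directory = Path(checkpoint_dir)
    if not directory.is_dir():
        return None  # checkpoints/ directory is missing
    latest = None
    for path in directory.iterdir():
        match = CHECKPOINT_PATTERN.fullmatch(path.name)
        if match is None:
            continue  # ignore unrelated files
        epoch = int(match.group(1))  # compare numerically, so 10 > 2
        if latest is None or epoch > latest[0]:
            latest = (epoch, path)
    return latest


# Small demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    ckpt_dir = Path(tmp) / "checkpoints"
    print(find_latest_checkpoint(ckpt_dir))  # directory missing
    ckpt_dir.mkdir()
    for epoch in (1, 2, 10):
        (ckpt_dir / f"epoch_{epoch}.ckpt").touch()
    print(find_latest_checkpoint(ckpt_dir))  # newest checkpoint
```

Parsing the epoch as an integer matters here: a plain string sort would rank `epoch_2.ckpt` after `epoch_10.ckpt`.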

Test the Segmentation Network

Run the following command to predict the semantic segmentation of every image in the test_images/ directory. Results are saved in the test_segmentations/ directory.

python predict.py

Ensure that you specify the image's file type in the image path variable in predict.py.
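The sketch below shows why the file type matters when a script collects images by path pattern. It is an assumption about how such a lookup might work, not the code in predict.py: `IMAGE_TYPE` and `plan_predictions` are hypothetical names, and only the two directory names come from this README.

```python
import tempfile
from pathlib import Path

IMAGE_TYPE = "png"  # assumption: set this to match your test images


def plan_predictions(input_dir, output_dir, image_type=IMAGE_TYPE):
    """Pair each matching input image with its segmentation output path."""
    inputs = sorted(Path(input_dir).glob(f"*.{image_type}"))
    return [(src, Path(output_dir) / src.name) for src in inputs]


# Small demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    images = Path(tmp) / "test_images"
    results = Path(tmp) / "test_segmentations"
    images.mkdir()
    for name in ("street.png", "square.png", "park.jpg"):
        (images / name).touch()

    for src, dst in plan_predictions(images, results):
        print(src.name, "->", dst.parent.name + "/" + dst.name)

    # With the wrong file type, no images are found at all.
    print(len(plan_predictions(images, results, image_type="bmp")))
```

In this sketch, `park.jpg` is silently skipped when the type is set to `png`, which is the kind of mismatch the note above warns about.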

Generate soundscapes

Run the file soundGeneration.py to generate a soundscape for every image in the test_images/ directory. Results are saved in the soundscapes/ directory. Ensure that you specify the image's file type in the image path variable of predict.py.

Results

Object Detection

The model was trained for 70 epochs on a single Tesla P100 GPU. After training, the checkpoint that yielded the highest validation IoU score was selected. The above predictions are produced by this network, trained for 67 epochs, which achieves a mean class IoU score of 0.7084 on the validation set. The inference time on a Tesla P100 GPU is around 0.2 seconds per image. The progression of the IoU metric is shown below.
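For readers unfamiliar with the metric, the following sketch computes a mean class IoU from a confusion matrix and picks the epoch with the highest validation score. It uses the standard definition IoU = TP / (TP + FP + FN) per class. How this project accumulates the metric is not shown in this README, so the functions and all numbers below are illustrative assumptions, not the project's evaluation code or results.

```python
def class_ious(confusion):
    """Per-class IoU from a square confusion matrix.

    confusion[t][p] counts pixels of true class t predicted as class p.
    A class that appears in neither ground truth nor predictions has
    no defined IoU and is reported as None.
    """
    num_classes = len(confusion)
    ious = []
    for c in range(num_classes):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp  # true c, predicted as another class
        fp = sum(confusion[t][c] for t in range(num_classes)) - tp
        denom = tp + fp + fn
        ious.append(tp / denom if denom else None)
    return ious


def mean_iou(confusion):
    """Average the per-class IoU over all classes where it is defined."""
    valid = [iou for iou in class_ious(confusion) if iou is not None]
    return sum(valid) / len(valid)


def best_epoch(val_iou_by_epoch):
    """Return the epoch whose validation IoU is highest."""
    return max(val_iou_by_epoch, key=val_iou_by_epoch.get)


# Made-up two-class example, not taken from the project's results.
confusion = [
    [8, 2],  # true class 0: 8 correct, 2 predicted as class 1
    [1, 9],  # true class 1: 1 predicted as class 0, 9 correct
]
print(class_ious(confusion))  # class 0: 8/11, class 1: 9/12
print(mean_iou(confusion))

# Made-up validation history: epoch -> mean IoU.
history = {1: 0.50, 2: 0.70, 3: 0.60}
print(best_epoch(history))  # epoch 2 has the highest score
```

Selecting the checkpoint by validation score rather than taking the last epoch is why the chosen checkpoint can come from an earlier epoch than the final one.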

References

