
A Python package for neural audio effects

Project description

PyNeuralFx

A toolkit for neural audio effect modeling that lets users conduct modeling experiments conveniently.

PyNeuralFx paper


Installation

You can install PyNeuralFx via

$ pip install pyneuralfx

To use the training framework, clone the repository:

$ git clone https://github.com/ytsrt66589/pyneuralfx.git

then

$ cd frame_work/ 

Motivation

With the rising importance of audio-effect-related tasks, several easy-to-use toolkits have been developed: for example, dasp (repo_link) for differentiable signal processing in PyTorch and grafx (ref_link) for audio effect processing graphs in PyTorch. However, there is no easy-to-use toolkit for the neural audio effect modeling task, especially for black-box methods. PyNeuralFx aims to fill this gap, helping beginners start neural audio effect modeling research easily and offering experienced researchers new perspectives.

Tutorials

You can find the tutorials in /tutorials.

Functionality

Neural Network Models

PyNeuralFx follows the naming principle [control]-[model]. For example, if concat is used as the conditioning method and a GRU as the model, the model is called Concat-GRU. If a model is only for snapshot modeling, the control part is always snapshot.

PyNeuralFx supports:

snapshot modeling

  • snapshot-tcn
  • snapshot-gcn
  • snapshot-vanilla-rnn
  • snapshot-lstm
  • snapshot-gru
  • snapshot-tfilm-gcn (ref_link)
  • snapshot-tfilm-tcn (ref_link)
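The TCN- and GCN-style snapshot models above are built on dilated causal convolutions, so each output sample depends only on the current and past input samples. A minimal numpy sketch of that core operation (not the toolkit's actual implementation, which uses PyTorch layers):

```python
import numpy as np

def causal_dilated_conv1d(x, kernel, dilation):
    """Causal dilated 1-D convolution: output at time t depends only on
    x[t], x[t-d], x[t-2d], ... -- no future samples leak in."""
    k = len(kernel)
    # Left-pad so the output has the same length as the input.
    pad = (k - 1) * dilation
    xp = np.concatenate([np.zeros(pad), x])
    y = np.zeros_like(x, dtype=float)
    for t in range(len(x)):
        for i in range(k):
            y[t] += kernel[i] * xp[t + pad - i * dilation]
    return y

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
# With kernel [1, 1] and dilation 2, each output mixes the current
# sample with the one two steps in the past.
y = causal_dilated_conv1d(x, kernel=np.array([1.0, 1.0]), dilation=2)
```

Stacking such layers with growing dilation gives the large receptive field these models need to capture an effect's memory.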

full modeling

  • concat-gru
  • film-gru
  • statichyper-gru
  • dynamichyper-gru
  • concat-lstm
  • film-lstm
  • statichyper-lstm
  • dynamichyper-lstm
  • film-vanilla-rnn
  • statichyper-vanilla-rnn
  • concat-gcn
  • film-gcn
  • hyper-gcn
  • concat-tcn
  • film-tcn
  • hyper-tcn
  • film-ssm (ref_link)
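The control prefixes above name different conditioning mechanisms. A hedged numpy sketch of the two simplest ones, concat and FiLM (the real models apply these inside PyTorch networks; the helper names here are illustrative only):

```python
import numpy as np

def concat_condition(x, cond):
    """Concat conditioning: append the control values to every input
    frame so the backbone (e.g. a GRU) sees them directly."""
    c = np.broadcast_to(cond, (x.shape[0], cond.shape[-1]))
    return np.concatenate([x, c], axis=-1)

def film_condition(h, gamma, beta):
    """FiLM conditioning: controls are mapped (by a small network, not
    shown here) to a per-channel scale gamma and shift beta that
    modulate the hidden activations."""
    return gamma * h + beta

# 4 frames of 2-channel features, one control knob at 0.5:
frames = concat_condition(np.ones((4, 2)), np.array([0.5]))
modulated = film_condition(np.ones(3), gamma=2.0, beta=1.0)
```

Hypernetwork variants (statichyper/dynamichyper) go a step further and generate the backbone's weights themselves from the controls.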

Loss functions

In our opinion, loss functions serve different purposes. Some target reconstruction (overall reconstruction error), some eliminate specific problems (improving sound details), and some leverage perceptual properties (aligning with human perception). More research is needed to explore suitable losses for different audio effects.

PyNeuralFx supports:

  • esr loss
  • l1 loss
  • l2 loss
  • complex STFT loss
  • multi-resolution complex STFT loss
  • STFT loss
  • multi-resolution STFT loss
  • dc eliminating loss
  • short-time energy loss (ref_link)
  • adversarial loss (ref_link, ref_link)
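The ESR (error-to-signal ratio) loss listed above is a standard choice in neural audio effect modeling: it normalises the residual energy by the target energy so louder passages do not dominate. A minimal numpy sketch of the standard formula (the toolkit's own implementation will differ in framework and details):

```python
import numpy as np

def esr_loss(target, pred, eps=1e-8):
    """Error-to-signal ratio: energy of the residual divided by the
    energy of the target signal (eps guards against division by zero)."""
    return np.sum((target - pred) ** 2) / (np.sum(target ** 2) + eps)
```

A perfect prediction gives 0, and predicting silence gives roughly 1, which makes ESR values easy to interpret across signals of different loudness.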


Evaluation metrics

The loss functions above can also be used as evaluation metrics to estimate the reconstruction error. In addition, PyNeuralFx supports other metrics for comprehensive evaluation:

Notice: because the original implementation of transient extraction is slow, we provide an alternative implementation. Users can experiment with both methods and compare the results.

Visualization

PyNeuralFx supports two types of visualization:

  • Wave file comparison
    • time-domain wave comparison
    • spectrum difference
  • Model's behavior visualization
    • harmonic response
    • distortion curve
    • sine sweep visualization (for observing the aliasing problem)
    • phase response
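The harmonic response visualization follows a standard idea: drive the model with a pure sine and inspect the spectrum of the output, since a nonlinear effect creates energy at integer multiples of the input frequency. A hedged numpy sketch of that measurement (the function name and parameters are illustrative, not the toolkit's API):

```python
import numpy as np

def harmonic_magnitudes(effect, f0=100.0, sr=48000, n=48000, k=5):
    """Drive an effect with a pure sine at f0 and read the magnitudes
    of the first k harmonics off the FFT of the output."""
    t = np.arange(n) / sr
    x = np.sin(2 * np.pi * f0 * t)
    spectrum = np.abs(np.fft.rfft(effect(x))) / n
    bin_f0 = int(round(f0 * n / sr))  # f0 falls exactly on this bin
    return [spectrum[bin_f0 * h] for h in range(1, k + 1)]

# Hard clipping is a simple symmetric nonlinearity, so the output
# contains the fundamental plus odd harmonics but no even ones.
mags = harmonic_magnitudes(lambda x: np.clip(x, -0.5, 0.5))
```

Running the same sine through a trained model and through the real device, then comparing the harmonic magnitudes, shows how faithfully the model captures the effect's nonlinearity.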

Training Framework Usage Flow


(First run the command cd frame_work and ensure that the working directory is frame_work.)

  1. Download dataset: download a dataset commonly used in academic papers or prepare one yourself, then put the data under the data folder. The currently supported datasets are listed in the sections below; for each supported dataset, we provide a preprocessing file that matches the expected data template.
  2. Preprocess data: write your own code or manually arrange the data to match the template expected by the framework provided in pyneuralfx. Refer to dataset.md for more details. If you use a dataset supported by pyneuralfx, the preprocessing file is already provided in preprocess/{name_of_the_dataset}.py.
  3. Prepare configuration: modify the configuration files in configs/. All experiments are recorded by configuration files to ensure reproducibility. Further details of the configuration settings are given in configuration.md.
  4. Training: run the training code according to the configuration files. Refer to train.md for more details.
  5. Evaluation & visualization: evaluate your results with several metrics, or visualize comparisons and important system properties. Refer to evalvis.md for more details.

Tricks

  1. During training, you can use loss_analysis/compare_loss.py to check the validation loss curves. (Remember to modify the experiment root in compare_loss.py.)

Supported Dataset

These datasets are collected from previous works; if you use them in your paper or project, please cite the corresponding papers.

Contributing

Interested in contributing? Check out the contributing guidelines. Please note that this project is released with a Code of Conduct. By contributing to this project, you agree to abide by its terms.

License

pyneuralfx was created by yytung. It is licensed under the terms of the MIT license.

Credits

pyneuralfx was created with cookiecutter and the py-pkgs-cookiecutter template.

This project is highly inspired by the following repositories; thanks for the amazing work they have done. If you are interested in audio-effect-related work, take a look at the following repositories and websites to gain more insight.

  • micro-tcn (link)
  • gcn-tfilm (link)
  • pyloudnorm (link)
  • ddsp-singing-vocoder (link)
  • Binaural Speech Synthesis (link)
  • GreyBoxDRC (link)
  • sms-tools (link)
  • DeepAFx-ST (link)
  • Audio DSPy (link)
  • Jatin Chowdhury medium (link)
  • Hyper LSTM (link)
  • GuitarML (link)
  • SFI source separation (link)
