Fx-Encoder++ for audio effects representation

Fx-Encoder++

Convert audio effects from your music into encoded representations suitable for audio effects processing and analysis tasks.

Paper | Hugging Face

About Fx-Encoder++

Fx-Encoder++ is an audio effects representation learning model based on SimCLR.

This project builds on the CLAP codebase.

Architecture

[Figure: Fx-Encoder++ architecture diagram]

Usage

Installation

pip install fxencoder_plusplus

Usage

Notice: The input to Fx-Encoder++ should be stereo (two-channel) audio.
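As a minimal sketch of how the stereo requirement can be met before calling the model, the helper below duplicates a mono channel and passes stereo input through unchanged. The name `to_stereo` is hypothetical and is not part of fxencoder_plusplus; it assumes NumPy arrays shaped [seq_len] or [channels, seq_len].

```python
import numpy as np


def to_stereo(wav: np.ndarray) -> np.ndarray:
    """Return a [2, seq_len] array from mono or stereo input (hypothetical helper)."""
    # Treat a 1-D signal as a single-channel signal
    if wav.ndim == 1:
        wav = wav[np.newaxis, :]
    if wav.ndim != 2:
        raise ValueError(f"expected [seq_len] or [channels, seq_len], got shape {wav.shape}")

    channels = wav.shape[0]
    if channels == 1:
        # Duplicate the mono channel to form left and right
        return np.repeat(wav, 2, axis=0)
    if channels == 2:
        return wav
    raise ValueError(f"expected 1 or 2 channels, got {channels}")
```

The result can then be turned into a batched tensor with `torch.from_numpy(to_stereo(wav)).unsqueeze(0)`, which has shape [1, 2, seq_len].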

Initialize Models

from fxencoder_plusplus import load_model 

# Load default base model (auto-downloads if needed)
DEVICE = 'cuda'
model = load_model(
    'default',
    device=DEVICE,
)

Extract audio effects representations from mixture tracks or stem tracks, where a single representation encodes the overall audio effects style of the entire input.

import torch
import librosa

# Load the librosa example clip at 44.1 kHz; this clip is mono, so wav has shape [seq_len]
audio_path = librosa.example('trumpet')
wav, sr = librosa.load(audio_path, sr=44100, mono=False)
# Add batch and channel dimensions, then duplicate the mono channel to form a stereo input
wav = torch.from_numpy(wav).unsqueeze(0).unsqueeze(0).repeat(1, 2, 1).to(DEVICE)  # [1, 2, seq_len]

fx_emb = model.get_fx_embedding(wav)
print(fx_emb.shape)  # [1, embed_dim] = [1, 128]

# To get the embedding before the projection layer, pass normalized=False
fx_emb = model.get_fx_embedding(wav, normalized=False)
print(fx_emb.shape)  # [1, embed_dim] = [1, 2048]
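Once two tracks have been embedded, one common way to compare their audio effects styles is cosine similarity. The sketch below is an illustration only: `cosine_similarity` is a hypothetical helper, and using cosine similarity as the comparison metric is an assumption rather than something this README specifies.

```python
import numpy as np


def cosine_similarity(a, b) -> float:
    """Cosine similarity between two embedding vectors (hypothetical helper)."""
    # Flatten so that inputs shaped [1, embed_dim] and [embed_dim] both work
    a = np.asarray(a, dtype=np.float64).ravel()
    b = np.asarray(b, dtype=np.float64).ravel()
    if a.shape != b.shape:
        raise ValueError(f"shape mismatch: {a.shape} vs {b.shape}")

    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return float(np.dot(a, b) / denom)
```

A torch embedding can be passed in after converting it with `fx_emb.detach().cpu().numpy()`.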

Extract instrument-specific audio effects representations from mixture tracks. For example, extract the audio effects representation of just the vocals within a full mix.

Audio Reference:

import torchaudio
import julius

# Load the first 441000 frames of the mixture (10 s if the file is sampled at 44.1 kHz)
mixture_path = "/path/to/mixture.wav"
mixture, sr = torchaudio.load(mixture_path, num_frames=441000)
mixture = mixture.unsqueeze(0).to(DEVICE)  # [1, channel, seq_len]

# Load 441000 frames of the query instrument, starting at frame 441000
query_path = "/path/to/inst.wav"
query, sr = torchaudio.load(query_path, frame_offset=441000, num_frames=441000)
query = query.unsqueeze(0).to(DEVICE)  # [1, channel, seq_len]
# Resample the query from 44.1 kHz to 48 kHz
query = julius.resample_frac(query, 44100, 48000)

_, fx_emb = model.get_fx_embedding_by_audio_query(mixture, query)
print(fx_emb.shape)  # [1, embed_dim] = [1, 128]
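The frame counts in the snippet above are hard-coded: 441000 frames is 10 seconds for a file sampled at 44.1 kHz. A small helper makes that arithmetic explicit for other durations or sample rates. `seconds_to_frames` is a hypothetical name, not part of fxencoder_plusplus or torchaudio.

```python
def seconds_to_frames(seconds: float, sample_rate: int = 44100) -> int:
    """Convert a duration in seconds to a frame count (hypothetical helper)."""
    if seconds < 0:
        raise ValueError("seconds must be non-negative")
    if sample_rate <= 0:
        raise ValueError("sample_rate must be positive")
    # Round to the nearest whole frame
    return int(round(seconds * sample_rate))
```

For example, `torchaudio.load(path, frame_offset=seconds_to_frames(10), num_frames=seconds_to_frames(10))` reads the second 10-second window of a 44.1 kHz file.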

Text Reference:

import torchaudio 
mixture_path = "/path/to/mixture.wav"
mixture, sr = torchaudio.load(mixture_path, num_frames=441000)
mixture = mixture.unsqueeze(0).to(DEVICE) # [1, channel, seq_len]

query = "the sound of vocals"

_, fx_emb = model.get_fx_embedding_by_text_query(mixture, query)
print(fx_emb.shape) # [1, embed_dim], [1, 128]

Training

Env

Create environment with conda

conda create --name fxenc python=3.10.14

Install

pip install -r requirements.txt

Prepare Fx-Normalized Dataset

Because the datasets are subject to copyright restrictions, we unfortunately cannot share the preprocessed data directly.

Download MUSDB and MoisesDB.
See FxNorm-automix for how to prepare the audio-effects-normalized dataset.

Run

bash scripts/train_proposed.sh

Evaluation

We developed a retrieval-based evaluation pipeline, using the MUSDB dataset as an example:

See FxNorm-automix for how to prepare the audio-effects-normalized dataset.
Synthesize the evaluation dataset: see build_musdb.py.
Run the retrieval-based evaluation: see eval_retrieval.py.
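To illustrate what a retrieval-based evaluation generally measures, the sketch below computes top-1 retrieval accuracy over precomputed embeddings. It is not the logic of eval_retrieval.py: the function name, the label arrays, and the choice of cosine similarity are all assumptions made for illustration.

```python
import numpy as np


def top1_retrieval_accuracy(query_emb, gallery_emb, query_labels, gallery_labels) -> float:
    """Fraction of queries whose nearest gallery item shares their label (hypothetical helper)."""
    q = np.asarray(query_emb, dtype=np.float64)    # [num_queries, embed_dim]
    g = np.asarray(gallery_emb, dtype=np.float64)  # [num_gallery, embed_dim]

    # L2-normalize rows so the dot product equals cosine similarity
    # (assumes no embedding is the zero vector)
    q = q / np.linalg.norm(q, axis=1, keepdims=True)
    g = g / np.linalg.norm(g, axis=1, keepdims=True)

    sims = q @ g.T                 # [num_queries, num_gallery]
    nearest = sims.argmax(axis=1)  # index of the most similar gallery item per query

    hits = np.asarray(gallery_labels)[nearest] == np.asarray(query_labels)
    return float(hits.mean())
```

Here each label would identify the effects configuration applied to a clip, so a hit means the nearest neighbor was processed with the same effects as the query.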

LICENSE

This library is released under the CC BY-NC 4.0 license. Please refer to the LICENSE file for more details.

Release history

0.1.5 (this version): Aug 23, 2025
0.1.4: Jun 19, 2025
