Framework for training and evaluating self-supervised learning methods for speaker verification.

These details have not been verified by PyPI

Project links

License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
- Python :: 3

Project description

sslsv

Collection of self-supervised learning (SSL) methods for speaker verification (SV).

Methods

Encoders

TDNN (sslsv.encoders.TDNN)
X-vectors: Robust dnn embeddings for speaker recognition (PDF)
David Snyder, Daniel Garcia-Romero, Gregory Sell, Daniel Povey, Sanjeev Khudanpur
Simple Audio CNN (sslsv.encoders.SimpleAudioCNN)
Representation Learning with Contrastive Predictive Coding (arXiv)
Aaron van den Oord, Yazhe Li, Oriol Vinyals
ResNet-34 (sslsv.encoders.ResNet34)
VoxCeleb2: Deep Speaker Recognition (arXiv)
Joon Son Chung, Arsha Nagrani, Andrew Zisserman
ECAPA-TDNN (sslsv.encoders.ECAPATDNN)
ECAPA-TDNN: Emphasized Channel Attention, Propagation and Aggregation in TDNN Based Speaker Verification (PDF)
Brecht Desplanques, Jenthe Thienpondt, Kris Demuynck

Methods

CPC (sslsv.methods.CPC)
Representation Learning with Contrastive Predictive Coding (arXiv)
Aaron van den Oord, Yazhe Li, Oriol Vinyals
LIM (sslsv.methods.LIM)
Learning Speaker Representations with Mutual Information (arXiv)
Mirco Ravanelli, Yoshua Bengio
SimCLR (sslsv.methods.SimCLR)
A Simple Framework for Contrastive Learning of Visual Representations (arXiv)
Ting Chen, Simon Kornblith, Mohammad Norouzi, Geoffrey Hinton
MoCo v2+ (sslsv.methods.MoCo)
Improved Baselines with Momentum Contrastive Learning (arXiv)
Xinlei Chen, Haoqi Fan, Ross Girshick, Kaiming He
Barlow Twins (sslsv.methods.BarlowTwins)
Barlow Twins: Self-Supervised Learning via Redundancy Reduction (arXiv)
Jure Zbontar, Li Jing, Ishan Misra, Yann LeCun, Stéphane Deny
VICReg (sslsv.methods.VICReg)
VICReg: Variance-Invariance-Covariance Regularization for Self-Supervised Learning (arXiv)
Adrien Bardes, Jean Ponce, Yann LeCun
VIbCReg (sslsv.methods.VIbCReg)
Computer Vision Self-supervised Learning Methods on Time Series (arXiv)
Daesoo Lee, Erlend Aune
BYOL (sslsv.methods.BYOL)
Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning (arXiv)
Jean-Bastien Grill, Florian Strub, Florent Altché, Corentin Tallec, Pierre H. Richemond, Elena Buchatskaya, Carl Doersch, Bernardo Avila Pires, Zhaohan Daniel Guo, Mohammad Gheshlaghi Azar, Bilal Piot, Koray Kavukcuoglu, Rémi Munos, Michal Valko
SimSiam (sslsv.methods.SimSiam)
Exploring Simple Siamese Representation Learning (arXiv)
Xinlei Chen, Kaiming He
DINO (sslsv.methods.DINO)
Emerging Properties in Self-Supervised Vision Transformers (arXiv)
Mathilde Caron, Hugo Touvron, Ishan Misra, Hervé Jégou, Julien Mairal, Piotr Bojanowski, Armand Joulin
DeepCluster v2 (sslsv.methods.DeepCluster)
Deep Clustering for Unsupervised Learning of Visual Features (arXiv)
Mathilde Caron, Piotr Bojanowski, Armand Joulin, Matthijs Douze
SwAV (sslsv.methods.SwAV)
Unsupervised Learning of Visual Features by Contrasting Cluster Assignments (arXiv)
Mathilde Caron, Ishan Misra, Julien Mairal, Priya Goyal, Piotr Bojanowski, Armand Joulin

Datasets

Speaker recognition:

VoxCeleb1 (train and test)
VoxCeleb2 (train)
SITW (test)
VOiCES (test)

Language recognition:

VoxLingua107

Emotion recognition:

CREMA-D

Data-augmentation:

The data used for the main experiments (conducted on VoxCeleb1 and VoxCeleb2 with data augmentation) can be automatically downloaded, extracted, and prepared using `utils/prepare_voxceleb.py` and `utils/prepare_augmentation.py`. The resulting data folder should have the following structure:

data
├── musan_split/
├── simulated_rirs/
├── voxceleb1/
├── voxceleb2/
├── voxceleb1_test_O
├── voxceleb1_test_H
├── voxceleb1_test_E
├── voxsrc2021_val
├── voxceleb1_train.csv
└── voxceleb2_train.csv
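
As a quick sanity check after running the preparation scripts, the layout above can be verified with a few lines of Python. This is only a sketch: the helper name `find_missing_entries` and the default folder name `data` are assumptions made for illustration and are not part of sslsv.

```python
from pathlib import Path

# Entries taken from the folder tree above.
EXPECTED_ENTRIES = [
    "musan_split",
    "simulated_rirs",
    "voxceleb1",
    "voxceleb2",
    "voxceleb1_test_O",
    "voxceleb1_test_H",
    "voxceleb1_test_E",
    "voxsrc2021_val",
    "voxceleb1_train.csv",
    "voxceleb2_train.csv",
]


def find_missing_entries(data_dir):
    """Return the expected files or folders that are absent from data_dir.

    Hypothetical helper, not provided by sslsv.
    """
    root = Path(data_dir)
    return [name for name in EXPECTED_ENTRIES if not (root / name).exists()]


if __name__ == "__main__":
    missing = find_missing_entries("data")
    print("missing entries:", missing if missing else "none")
```

An empty list means every entry from the tree is present; it does not check the contents of the folders.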

Other datasets must be downloaded and extracted manually, but their train files and trials files (the latter for speaker verification only) can be created using the corresponding scripts in the utils folder.

Example format of a train file

`voxceleb1_train.csv`

```
File,Speaker
voxceleb1/id10001/1zcIwhmdeo4/00001.wav,id10001
...
voxceleb1/id11251/s4R4hvqrhFw/00009.wav,id11251
```
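
A train file in this format can be read with Python's standard `csv` module. The sketch below groups utterance paths by speaker; the function name `load_train_file` is hypothetical and this is not necessarily how sslsv loads the file internally.

```python
import csv
from collections import defaultdict


def load_train_file(path):
    """Group utterance paths by speaker from a CSV with File,Speaker columns.

    Hypothetical helper written for illustration only.
    """
    utterances = defaultdict(list)
    with open(path, newline="") as f:
        # DictReader uses the first row (File,Speaker) as column names.
        for row in csv.DictReader(f):
            utterances[row["Speaker"]].append(row["File"])
    return dict(utterances)
```

For example, `load_train_file("data/voxceleb1_train.csv")` would return a dictionary mapping each speaker ID to its list of audio file paths, assuming the file sits in the `data` folder shown earlier.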

Example format of a trials file

`voxceleb1_test_O`

```
1 voxceleb1/id10270/x6uYqmx31kE/00001.wav voxceleb1/id10270/8jEAjG6SegY/00008.wav
...
0 voxceleb1/id10309/0cYFdtyWVds/00005.wav voxceleb1/id10296/Y-qKARMSO7k/00001.wav
```
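
Each trial line holds a label followed by two audio paths, separated by whitespace. In the example above, the label is 1 when both paths belong to the same speaker ID and 0 otherwise. A minimal parsing sketch under that reading follows; the function name `parse_trials` is an assumption for illustration and is not part of sslsv.

```python
def parse_trials(lines):
    """Parse trial lines into (label, first_path, second_path) tuples.

    Hypothetical helper: the label is converted to int (1 = same speaker,
    0 = different speakers, as in the example above). Lines that do not
    have exactly three whitespace-separated fields are skipped.
    """
    trials = []
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # blank or malformed line
        label, first_path, second_path = parts
        trials.append((int(label), first_path, second_path))
    return trials
```

The function accepts any iterable of strings, so an open file object can be passed directly, for example `parse_trials(open("data/voxceleb1_test_O"))`.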

Please refer to the associated code if you want further details about data preparation.

Usage

Start self-supervised training with `python train.py configs/vicreg.yml`.

wandb

Use `wandb online` and `wandb offline` to toggle wandb. To log your experiments, you first need to provide your API key with `wandb login API_KEY`.

Credits

Some parts of the code (data preparation, data augmentation, and model evaluation) were adapted from the VoxCeleb trainer repository.

Release history

This version

0.0.1

Apr 7, 2024

Download files

Source Distribution

sslsv-0.0.1.tar.gz (7.0 MB view details)

Uploaded Apr 7, 2024 Source

Built Distribution

sslsv-0.0.1-py3-none-any.whl (73.0 kB view details)

Uploaded Apr 7, 2024 Python 3

File details

Details for the file sslsv-0.0.1.tar.gz.

File metadata

Download URL: sslsv-0.0.1.tar.gz
Upload date: Apr 7, 2024
Size: 7.0 MB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/5.0.0 CPython/3.8.8

File hashes

Hashes for sslsv-0.0.1.tar.gz
Algorithm	Hash digest
SHA256	`58019d7ae5e5ecc91a2dc360022b0a4316db0d2553580b85bc886a9b1bf06424`
MD5	`cab3ac0234e476459cbf7edd1f4cdf35`
BLAKE2b-256	`86bce32d47132326942f928f5f57b27e615b5454aef92dc9a06e11104b50785d`

File details

Details for the file sslsv-0.0.1-py3-none-any.whl.

File metadata

Download URL: sslsv-0.0.1-py3-none-any.whl
Upload date: Apr 7, 2024
Size: 73.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/5.0.0 CPython/3.8.8

File hashes

Hashes for sslsv-0.0.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`6ccb61e2f772e0e562ab4ee9498458793030ca58105fa5eaebb3a5b82bc7f1bb`
MD5	`3db4bba757023f5fb5a37aba1429b5a9`
BLAKE2b-256	`513c498f9f09c6691251fbc9426308ffca454df574aa114c7989810f75f6e183`
