Lingvo utils for Google SVL team

sidlingvo: Lingvo-based libraries for speaker and language recognition

Overview

Here we open source some of the Lingvo-based libraries used in our publications.

Disclaimer

This is NOT an official Google product.

Feature frontend and TFLite inference

For the feature frontend and TFLite inference, see the API in sidlingvo/fe_utils.py.

For pretrained speaker encoder models, the inference API is in sidlingvo/wav_to_dvector.py.

For pretrained language identification models, the inference API is in sidlingvo/wav_to_lang.py.

GE2E and GE2E-XS losses

GE2E and GE2E-XS losses are implemented in sidlingvo/loss_layers.py.
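
For orientation, the softmax variant of the GE2E loss can be sketched in NumPy as follows. This is a minimal illustration written from the description in the GE2E paper, not the code in sidlingvo/loss_layers.py. The function name ge2e_softmax_loss is hypothetical, and the scale w and offset b are fixed here at 10 and -5 (the paper's initial values for parameters it learns during training). GE2E-XS is not covered.

```python
import numpy as np


def ge2e_softmax_loss(embeddings, w=10.0, b=-5.0):
    """GE2E softmax loss for embeddings shaped [speakers, utterances, dim].

    Illustrative sketch only: w and b are fixed scalars here (assumption),
    whereas the paper learns them.
    """
    n_spk, n_utt, _ = embeddings.shape

    def unit(x):
        # L2-normalize along the last axis so dot products are cosines.
        return x / np.linalg.norm(x, axis=-1, keepdims=True)

    e = unit(embeddings)

    # Centroid of each speaker over all of its utterances: [n_spk, dim].
    centroids = e.mean(axis=1)
    # Leave-one-out centroid per utterance (excludes the utterance itself).
    loo = (e.sum(axis=1, keepdims=True) - e) / (n_utt - 1)

    # cos[j, i, k]: utterance i of speaker j against the centroid of speaker k.
    cos = np.einsum("jid,kd->jik", e, unit(centroids))
    # For the true speaker, compare against the leave-one-out centroid instead.
    idx = np.arange(n_spk)
    cos[idx, :, idx] = np.einsum("jid,jid->ji", e, unit(loo))

    sim = w * cos + b
    own = sim[idx, :, idx]  # similarity to the true speaker: [n_spk, n_utt]

    # Numerically stable log-sum-exp over all candidate speakers.
    m = sim.max(axis=-1, keepdims=True)
    lse = m[..., 0] + np.log(np.exp(sim - m).sum(axis=-1))

    # Per-utterance loss is -S_own + log(sum_k exp(S_k)), summed over the batch.
    return float((lse - own).sum())
```

The input needs at least two utterances per speaker, because the leave-one-out centroid divides by the number of utterances minus one. A batch whose speakers form tight, well-separated clusters should give a loss near zero, while randomly mixed embeddings should give a clearly larger value.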

GE2E was proposed in "Generalized end-to-end loss for speaker verification" (Wan et al., 2018); see Citations.

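GE2E-XS was proposed in "Dr-Vectors: Decision Residual Networks and an Improved Loss for Speaker Recognition" (Pelecanos et al., 2021); see Citations.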

Attentive temporal pooling

Attentive temporal pooling is implemented in sidlingvo/cumulative_statistics_layer.py.
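
As a rough picture of the general technique, attention-weighted pooling over time can be sketched in NumPy as follows. This is not the code in sidlingvo/cumulative_statistics_layer.py: the function names, the use of a softmax over per-frame scores, and the running (cumulative) variant are all assumptions made for illustration, the last one guessed only from the file name.

```python
import numpy as np


def attentive_stats_pooling(features, scores):
    """Pool [time, dim] features into a weighted mean and std.

    `scores` holds one raw attention score per frame (shape [time]).
    Hypothetical helper for illustration only.
    """
    # Softmax over time turns raw scores into weights that sum to 1.
    weights = np.exp(scores - scores.max())
    weights = weights / weights.sum()

    mean = (weights[:, None] * features).sum(axis=0)
    var = (weights[:, None] * (features - mean) ** 2).sum(axis=0)
    # Floor the variance to keep the square root well behaved.
    std = np.sqrt(np.maximum(var, 1e-12))
    return mean, std


def cumulative_attentive_mean(features, scores):
    """Running weighted mean: row t pools frames 0..t (streaming-style)."""
    # Unnormalized weights; normalization happens through the running sum.
    weights = np.exp(scores - scores.max())[:, None]
    return np.cumsum(weights * features, axis=0) / np.cumsum(weights, axis=0)
```

With all scores equal, the weighted statistics reduce to the ordinary mean and population standard deviation over time, and the last row of the running mean matches the full-sequence weighted mean.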

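It is used in papers including "Attentive Temporal Pooling for Conformer-based Streaming Language Identification in Long-form Speech" (Wang et al., 2022); see Citations.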

Attentive scoring

Attentive scoring is implemented in sidlingvo/attentive_scoring_layer.py.

It was proposed in "Parameter-Free Attentive Scoring for Speaker Verification" (Pelecanos et al., 2022); see Citations.

Citations

Our papers are cited as:

@inproceedings{wan2018generalized,
  title={Generalized end-to-end loss for speaker verification},
  author={Wan, Li and Wang, Quan and Papir, Alan and Moreno, Ignacio Lopez},
  booktitle={International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  pages={4879--4883},
  year={2018},
  organization={IEEE}
}

@inproceedings{pelecanos2021drvectors,
  title={{Dr-Vectors: Decision Residual Networks and an Improved Loss for Speaker Recognition}},
  author={Jason Pelecanos and Quan Wang and Ignacio Lopez Moreno},
  year={2021},
  booktitle={Proc. Interspeech},
  pages={4603--4607},
  doi={10.21437/Interspeech.2021-641}
}

@inproceedings{pelecanos2022parameter,
  title={Parameter-Free Attentive Scoring for Speaker Verification},
  author={Jason Pelecanos and Quan Wang and Yiling Huang and Ignacio Lopez Moreno},
  booktitle={Odyssey: The Speaker and Language Recognition Workshop},
  year={2022}
}

@inproceedings{wang2022attentive,
  title={Attentive Temporal Pooling for Conformer-based Streaming Language Identification in Long-form Speech},
  author={Quan Wang and Yang Yu and Jason Pelecanos and Yiling Huang and Ignacio Lopez Moreno},
  booktitle={Odyssey: The Speaker and Language Recognition Workshop},
  year={2022}
}

@inproceedings{chojnacka2021speakerstew,
  title={{SpeakerStew: Scaling to many languages with a triaged multilingual text-dependent and text-independent speaker verification system}},
  author={Chojnacka, Roza and Pelecanos, Jason and Wang, Quan and Moreno, Ignacio Lopez},
  booktitle={Proc. Interspeech},
  pages={1064--1068},
  year={2021},
  doi={10.21437/Interspeech.2021-646},
  issn={2958-1796},
}

Download files

Source Distribution

sidlingvo-0.0.11.tar.gz (30.8 kB)

Uploaded Source

Built Distribution

sidlingvo-0.0.11-py3-none-any.whl (34.6 kB)

Uploaded Python 3

File details

Details for the file sidlingvo-0.0.11.tar.gz.

File metadata

  • Download URL: sidlingvo-0.0.11.tar.gz
  • Upload date:
  • Size: 30.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.12

File hashes

Hashes for sidlingvo-0.0.11.tar.gz
Algorithm    Hash digest
SHA256       d0caeef21f3f35f702c32f6361e6049a5c2c2a8c4c430a39d1c015811969c976
MD5          6491071bd877c704aad4fddc7c49e327
BLAKE2b-256  325f9c18f98317a32b78359d1710cd3c3abf1f65826230290103a5e026ad3eeb
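
A downloaded file can be checked against the SHA256 digest listed above with Python's standard hashlib module. The sketch below is one way to do it; the helper names sha256_of_file and matches_expected are hypothetical, and the local path is whatever location the file was saved to.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """Compare a file's SHA256 digest to an expected hex string."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, `matches_expected("sidlingvo-0.0.11.tar.gz", "d0caeef21f3f35f702c32f6361e6049a5c2c2a8c4c430a39d1c015811969c976")` should return True for an intact copy of the source distribution saved in the current directory.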

File details

Details for the file sidlingvo-0.0.11-py3-none-any.whl.

File metadata

  • Download URL: sidlingvo-0.0.11-py3-none-any.whl
  • Upload date:
  • Size: 34.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.12

File hashes

Hashes for sidlingvo-0.0.11-py3-none-any.whl
Algorithm    Hash digest
SHA256       a415d1d3f0fb006fe5cc4501ec2437a90fc05b2bef4d0006e75968e5145d65b0
MD5          95ec98827cccf21e219603f626280347
BLAKE2b-256  4d888504e45239995bb7b5baa490b347adad75d8ec6c84c1aac8271aac922c48
