ESPnet Model Zoo

Utilities for managing the pretrained models created by ESPnet. This feature is inspired by the pretrained-model support in Asteroid.

From version 0.1.0, Hugging Face models can also be used: https://huggingface.co/models?filter=espnet
Zenodo community: https://zenodo.org/communities/espnet/
Registered models: table.csv

Install

pip install torch
pip install espnet_model_zoo

Python API for inference

In the following sections, model_name should be a huggingface_id or one of the tags in table.csv. You can also provide a Zenodo URL directly (e.g., https://zenodo.org/record/xxxxxxx/files/hogehoge.zip?download=1).

ASR

import soundfile
from espnet2.bin.asr_inference import Speech2Text
speech2text = Speech2Text.from_pretrained(
    "model_name",
    # Decoding parameters are not included in the model file
    maxlenratio=0.0,
    minlenratio=0.0,
    beam_size=20,
    ctc_weight=0.3,
    lm_weight=0.5,
    penalty=0.0,
    nbest=1
)
# Confirm the sampling rate is equal to that of the training corpus.
# If not, you need to resample the audio data before inputting to speech2text
speech, rate = soundfile.read("speech.wav")
nbests = speech2text(speech)

text, *_ = nbests[0]
print(text)
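
The sampling-rate check mentioned in the comments above can be sketched as follows. This is a minimal, dependency-free illustration: resample_linear and ensure_rate are hypothetical helpers written for this example, not part of ESPnet or soundfile, and plain linear interpolation applies no anti-aliasing filter, so a dedicated resampling library is preferable for real audio.

```python
def resample_linear(samples, src_rate, dst_rate):
    """Resample a 1-D sequence by linear interpolation (illustration only)."""
    if src_rate == dst_rate:
        return list(samples)
    if len(samples) == 0:
        return []
    n_out = int(round(len(samples) * dst_rate / src_rate))
    out = []
    for i in range(n_out):
        # Position of output sample i on the input sample axis.
        pos = i * src_rate / dst_rate
        left = int(pos)
        right = min(left + 1, len(samples) - 1)
        frac = pos - left
        out.append((1.0 - frac) * samples[left] + frac * samples[right])
    return out


def ensure_rate(samples, rate, expected_rate):
    """Return samples at expected_rate, resampling only when the rates differ."""
    if rate != expected_rate:
        samples = resample_linear(samples, rate, expected_rate)
    return samples
```

Here expected_rate stands for the sampling rate of the training corpus, which you need to look up for the model you are using.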

TTS

import soundfile
from espnet2.bin.tts_inference import Text2Speech
text2speech = Text2Speech.from_pretrained("model_name")
speech = text2speech("foobar")["wav"]
soundfile.write("out.wav", speech.numpy(), text2speech.fs, "PCM_16")

Speech separation

import soundfile
from espnet2.bin.enh_inference import SeparateSpeech
separate_speech = SeparateSpeech.from_pretrained(
    "model_name",
    # for segment-wise process on long speech
    segment_size=2.4,
    hop_size=0.8,
    normalize_segment_scale=False,
    show_progressbar=True,
    ref_channel=None,
    normalize_output_wav=True,
)
# Confirm the sampling rate is equal to that of the training corpus.
# If not, you need to resample the audio data before inputting to separate_speech
speech, rate = soundfile.read("long_speech.wav")
waves = separate_speech(speech[None, ...], fs=rate)

This API can process both short and long audio samples. For long audio, set the segment_size and hop_size arguments (and optionally normalize_segment_scale and show_progressbar) to perform segment-wise speech enhancement/separation on the input speech. Segment-wise processing is disabled by default.
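
To give a feel for what segment-wise processing means, the sketch below lays out overlapping segments over a signal. It assumes segment_size and hop_size are durations in seconds; segment_bounds is a hypothetical helper written for this example and is not ESPnet's actual implementation, which may treat the final segment and the overlap regions differently.

```python
def segment_bounds(num_samples, fs, segment_size, hop_size):
    """Return (start, end) sample indices of overlapping segments.

    segment_size and hop_size are assumed to be given in seconds.
    """
    seg = int(round(segment_size * fs))
    hop = int(round(hop_size * fs))
    if seg <= 0 or hop <= 0:
        raise ValueError("segment_size and hop_size must be positive")
    if num_samples <= seg:
        # Short input: a single segment covers everything.
        return [(0, num_samples)]
    bounds = []
    start = 0
    while start + seg < num_samples:
        bounds.append((start, start + seg))
        start += hop
    # Align the final segment to the end so that no samples are dropped.
    bounds.append((num_samples - seg, num_samples))
    return bounds
```

With hop_size smaller than segment_size, neighbouring segments overlap, which is what allows the per-segment outputs to be stitched back together smoothly.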

For old ESPnet (<=0.10.1)

ASR

import soundfile
from espnet_model_zoo.downloader import ModelDownloader
from espnet2.bin.asr_inference import Speech2Text
d = ModelDownloader()
speech2text = Speech2Text(
    **d.download_and_unpack("model_name"),
    # Decoding parameters are not included in the model file
    maxlenratio=0.0,
    minlenratio=0.0,
    beam_size=20,
    ctc_weight=0.3,
    lm_weight=0.5,
    penalty=0.0,
    nbest=1
)

TTS

import soundfile
from espnet_model_zoo.downloader import ModelDownloader
from espnet2.bin.tts_inference import Text2Speech
d = ModelDownloader()
text2speech = Text2Speech(**d.download_and_unpack("model_name"))

Speech separation

import soundfile
from espnet_model_zoo.downloader import ModelDownloader
from espnet2.bin.enh_inference import SeparateSpeech
d = ModelDownloader()
separate_speech = SeparateSpeech(
    **d.download_and_unpack("model_name"),
    # for segment-wise process on long speech
    segment_size=2.4,
    hop_size=0.8,
    normalize_segment_scale=False,
    show_progressbar=True,
    ref_channel=None,
    normalize_output_wav=True,
)

Instruction for ModelDownloader

from espnet_model_zoo.downloader import ModelDownloader
d = ModelDownloader("~/.cache/espnet")  # Specify cachedir
d = ModelDownloader()  # <module_dir> is used as cachedir by default

To obtain a model, give a huggingface_id or a tag listed in table.csv.

>>> d.download_and_unpack("kamo-naoyuki/mini_an4_asr_train_raw_bpe_valid.acc.best")
{"asr_train_config": <config path>, "asr_model_file": <model path>, ...}

For a huggingface_id, you can specify the revision by appending it after @:

>>> d.download_and_unpack("kamo-naoyuki/mini_an4_asr_train_raw_bpe_valid.acc.best@<revision>")
{"asr_train_config": <config path>, "asr_model_file": <model path>, ...}
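
As a small illustration of the id@revision form, the following sketch splits such a string into its two parts. split_revision is a hypothetical helper for this example only; how ModelDownloader parses the string internally is not shown here.

```python
def split_revision(name):
    """Split "huggingface_id@revision" into (huggingface_id, revision).

    Returns (name, None) when no revision is given.
    """
    if "@" in name:
        repo_id, revision = name.split("@", 1)
        return repo_id, revision
    return name, None
```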

Note that if the model already exists in the cache, downloading and unpacking are skipped.

You can also select a model by specifying conditions.

d.download_and_unpack(task="asr", corpus="wsj")

If multiple models match the conditions, the last model is selected. You can also choose among them with the version option.

d.download_and_unpack(task="asr", corpus="wsj", version=-1)  # Get the last model
d.download_and_unpack(task="asr", corpus="wsj", version=-2)  # Get previous model
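
The version values above behave like negative list indices over the matching entries. A minimal sketch, assuming the matches are kept in table order; the model names and select_by_version are made up for illustration:

```python
# Hypothetical entries matching task="asr", corpus="wsj", in table order.
matches = ["wsj_model_a", "wsj_model_b", "wsj_model_c"]


def select_by_version(matches, version=-1):
    """Pick one matching entry using Python-style indexing."""
    if not matches:
        raise LookupError("no model matches the given conditions")
    return matches[version]


latest = select_by_version(matches)        # "wsj_model_c"
previous = select_by_version(matches, -2)  # "wsj_model_b"
```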

You can also obtain a model directly from its URL.

d.download_and_unpack("https://zenodo.org/record/...")

You can also pass the path of a local model file to this API.

d.download_and_unpack("./some/where/model.zip")

In this case, the contents are also expanded in the cache directory. However, the model is identified by its file path, so if you move the file elsewhere and unpack it again, it is treated as a different model and its contents are expanded again in another location.
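
The effect of identifying a model by its path can be sketched as below. The hashing scheme is an assumption made for illustration, and cache_dir_for is a hypothetical helper; the point is only that a key derived from the path changes when the file moves.

```python
import hashlib
import os


def cache_dir_for(model_path, cachedir="~/.cache/espnet"):
    """Derive a cache subdirectory from the model's path (assumed scheme)."""
    abs_path = os.path.abspath(model_path)
    key = hashlib.md5(abs_path.encode("utf-8")).hexdigest()
    return os.path.join(os.path.expanduser(cachedir), key)


# The same archive at two locations maps to two different cache directories.
first = cache_dir_for("./some/where/model.zip")
moved = cache_dir_for("./else/where/model.zip")
```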

Query model names

You can view the model names in our Zenodo community, https://zenodo.org/communities/espnet/, or by using query(). All information is recorded in table.csv.

d.query("name")

You can also restrict the results by specifying conditions.

d.query("name", task="asr")
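
Conceptually, such a query is a filter over the rows of a CSV table. The sketch below shows the idea on a made-up table; the column names, rows, and the standalone query function are assumptions for this example and do not reproduce the real table.csv or ModelDownloader.query.

```python
import csv
import io

# Hypothetical table contents; the real table.csv has its own columns and rows.
TABLE = """task,corpus,name
asr,wsj,example/asr_wsj
tts,ljspeech,example/tts_ljspeech
asr,librispeech,example/asr_librispeech
"""


def query(table_text, key="name", **conditions):
    """Return the `key` column of every row matching all conditions."""
    rows = csv.DictReader(io.StringIO(table_text))
    return [
        row[key]
        for row in rows
        if all(row.get(col) == value for col, value in conditions.items())
    ]


asr_names = query(TABLE, "name", task="asr")
```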

Command line tools

espnet_model_zoo_query

# Query model names
espnet_model_zoo_query task=asr corpus=wsj
# Show all model names
espnet_model_zoo_query
# Query another key
espnet_model_zoo_query --key url task=asr corpus=wsj

espnet_model_zoo_download

espnet_model_zoo_download <model_name>  # Print the path of the downloaded file
espnet_model_zoo_download --unpack true <model_name>   # Print the path of unpacked files

espnet_model_zoo_upload

export ACCESS_TOKEN=<access_token>
espnet_model_zoo_upload \
    --file <packed_model> \
    --title <title> \
    --description <description> \
    --creator_name <your-git-account>

Use pretrained model in ESPnet recipe

# e.g. ASR WSJ task
git clone https://github.com/espnet/espnet
cd espnet
pip install -e .
cd egs2/wsj/asr1
./run.sh --skip_data_prep false --skip_train true --download_model kamo-naoyuki/wsj

Register your model

Huggingface

1. Upload your model using the Hugging Face API

   Coming soon...
2. Create a Pull Request to modify table.csv

   The models registered in table.csv are tested in CI. In fact, a model can be downloaded even if it is not listed in table.csv.
3. (Administrator does) Increment the third version number of setup.py, e.g. 0.0.3 -> 0.0.4
4. (Administrator does) Release new version

Zenodo (Obsolete)

1. Upload your model to Zenodo

   You need to sign up for Zenodo and create an access token to upload models. You can upload your own model freely with the espnet_model_zoo_upload command, but we normally upload models using recipes.
2. Create a Pull Request to modify table.csv

   Append your record as the last line.
3. (Administrator does) Increment the third version number of setup.py, e.g. 0.0.3 -> 0.0.4
4. (Administrator does) Release new version

Release history

Version	Release date
0.1.7 (this version)	Oct 11, 2021
0.1.6	Oct 11, 2021
0.1.5	Sep 23, 2021
0.1.4	Sep 22, 2021
0.1.3	Sep 11, 2021
0.1.2	Sep 7, 2021
0.1.1a2 (pre-release)	Sep 4, 2021
0.1.1a1 (pre-release)	Sep 4, 2021
0.1.0	Aug 26, 2021
0.0.0a30 (pre-release)	Jun 10, 2021
0.0.0a29 (pre-release)	Apr 13, 2021
0.0.0a28 (pre-release)	Mar 2, 2021
0.0.0a27 (pre-release)	Feb 17, 2021
0.0.0a26 (pre-release)	Feb 16, 2021
0.0.0a25 (pre-release)	Feb 15, 2021
0.0.0a24 (pre-release)	Jan 15, 2021
0.0.0a23 (pre-release)	Jan 15, 2021
0.0.0a22 (pre-release)	Jan 6, 2021
0.0.0a21 (pre-release)	Dec 24, 2020
0.0.0a20 (pre-release)	Nov 21, 2020
0.0.0a19 (pre-release)	Nov 4, 2020
0.0.0a18 (pre-release)	Oct 19, 2020
0.0.0a16 (pre-release)	Sep 19, 2020
0.0.0a15 (pre-release)	Sep 16, 2020
0.0.0a14 (pre-release)	Sep 16, 2020
0.0.0a13 (pre-release)	Sep 15, 2020
0.0.0a12 (pre-release)	Sep 11, 2020
0.0.0a11 (pre-release)	Sep 5, 2020
0.0.0a10 (pre-release)	Aug 29, 2020
0.0.0a9 (pre-release)	Aug 27, 2020
0.0.0a8 (pre-release)	Aug 18, 2020
0.0.0a7 (pre-release)	Aug 18, 2020
0.0.0a6 (pre-release)	Aug 15, 2020
0.0.0a5 (pre-release)	Jul 30, 2020
0.0.0a4 (pre-release)	Jul 30, 2020
0.0.0a3 (pre-release)	Jul 23, 2020
0.0.0a2 (pre-release)	Jul 23, 2020
0.0.0a1 (pre-release)	Jul 20, 2020

Download files

Source Distribution

espnet_model_zoo-0.1.7.tar.gz (23.1 kB)

Uploaded Oct 11, 2021 Source

Built Distribution

espnet_model_zoo-0.1.7-py3-none-any.whl (19.6 kB)

Uploaded Oct 11, 2021 Python 3

File details

Details for the file espnet_model_zoo-0.1.7.tar.gz.

File metadata

Download URL: espnet_model_zoo-0.1.7.tar.gz
Upload date: Oct 11, 2021
Size: 23.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.4.2 importlib_metadata/4.8.1 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.8.12

File hashes

Hashes for espnet_model_zoo-0.1.7.tar.gz
Algorithm	Hash digest
SHA256	`61d88a1898d7d6bfebeb51100f194fa7fc9b68f959913255ee5ccf68090465b0`
MD5	`309dc1b492c40677c637158452dbbaa9`
BLAKE2b-256	`d8883b49dca3f981380746ea5c6e766360aac42b28466c9fc9a29096669d9e45`

File details

Details for the file espnet_model_zoo-0.1.7-py3-none-any.whl.

File metadata

Download URL: espnet_model_zoo-0.1.7-py3-none-any.whl
Upload date: Oct 11, 2021
Size: 19.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.4.2 importlib_metadata/4.8.1 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.8.12

File hashes

Hashes for espnet_model_zoo-0.1.7-py3-none-any.whl
Algorithm	Hash digest
SHA256	`8a228c44566f931e3113ec28c41f1e342be8b8897bad2fa99a6d51263e033743`
MD5	`a7a6a6d43e408d350f9377930b060911`
BLAKE2b-256	`190eb3340d59ded1ece54cca41d88f0be529598b43b3ec05b608dc03bd154a1a`
