A library to perform automatic speech recognition with huggingface transformers.
Elpis Core Library
The Core Elpis Library, providing a quick API to :hugs: transformers for automatic speech recognition.
You can use the library to:
- Perform standalone inference using a pretrained HFT model.
- Fine-tune a pretrained ASR model on your own dataset.
- Generate text and Elan files from inference results for further analysis.
Documentation
Documentation for the library can be found here.
Dependencies
While we try to be as machine-independent as possible, there are some dependencies you should be aware of when using this library:
- Processing datasets (`elpis.datasets.processing`) requires `librosa`, which depends on `libsndfile` being installed on your computer. If you're using elpis within a Docker container, you may have to install `libsndfile` manually.
- Transcription (`elpis.transcription.transcribe`) requires `ffmpeg` if the audio you're attempting to transcribe needs to be resampled before it can be used. The default sample rate we assume is 16 kHz.
- The preprocessing flow (`elpis.datasets.preprocessing`) is free of external dependencies.
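Since resampling only kicks in when the incoming audio isn't already at 16 kHz, a quick standard-library check can tell you in advance whether `ffmpeg` will be needed. This is a minimal sketch; the `needs_resampling` helper and the target-rate constant are illustrative, not part of elpis:

```python
import wave

# The sample rate the transcriber assumes (see Dependencies above).
TARGET_SAMPLE_RATE = 16_000

def needs_resampling(path: str) -> bool:
    """Illustrative helper: report whether a .wav file would need resampling."""
    with wave.open(path, "rb") as wav:
        return wav.getframerate() != TARGET_SAMPLE_RATE

# Write a short 44.1 kHz mono file purely to demonstrate the check.
with wave.open("demo.wav", "wb") as wav:
    wav.setnchannels(1)
    wav.setsampwidth(2)      # 16-bit samples
    wav.setframerate(44_100)
    wav.writeframes(b"\x00\x00" * 441)  # 10 ms of silence

print(needs_resampling("demo.wav"))  # True: this file would need resampling
```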
Installation
You can install the elpis library with:
```shell
pip3 install elpis
```
Usage
Below are some typical usage examples.
Standalone Inference
```python
from pathlib import Path

from elpis.transcriber.results import build_text
from elpis.transcriber.transcribe import build_pipeline, transcribe

# Perform inference
asr = build_pipeline(pretrained_location="facebook/wav2vec2-base-960h")
audio = Path("<to_some_audio_file.wav>")
annotations = transcribe(audio, asr)  # Timed, per-word annotation data
result = build_text(annotations)  # Combine annotations to extract all text
print(result)

# Build output files
output_dir = Path("output")
output_dir.mkdir(exist_ok=True, parents=True)

text_file = output_dir / "test.txt"
with open(text_file, "w") as output_file:
    output_file.write(result)
```
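`build_text` returns a plain string, so ordinary text tooling applies directly to the result. As a small illustration, a word-frequency summary (the sample string below stands in for a real transcript):

```python
from collections import Counter

# Stand-in transcript; in practice this would be build_text(annotations).
result = "the cat sat on the mat"

# Quick word-frequency summary of the transcript.
word_counts = Counter(result.lower().split())
print(word_counts.most_common(1))  # [('the', 2)]
```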
Fine-tuning a Pretrained Model on a Local Dataset
```python
from pathlib import Path
from typing import List

from elpis.datasets import Dataset
from elpis.datasets.dataset import CleaningOptions
from elpis.datasets.preprocessing import process_batch
from elpis.models import ElanOptions, ElanTierSelector
from elpis.trainer.job import TrainingJob, TrainingOptions
from elpis.trainer.trainer import train
from elpis.transcriber.results import build_elan, build_text
from elpis.transcriber.transcribe import build_pipeline, transcribe

files: List[Path] = [...]  # A list of paths to the files to include.

dataset = Dataset(
    name="dataset",
    files=files,
    cleaning_options=CleaningOptions(),  # Default cleaning options
    # Elan data extraction info - required if dataset includes .eaf files.
    elan_options=ElanOptions(
        selection_mechanism=ElanTierSelector.NAME, selection_value="Phrase"
    ),
)

# Setup
tmp_path = Path('...')

dataset_dir = tmp_path / "dataset"
model_dir = tmp_path / "model"
output_dir = tmp_path / "output"

# Make all directories
for directory in dataset_dir, model_dir, output_dir:
    directory.mkdir(exist_ok=True, parents=True)

# Preprocessing
batches = dataset.to_batches()
for batch in batches:
    process_batch(batch, dataset_dir)

# Train the model
job = TrainingJob(
    model_name="some_model",
    dataset_name="some_dataset",
    options=TrainingOptions(epochs=2, learning_rate=0.001),
    base_model="facebook/wav2vec2-base-960h",
)
train(
    job=job,
    output_dir=model_dir,
    dataset_dir=dataset_dir,
)

# Perform inference with the fine-tuned model
asr = build_pipeline(
    pretrained_location=str(model_dir.absolute()),
)
audio = Path("<to_some_audio_file.wav>")
annotations = transcribe(audio, asr)

# Build output files
text_file = output_dir / "test.txt"
with open(text_file, "w") as output_file:
    output_file.write(build_text(annotations))

elan_file = output_dir / "test.eaf"
eaf = build_elan(annotations)
eaf.to_file(str(elan_file))

print('voila ;)')
```
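The generated `.eaf` file is plain XML, so you can inspect it without extra tooling. A minimal sketch using only the standard library; the hand-made sample document below is for illustration and is far simpler than what `build_elan` actually writes:

```python
import xml.etree.ElementTree as ET

# Hand-made, heavily simplified .eaf-style document for illustration only.
SAMPLE_EAF = """<?xml version="1.0"?>
<ANNOTATION_DOCUMENT>
  <TIER TIER_ID="Phrase"/>
  <TIER TIER_ID="Words"/>
</ANNOTATION_DOCUMENT>"""

# List the tier IDs present in the document.
root = ET.fromstring(SAMPLE_EAF)
tier_names = [tier.get("TIER_ID") for tier in root.findall("TIER")]
print(tier_names)  # ['Phrase', 'Words']
```

The tier named here is the same kind of thing `ElanTierSelector.NAME` selects on during preprocessing.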