# Elpis Core Library

The Core Elpis Library, providing a quick API to :hugs: Transformers for automatic speech recognition.
You can use the library to:

- Perform standalone inference using a pretrained HFT (Hugging Face Transformers) model.
- Fine-tune a pretrained ASR model on your own dataset.
- Generate text and Elan files from inference results for further analysis.
## Documentation

Documentation for the library can be found here.
## Dependencies

While we try to be as machine-independent as possible, there are some dependencies you should be aware of when using this library (a quick environment check is sketched after this list):
- Processing datasets (`elpis.datasets.processing`) requires `librosa`, which depends on having `libsndfile` installed on your machine. If you're using elpis within a Docker container, you may have to install `libsndfile` manually.
- Transcription (`elpis.transcription.transcribe`) requires `ffmpeg` if the audio you're attempting to transcribe needs to be resampled before it can be used. The default sample rate we assume is 16 kHz.
- The preprocessing flow (`elpis.datasets.preprocessing`) is free of external dependencies.
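If you want to sanity-check an environment before running elpis (for example, inside a fresh Docker container), a probe like the one below can help. This is a minimal sketch, not part of the elpis API; it assumes the `soundfile` package (which `librosa` uses to bind `libsndfile`) is installed alongside `librosa`.

```python
import shutil

try:
    # librosa reads audio through the soundfile package, which needs the
    # system library libsndfile; the import fails if it can't be loaded.
    import soundfile

    print("libsndfile OK:", soundfile.__libsndfile_version__)
except OSError as error:
    print("libsndfile missing:", error)

# ffmpeg is only needed when audio has to be resampled (elpis assumes 16 kHz).
if shutil.which("ffmpeg") is None:
    print("ffmpeg not found on PATH; resampling will not work")
else:
    print("ffmpeg OK")
```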
## Installation

You can install the elpis library with:

```
pip3 install elpis
```
## Usage

Below are some typical examples of use cases.

### Standalone Inference
```python
from pathlib import Path

from elpis.transcriber.results import build_text
from elpis.transcriber.transcribe import build_pipeline, transcribe

# Perform inference
asr = build_pipeline(pretrained_location="facebook/wav2vec2-base-960h")
audio = Path("<to_some_audio_file.wav>")
annotations = transcribe(audio, asr)  # Timed, per-word annotation data

result = build_text(annotations)  # Combine annotations to extract all text
print(result)

# Build output files
output_dir = Path("output")  # Directory to write results into
output_dir.mkdir(exist_ok=True, parents=True)

text_file = output_dir / "test.txt"
with open(text_file, "w") as output_file:
    output_file.write(result)
```
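The same calls scale from one file to a folder of recordings. The sketch below loops over a hypothetical `recordings/` directory (the directory name and per-file output layout are illustrative assumptions); `build_pipeline`, `transcribe`, and `build_text` are used exactly as above.

```python
from pathlib import Path

from elpis.transcriber.results import build_text
from elpis.transcriber.transcribe import build_pipeline, transcribe

asr = build_pipeline(pretrained_location="facebook/wav2vec2-base-960h")

output_dir = Path("output")
output_dir.mkdir(exist_ok=True, parents=True)

# "recordings" is a hypothetical folder of .wav files, used for illustration.
for audio in sorted(Path("recordings").glob("*.wav")):
    annotations = transcribe(audio, asr)
    (output_dir / f"{audio.stem}.txt").write_text(build_text(annotations))
```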
### Fine-tuning a Pretrained Model on a Local Dataset
```python
from pathlib import Path
from typing import List

from elpis.datasets import Dataset
from elpis.datasets.dataset import CleaningOptions
from elpis.datasets.preprocessing import process_batch
from elpis.models import ElanOptions, ElanTierSelector
from elpis.trainer.job import TrainingJob, TrainingOptions
from elpis.trainer.trainer import train
from elpis.transcriber.results import build_elan, build_text
from elpis.transcriber.transcribe import build_pipeline, transcribe

files: List[Path] = [...]  # A list of paths to the files to include.

dataset = Dataset(
    name="dataset",
    files=files,
    cleaning_options=CleaningOptions(),  # Default cleaning options
    # Elan data extraction info - required if dataset includes .eaf files.
    elan_options=ElanOptions(
        selection_mechanism=ElanTierSelector.NAME, selection_value="Phrase"
    ),
)

# Setup
tmp_path = Path("...")

dataset_dir = tmp_path / "dataset"
model_dir = tmp_path / "model"
output_dir = tmp_path / "output"

# Make all directories
for directory in dataset_dir, model_dir, output_dir:
    directory.mkdir(exist_ok=True, parents=True)

# Preprocessing
batches = dataset.to_batches()
for batch in batches:
    process_batch(batch, dataset_dir)

# Train the model
job = TrainingJob(
    model_name="some_model",
    dataset_name="some_dataset",
    options=TrainingOptions(epochs=2, learning_rate=0.001),
    base_model="facebook/wav2vec2-base-960h",
)
train(
    job=job,
    output_dir=model_dir,
    dataset_dir=dataset_dir,
)

# Perform inference with the fine-tuned pipeline
asr = build_pipeline(
    pretrained_location=str(model_dir.absolute()),
)
audio = Path("<to_some_audio_file.wav>")
annotations = transcribe(audio, asr)

# Build output files
text_file = output_dir / "test.txt"
with open(text_file, "w") as output_file:
    output_file.write(build_text(annotations))

elan_file = output_dir / "test.eaf"
eaf = build_elan(annotations)
eaf.to_file(str(elan_file))

print("voila ;)")
```
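If you'd like to inspect the generated Elan file programmatically, one option is the `pympi` library; this is an assumption for illustration, as elpis itself doesn't require you to use it. A minimal sketch:

```python
import pympi

# Read the .eaf produced above and print each annotation with its timing.
# "<path_to_test.eaf>" is a placeholder for the file written by eaf.to_file.
eaf = pympi.Elan.Eaf("<path_to_test.eaf>")
for tier in eaf.get_tier_names():
    for annotation in eaf.get_annotation_data_for_tier(tier):
        start_ms, end_ms, value = annotation[:3]
        print(f"[{start_ms}-{end_ms} ms] {tier}: {value}")
```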