Elpis Core Library
The core Elpis library, providing a quick API over 🤗 Transformers for automatic speech recognition.
You can use the library to:
- Perform standalone inference using a pretrained HuggingFace Transformers (HFT) model.
- Fine-tune a pretrained ASR model on your own dataset.
- Generate text and Elan files from inference results for further analysis.
Documentation
Documentation for the library can be found here.
Dependencies
While we try to be as machine-independent as possible, there are some dependencies you should be aware of when using this library:
- Processing datasets (`elpis.datasets.processing`) requires `librosa`, which in turn requires `libsndfile` to be installed on your machine. If you're using Elpis within a Docker container, you may have to install `libsndfile` manually.
- Transcription (`elpis.transcription.transcribe`) requires `ffmpeg` if the audio you're attempting to transcribe needs to be resampled before it can be used. The default sample rate we assume is 16 kHz.
- The preprocessing flow (`elpis.datasets.preprocessing`) is free of external dependencies.
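If you are building a Docker image, both system-level dependencies can be installed up front. A minimal sketch for a Debian/Ubuntu-based image (the package names `libsndfile1` and `ffmpeg` are distribution-specific assumptions; adjust for your base image):

```dockerfile
# Install libsndfile (needed by librosa) and ffmpeg (needed for resampling)
RUN apt-get update && \
    apt-get install -y --no-install-recommends libsndfile1 ffmpeg && \
    rm -rf /var/lib/apt/lists/*
```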
Installation
You can install the elpis library with:
```
pip3 install elpis
```
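To confirm the install succeeded, one quick smoke test is to check that the package is importable; `is_installed` below is a helper name of our own for illustration, not part of Elpis:

```python
import importlib.util


def is_installed(package: str) -> bool:
    """Return True if `package` can be imported in this environment."""
    return importlib.util.find_spec(package) is not None


# After `pip3 install elpis`, this should print True.
print(is_installed("elpis"))
```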
Usage
Below are some typical usage examples.
Standalone Inference
```python
from pathlib import Path

from elpis.transcriber.results import build_text
from elpis.transcriber.transcribe import build_pipeline, transcribe

# Perform inference
asr = build_pipeline(pretrained_location="facebook/wav2vec2-base-960h")
audio = Path("<to_some_audio_file.wav>")
annotations = transcribe(audio, asr)  # Timed, per-word annotation data

result = build_text(annotations)  # Combine annotations to extract all text
print(result)

# Build output files
output_dir = Path("output")
output_dir.mkdir(exist_ok=True, parents=True)

text_file = output_dir / "test.txt"
with open(text_file, "w") as output_file:
    output_file.write(result)
```
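As a small idiomatic alternative for the final step, `pathlib.Path.write_text` collapses the open/write pair into one call. This is standard-library only and independent of Elpis (the directory name and file contents here are placeholders):

```python
from pathlib import Path

output_dir = Path("output")
output_dir.mkdir(exist_ok=True, parents=True)

# Equivalent to opening the file in "w" mode and writing the string
(output_dir / "test.txt").write_text("transcription result")
```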
Fine-tuning a Pretrained Model on a Local Dataset
```python
from pathlib import Path
from typing import List

from elpis.datasets import Dataset
from elpis.datasets.dataset import CleaningOptions
from elpis.datasets.preprocessing import process_batch
from elpis.models import ElanOptions, ElanTierSelector
from elpis.trainer.job import TrainingJob, TrainingOptions
from elpis.trainer.trainer import train
from elpis.transcriber.results import build_elan, build_text
from elpis.transcriber.transcribe import build_pipeline, transcribe

files: List[Path] = [...]  # A list of paths to the files to include.

dataset = Dataset(
    name="dataset",
    files=files,
    cleaning_options=CleaningOptions(),  # Default cleaning options
    # Elan data extraction info, required if the dataset includes .eaf files.
    elan_options=ElanOptions(
        selection_mechanism=ElanTierSelector.NAME, selection_value="Phrase"
    ),
)

# Setup
tmp_path = Path("...")

dataset_dir = tmp_path / "dataset"
model_dir = tmp_path / "model"
output_dir = tmp_path / "output"

# Make all directories
for directory in dataset_dir, model_dir, output_dir:
    directory.mkdir(exist_ok=True, parents=True)

# Preprocessing
batches = dataset.to_batches()
for batch in batches:
    process_batch(batch, dataset_dir)

# Train the model
job = TrainingJob(
    model_name="some_model",
    dataset_name="some_dataset",
    options=TrainingOptions(epochs=2, learning_rate=0.001),
    base_model="facebook/wav2vec2-base-960h",
)
train(
    job=job,
    output_dir=model_dir,
    dataset_dir=dataset_dir,
)

# Perform inference with the fine-tuned model
asr = build_pipeline(
    pretrained_location=str(model_dir.absolute()),
)
audio = Path("<to_some_audio_file.wav>")
annotations = transcribe(audio, asr)

# Build output files
text_file = output_dir / "test.txt"
with open(text_file, "w") as output_file:
    output_file.write(build_text(annotations))

elan_file = output_dir / "test.eaf"
eaf = build_elan(annotations)
eaf.to_file(str(elan_file))

print("voila ;)")
```