A library to perform automatic speech recognition with Hugging Face Transformers.

Project description

Elpis Core Library

The core Elpis library, providing a quick API to Hugging Face 🤗 Transformers for automatic speech recognition.

You can use the library to:

  • Perform standalone inference using a pretrained Hugging Face Transformers (HFT) model.
  • Fine-tune a pretrained ASR model on your own dataset.
  • Generate text and Elan files from inference results for further analysis.

Documentation

Documentation for the library can be found here.

Dependencies

While we try to be as machine-independent as possible, there are some dependencies you should be aware of when using this library:

  • Processing datasets (elpis.datasets.processing) requires librosa, which depends on having libsndfile installed on your computer. If you're using elpis within a Docker container, you may have to manually install libsndfile.
  • Transcription (elpis.transcriber.transcribe) requires ffmpeg if the audio you're attempting to transcribe needs to be resampled before it can be used. The default sample rate we assume is 16 kHz (see the resampling sketch after this list).
  • The preprocessing flow (elpis.datasets.preprocessing) is free of external dependencies.
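
If you prefer to resample audio yourself rather than rely on ffmpeg, a minimal sketch using librosa and soundfile (soundfile is pulled in as a librosa dependency; the 16 kHz target and the file paths below are assumptions for illustration, not elpis API) could look like this:

from pathlib import Path

import librosa
import soundfile as sf

# Hypothetical input/output paths - replace with your own files.
source = Path("<to_some_audio_file.wav>")
resampled = Path("<to_some_audio_file_16k.wav>")

# Load the audio and resample it to the 16 kHz rate assumed above.
audio, sample_rate = librosa.load(source, sr=16_000)
sf.write(resampled, audio, samplerate=16_000)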

Installation

You can install the elpis library with: pip3 install elpis
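
If you want to double-check the install afterwards, one way (using only the Python standard library, nothing elpis-specific) is to query the installed version:

from importlib.metadata import version

# Prints the installed elpis version; raises PackageNotFoundError if elpis isn't installed.
print(version("elpis"))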

Usage

Below are some examples of typical use cases.

Standalone Inference

from pathlib import Path

from elpis.transcriber.results import build_text
from elpis.transcriber.transcribe import build_pipeline, transcribe

# Perform inference
asr = build_pipeline(pretrained_location="facebook/wav2vec2-base-960h")
audio = Path("<to_some_audio_file.wav>")
annotations = transcribe(audio, asr) # Timed, per word annotation data

result = build_text(annotations) # Combine annotations to extract all text
print(result)

# Build output files
output_dir = Path("<path_to_output_directory>")
output_dir.mkdir(parents=True, exist_ok=True)

text_file = output_dir / "test.txt"
with open(text_file, "w") as output_file:
    output_file.write(result)

Fine-tuning a Pretrained Model on Local Dataset

from pathlib import Path
from typing import List

from elpis.datasets import Dataset
from elpis.datasets.dataset import CleaningOptions
from elpis.datasets.preprocessing import process_batch
from elpis.models import ElanOptions, ElanTierSelector
from elpis.trainer.job import TrainingJob, TrainingOptions
from elpis.trainer.trainer import train
from elpis.transcriber.results import build_elan, build_text
from elpis.transcriber.transcribe import build_pipeline, transcribe

files: List[Path] = [...] # A list of paths to the files to include.

dataset = Dataset(
    name="dataset",
    files=files,
    cleaning_options=CleaningOptions(), # Default cleaning options
    # Elan data extraction info (required if the dataset includes .eaf files).
    elan_options=ElanOptions(
        selection_mechanism=ElanTierSelector.NAME, selection_value="Phrase"
    ),
)

# Setup
tmp_path = Path("...")

dataset_dir = tmp_path / "dataset"
model_dir = tmp_path / "model"
output_dir = tmp_path / "output"

# Make all directories
for directory in dataset_dir, model_dir, output_dir:
    directory.mkdir(exist_ok=True, parents=True)

# Preprocessing
batches = dataset.to_batches()
for batch in batches:
    process_batch(batch, dataset_dir)

# Train the model
job = TrainingJob(
    model_name="some_model",
    dataset_name="some_dataset",
    options=TrainingOptions(epochs=2, learning_rate=0.001),
    base_model="facebook/wav2vec2-base-960h"
)
train(
    job=job,
    output_dir=model_dir,
    dataset_dir=dataset_dir,
)

# Perform inference with pipeline
asr = build_pipeline(
    pretrained_location=str(model_dir.absolute()),
)
audio = Path("<to_some_audio_file.wav>")
annotations = transcribe(audio, asr)

# Build output files
text_file = output_dir / "test.txt"
with open(text_file, "w") as output_file:
    output_file.write(build_text(annotations))

elan_file = output_dir / "test.eaf"
eaf = build_elan(annotations)
eaf.to_file(str(elan_file))

print('voila ;)')



Download files


Source Distribution

elpis-0.1.4.tar.gz (15.7 kB, Source)

Built Distribution

elpis-0.1.4-py3-none-any.whl (18.8 kB, Python 3)

File details

Details for the file elpis-0.1.4.tar.gz.

File metadata

  • Download URL: elpis-0.1.4.tar.gz
  • Upload date:
  • Size: 15.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.2.2 CPython/3.10.6 Darwin/21.5.0

File hashes

Hashes for elpis-0.1.4.tar.gz

  • SHA256: 171c305e2c30c0adffa298515fa85279f32d9d31d6ad1997ef32b89502b23506
  • MD5: e0938b0b171b3bc21c0f61a48b9b8ed6
  • BLAKE2b-256: 9c0587bc15f74fea0486a7e8c7943eebac7352faf01da6fcb0c2dd07dec0a7c6


File details

Details for the file elpis-0.1.4-py3-none-any.whl.

File metadata

  • Download URL: elpis-0.1.4-py3-none-any.whl
  • Upload date:
  • Size: 18.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.2.2 CPython/3.10.6 Darwin/21.5.0

File hashes

Hashes for elpis-0.1.4-py3-none-any.whl

  • SHA256: a6e9d56bf713d984329e1fa568056269a9427c7d3202e4c8c3bd2360539a923d
  • MD5: 3d06e83e6210e34d4143607438746200
  • BLAKE2b-256: 3d1c7015165bcf302ed5849690ecaa0fff6f1160363e5e82e47b937559b345b9

