eist

Version: v0.1.0 · Python: ≥3.10 · License: MIT · Tests: passing

Scottish Gaelic speech-to-subtitles pipeline — transcribe, align, punctuate, translate.

eist (Scottish Gaelic for "listen") is a Python library that bundles all features of the ÈIST demo into a pip-installable package that runs locally and follows PyTorch / Hugging Face conventions.

Documentation · Models · Demo

Quick start

pip install "eist[all]"   # quotes keep shells such as zsh from expanding the brackets

End-to-end pipeline

from eist import GaelicASRPipeline

pipe = GaelicASRPipeline.from_pretrained()
result = pipe("audio.wav")

for sub in result:
    print(f"[{sub.start:.1f} → {sub.end:.1f}] {sub.text}")

# Export as SRT
with open("output.srt", "w", encoding="utf-8") as f:  # UTF-8 keeps Gaelic accents (à, è, ì, ò, ù) intact
    f.write(result.to_srt())

Component-level usage

from eist import Transcriber, PunctuationModel, SubtitleFormatter
from sk_align import Aligner

# Load individual components
transcriber = Transcriber.from_pretrained()
aligner = Aligner.from_pretrained()
punct = PunctuationModel.from_pretrained()
formatter = SubtitleFormatter(max_words=10, min_words=3)

# Run step-by-step
segments = transcriber.transcribe("audio.wav")
# ... align, punctuate, format as needed

Translation

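from eist import Translator

# `result` is the TranscriptionResult returned by the pipeline example above
translator = Translator(api_key="sk-...")  # or set OPENAI_KEY env var
translator.translate(result.subtitles)     # adds an English translation to each subtitle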

# Export translated SRT
print(result.to_srt(translated=True))

Installation

Install only the components you need:

pip install eist                    # core types + subtitle formatter (no heavy deps)
pip install "eist[transcribe]"      # + Whisper (faster-whisper, torch)
pip install "eist[align]"           # + forced alignment (sk-align)
pip install "eist[punctuate]"       # + punctuation model (onnxruntime)
pip install "eist[all]"             # everything
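
Because the heavy dependencies are optional, it can be useful to check which extras are usable in the current environment before loading a component. The snippet below is a minimal sketch, not part of eist: the helper name `available_extras` is hypothetical, and the mapping from extras to import names (`faster_whisper`, `torch`, `sk_align`, `onnxruntime`) is an assumption based on the packages named above.

```python
from importlib.util import find_spec

# Assumed import names for the packages behind each extra (not taken from eist itself).
EXTRA_MODULES = {
    "transcribe": ["faster_whisper", "torch"],
    "align": ["sk_align"],
    "punctuate": ["onnxruntime"],
}


def available_extras(extras: dict[str, list[str]] = EXTRA_MODULES) -> dict[str, bool]:
    """Report, per extra, whether every module it needs can be imported."""
    return {
        name: all(find_spec(module) is not None for module in modules)
        for name, modules in extras.items()
    }


if __name__ == "__main__":
    for name, ok in available_extras().items():
        print(f"{name}: {'installed' if ok else 'missing'}")
```

`find_spec` only looks the module up without importing it, so the check stays fast even when torch is installed.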

Command line

After installing, the eist command is available:

# Transcribe to SRT (default)
eist transcribe audio.wav -o output.srt

# WebVTT format
eist transcribe audio.wav -f vtt -o output.vtt

# Plain text transcript
eist transcribe audio.wav -f txt

# JSON output (includes word-level timestamps)
eist transcribe audio.wav -f json -o output.json

# Skip alignment or punctuation for faster processing
eist transcribe audio.wav --no-align --no-punctuate

# Translate to English (requires OpenAI API key)
eist transcribe audio.wav --translate --api-key sk-...

# Use a specific device / compute type
eist transcribe audio.wav --device cpu --compute-type float32

# Suppress progress messages
eist transcribe audio.wav -q -o output.srt

Run eist transcribe --help for all options. You can also use python -m eist.
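
For batch jobs, the CLI can be driven from a short script. The following is a sketch under stated assumptions: the helper names `build_command` and `transcribe_folder` are hypothetical, and it uses only the flags shown above (`-f`, `-o`, `-q`). It expects the eist command to be on PATH.

```python
import subprocess
from pathlib import Path


def build_command(audio: Path, fmt: str = "srt") -> list[str]:
    """Build the argument list for one `eist transcribe` call."""
    output = audio.with_suffix(f".{fmt}")
    cmd = ["eist", "transcribe", str(audio)]
    if fmt != "srt":  # SRT is the default format, so -f is only needed otherwise
        cmd += ["-f", fmt]
    cmd += ["-o", str(output), "-q"]
    return cmd


def transcribe_folder(folder: Path, fmt: str = "srt") -> None:
    """Run the CLI once per .wav file in `folder`, stopping on the first failure."""
    for audio in sorted(folder.glob("*.wav")):
        subprocess.run(build_command(audio, fmt), check=True)
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.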

Components

| Class | Purpose | Extra |
| --- | --- | --- |
| Transcriber | Whisper-based Scottish Gaelic ASR | [transcribe] |
| Aligner | Forced alignment → word timestamps (via sk-align) | [align] |
| PunctuationModel | Restore .!?,;: and capitalisation | [punctuate] |
| SubtitleFormatter | Regroup words into readable subtitle segments | (core) |
| Translator | LLM-based Gaelic → English translation | (core) |
| GaelicASRPipeline | End-to-end: audio → subtitles | [all] |
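
To give a feel for what regrouping words into subtitle segments involves, here is a minimal sketch. It is not SubtitleFormatter's actual algorithm: the `regroup` function and its splitting rule (break at sentence-final punctuation once `min_words` is reached, or unconditionally at `max_words`) are assumptions chosen for illustration, reusing the `max_words` and `min_words` parameter names from the usage example.

```python
def regroup(words: list[str], max_words: int = 10, min_words: int = 3) -> list[str]:
    """Group punctuated words into segments of at most `max_words` words.

    A segment is closed early at sentence-final punctuation, provided it
    already holds at least `min_words` words.
    """
    segments: list[str] = []
    current: list[str] = []
    for word in words:
        current.append(word)
        ends_sentence = word.endswith((".", "!", "?"))
        if len(current) >= max_words or (ends_sentence and len(current) >= min_words):
            segments.append(" ".join(current))
            current = []
    if current:  # keep any trailing words as a final segment
        segments.append(" ".join(current))
    return segments


words = "Halò a h-uile duine. Ciamar a tha sibh an-diugh?".split()
for segment in regroup(words):
    print(segment)
```

A real formatter would also use the word timestamps (for example, to break at long pauses), which this sketch ignores.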

Data types

All components use typed dataclasses:

  • Word: text, start, end
  • Subtitle: text, start, end, words, paragraph_break, translation
  • TranscriptionResult: subtitles, language, duration, timing
    • .to_srt(), .to_vtt() for export
    • .text for plain-text transcript
    • Iterable: for sub in result: ...

Models

All models are hosted on the eist-edinburgh Hugging Face organisation.

ASR (Speech Recognition)

| Model | Description | WER* | RTF (CPU) |
| --- | --- | --- | --- |
| whisper-large-v3-turbo-gaelic-ct2-v2 | CTranslate2 v2 | 11.6% | 0.83× |
| whisper-large-v3-turbo-gaelic-ct2-v3 | CTranslate2 v3 (default) | 12.8% | 1.05× |
| whisper-large-v3-turbo-gaelic | Original Safetensors (0.8 B params) | | |

*WER measured on 30 s of a BBC Radio nan Gàidheal broadcast, compared against a word-count-matched manual transcript. RTF = real-time factor (lower is faster); measured on CPU (Intel® Core™ i7, single-thread CTranslate2).
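
For readers unfamiliar with the two metrics, the sketch below shows the standard definitions: WER as word-level edit distance divided by the number of reference words, and RTF as processing time divided by audio duration. The function names are hypothetical, and any text normalisation used in the actual benchmark (casing, punctuation stripping) is not reproduced here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript is empty")
    # prev[j] holds the edit distance between the processed ref words and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Processing time divided by audio duration; below 1.0 is faster than real time."""
    return processing_seconds / audio_seconds


# One dropped word out of four reference words
print(word_error_rate("tha mi gu math", "tha mi math"))  # 0.25
```

Under this definition, an RTF of 0.83× means a 30 s clip takes about 25 s to transcribe.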

Forced Alignment

| Model | RTF (CPU, 60 s) | Words aligned |
| --- | --- | --- |
| nnet3_alignment_model | 0.07× | 187 / 187 |

Alignment uses sk-align with a Kaldi nnet3 chain model.

Punctuation Restoration

| Model | Punct F1 | Comma F1 | Period F1 | Question F1 | Case F1 |
| --- | --- | --- | --- | --- | --- |
| gaelic-punctuation-model-v2 | 0.63 | 0.61 | 0.67 | 0.57 | 0.83 |

Punctuation F1 scores averaged over two ~30 min BBC Radio nan Gàidheal broadcasts (~9 300 words total).
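
A per-mark F1 score of this kind can be computed by comparing, word by word, the punctuation mark predicted after each word with the mark in the reference. The sketch below assumes that labelling scheme (one label per word, empty string for "no punctuation"). The exact scoring script behind the numbers above is not described here, so treat this as an illustration of the metric rather than a reproduction of the benchmark.

```python
def f1_for_label(reference: list[str], predicted: list[str], label: str) -> float:
    """F1 score for one punctuation label over aligned per-word label lists."""
    if len(reference) != len(predicted):
        raise ValueError("label sequences must have the same length")
    pairs = list(zip(reference, predicted))
    true_pos = sum(ref == label and pred == label for ref, pred in pairs)
    false_pos = sum(ref != label and pred == label for ref, pred in pairs)
    false_neg = sum(ref == label and pred != label for ref, pred in pairs)
    if true_pos == 0:
        return 0.0  # avoids dividing by zero when nothing was correctly predicted
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


# Label after each of six words: "" means no punctuation follows the word
reference = ["", ",", "", ".", "", "?"]
predicted = ["", ",", ",", ".", "", ""]
for mark in [",", ".", "?"]:
    print(mark, round(f1_for_label(reference, predicted, mark), 2))
```

In this toy example the extra comma lowers comma precision to 0.5, and the missed question mark gives a question F1 of 0.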

Benchmark details

All benchmarks run on an Intel® Core™ i7 CPU with float32 precision, no GPU. Audio from BBC Radio nan Gàidheal broadcasts, ~30 min each. ASR evaluation capped to 30 s due to CPU speed; alignment capped to 60 s. Full results in benchmarks/.

License

MIT
