Scottish Gaelic speech-to-subtitles pipeline — transcribe, align, punctuate, translate
eist
eist (Scottish Gaelic for "listen") is a Python library that packages all features of the ÈIST demo into a pip-installable local package following PyTorch / Hugging Face conventions.
Quick start
pip install eist[all]
End-to-end pipeline
from eist import GaelicASRPipeline
pipe = GaelicASRPipeline.from_pretrained()
result = pipe("audio.wav")
for sub in result:
    print(f"[{sub.start:.1f}–{sub.end:.1f}] {sub.text}")
# Export as SRT (write UTF-8 explicitly so Gaelic accents survive on every platform)
with open("output.srt", "w", encoding="utf-8") as f:
    f.write(result.to_srt())
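For readers unfamiliar with the SRT format that the export produces, the sketch below shows how cues with start and end times in seconds map onto SRT blocks. It is an illustration only, not the library's implementation: `Cue`, `srt_timestamp` and `cues_to_srt` are hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    """Hypothetical stand-in for a subtitle: text plus start/end in seconds."""
    text: str
    start: float
    end: float


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as the SRT timestamp HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def cues_to_srt(cues) -> str:
    """Render cues as numbered SRT blocks separated by blank lines."""
    blocks = []
    for index, cue in enumerate(cues, start=1):
        timing = f"{srt_timestamp(cue.start)} --> {srt_timestamp(cue.end)}"
        blocks.append(f"{index}\n{timing}\n{cue.text}\n")
    return "\n".join(blocks)


if __name__ == "__main__":
    print(cues_to_srt([Cue("Halò.", 0.0, 1.5), Cue("Ciamar a tha thu?", 1.5, 3.2)]))
```

SRT uses a comma before the milliseconds, whereas WebVTT uses a full stop, which is the main difference between the two timestamp formats.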
Component-level usage
from eist import Transcriber, PunctuationModel, SubtitleFormatter
from sk_align import Aligner
# Load individual components
transcriber = Transcriber.from_pretrained()
aligner = Aligner.from_pretrained()
punct = PunctuationModel.from_pretrained()
formatter = SubtitleFormatter(max_words=10, min_words=3)
# Run step-by-step
segments = transcriber.transcribe("audio.wav")
# ... align, punctuate, format as needed
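To give a feel for what the `max_words` and `min_words` settings on the formatter might control, here is a minimal regrouping sketch. The rule used (close a segment at sentence-ending punctuation once it has `min_words` words, or unconditionally at `max_words`) is an assumption made for illustration; the actual `SubtitleFormatter` logic may differ, and `regroup` is a hypothetical helper.

```python
def regroup(words, max_words=10, min_words=3):
    """Greedily group words into subtitle-sized segments.

    Assumed rule, not the library's: a segment closes when it reaches
    max_words, or when a word ends a sentence and the segment already
    holds at least min_words words.
    """
    segments, current = [], []
    for word in words:
        current.append(word)
        ends_sentence = word.endswith((".", "!", "?"))
        if len(current) >= max_words or (ends_sentence and len(current) >= min_words):
            segments.append(" ".join(current))
            current = []
    if current:
        if segments and len(current) < min_words:
            # Fold a too-short tail into the previous segment
            # (this may push that segment slightly past max_words).
            segments[-1] += " " + " ".join(current)
        else:
            segments.append(" ".join(current))
    return segments


if __name__ == "__main__":
    text = "Tha mi gu math. Ciamar a tha thu fhèin an-diugh?"
    for segment in regroup(text.split(), max_words=5, min_words=2):
        print(segment)
```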
Translation
from eist import Translator
translator = Translator(api_key="sk-...") # or set OPENAI_KEY env var
translator.translate(result.subtitles)
# Export translated SRT
print(result.to_srt(translated=True))
Installation
Install only the components you need:
pip install eist # core types + subtitle formatter (no heavy deps)
pip install eist[transcribe] # + Whisper (faster-whisper, torch)
pip install eist[align] # + forced alignment (sk-align)
pip install eist[punctuate] # + punctuation model (onnxruntime)
pip install eist[all] # everything
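Because the extras are optional, it can help to check which ones are usable in the current environment before building a pipeline. The sketch below uses only the standard library. The mapping from extras to importable module names is inferred from the comments above (`faster_whisper`, `torch`, `sk_align`, `onnxruntime`) and should be treated as an assumption; `available_extras` is a hypothetical helper, not part of eist.

```python
from importlib.util import find_spec

# Assumed mapping from each extra to the top-level modules it pulls in.
EXTRA_MODULES = {
    "transcribe": ["faster_whisper", "torch"],
    "align": ["sk_align"],
    "punctuate": ["onnxruntime"],
}


def available_extras(extra_modules=EXTRA_MODULES):
    """Return {extra: True/False} depending on whether all its modules are importable."""
    return {
        extra: all(find_spec(module) is not None for module in modules)
        for extra, modules in extra_modules.items()
    }


if __name__ == "__main__":
    for extra, ok in available_extras().items():
        print(f"eist[{extra}]: {'installed' if ok else 'missing'}")
```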
Command line
After installing, the eist command is available:
# Transcribe to SRT (default)
eist transcribe audio.wav -o output.srt
# WebVTT format
eist transcribe audio.wav -f vtt -o output.vtt
# Plain text transcript
eist transcribe audio.wav -f txt
# JSON output (includes word-level timestamps)
eist transcribe audio.wav -f json -o output.json
# Skip alignment or punctuation for faster processing
eist transcribe audio.wav --no-align --no-punctuate
# Translate to English (requires OpenAI API key)
eist transcribe audio.wav --translate --api-key sk-...
# Use a specific device / compute type
eist transcribe audio.wav --device cpu --compute-type float32
# Suppress progress messages
eist transcribe audio.wav -q -o output.srt
Run eist transcribe --help for all options. You can also use python -m eist.
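For batch jobs, the command line can be driven from a short Python script. The sketch below uses only the flags shown above (`-f`, `-o`, `-q`) and assumes the `eist` executable is on the PATH; `build_command` and `transcribe_folder` are hypothetical helpers written for this example.

```python
import subprocess
from pathlib import Path


def build_command(audio, fmt="srt", quiet=True):
    """Build the argument list for one `eist transcribe` call."""
    audio = Path(audio)
    cmd = ["eist", "transcribe", str(audio)]
    if fmt != "srt":
        # SRT is the default output, so -f is only passed for other formats.
        cmd += ["-f", fmt]
    cmd += ["-o", str(audio.with_suffix(f".{fmt}"))]
    if quiet:
        cmd.append("-q")
    return cmd


def transcribe_folder(folder, fmt="srt"):
    """Transcribe every .wav file in a folder, stopping at the first failure."""
    for audio in sorted(Path(folder).glob("*.wav")):
        subprocess.run(build_command(audio, fmt), check=True)


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch works without eist installed.
    print(" ".join(build_command("audio.wav", fmt="vtt")))
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with file names that contain spaces.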
Components
| Class | Purpose | Extra |
|---|---|---|
| Transcriber | Whisper-based Scottish Gaelic ASR | [transcribe] |
| Aligner | Forced alignment → word timestamps (via sk-align) | [align] |
| PunctuationModel | Restore .!?,;: and capitalisation | [punctuate] |
| SubtitleFormatter | Regroup words into readable subtitle segments | (core) |
| Translator | LLM-based Gaelic → English translation | (core) |
| GaelicASRPipeline | End-to-end: audio → subtitles | [all] |
Data types
All components use typed dataclasses:
- Word: text, start, end
- Subtitle: text, start, end, words, paragraph_break, translation
- TranscriptionResult: subtitles, language, duration, timing
  - .to_srt(), .to_vtt() for export
  - .text for plain-text transcript
  - Iterable: for sub in result: ...
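The shape of these types can be pictured with the following sketch. The field names come from the list above, but the field types, the default values and the `dict` used for `timing` are assumptions; these classes only mirror the documented interface and are not the library's own definitions.

```python
from dataclasses import dataclass, field
from typing import Iterator, Optional


@dataclass
class Word:
    text: str
    start: float  # seconds (assumed unit)
    end: float


@dataclass
class Subtitle:
    text: str
    start: float
    end: float
    words: list[Word] = field(default_factory=list)
    paragraph_break: bool = False
    translation: Optional[str] = None  # assumed to stay None until translated


@dataclass
class TranscriptionResult:
    subtitles: list[Subtitle]
    language: str = "gd"  # assumed default: ISO 639-1 code for Scottish Gaelic
    duration: float = 0.0
    timing: dict[str, float] = field(default_factory=dict)  # assumed shape

    def __iter__(self) -> Iterator[Subtitle]:
        # Makes `for sub in result: ...` work.
        return iter(self.subtitles)

    @property
    def text(self) -> str:
        """Plain-text transcript: subtitle texts joined by spaces."""
        return " ".join(sub.text for sub in self.subtitles)


if __name__ == "__main__":
    result = TranscriptionResult(
        [Subtitle("Halò.", 0.0, 1.0), Subtitle("Ciamar a tha thu?", 1.0, 2.5)]
    )
    for sub in result:
        print(sub.start, sub.end, sub.text)
    print(result.text)
```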
Models
All models are hosted on the eist-edinburgh Hugging Face organisation.
ASR (Speech Recognition)
| Model | Description | WER* | RTF (CPU) |
|---|---|---|---|
| whisper-large-v3-turbo-gaelic-ct2-v2 | CTranslate2 v2 | 11.6% | 0.83× |
| whisper-large-v3-turbo-gaelic-ct2-v3 | CTranslate2 v3 (default) | 12.8% | 1.05× |
| whisper-large-v3-turbo-gaelic | Original Safetensors (0.8 B params) | — | — |
*WER measured on 30 s of a BBC Radio nan Gàidheal broadcast, compared against a word-count-matched manual transcript. RTF = real-time factor (lower is faster); measured on CPU (Intel® Core™ i7, single-thread CTranslate2).
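As a reminder of what the two metrics in the table mean, the sketch below computes word error rate with the standard word-level edit distance and the real-time factor as processing time divided by audio duration. It is a textbook formulation, not the project's benchmark script, which may normalise text differently before scoring.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference transcript is empty")
    # Levenshtein distance over words, keeping one row of the table at a time.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,                          # deletion
                curr[j - 1] + 1,                      # insertion
                prev[j - 1] + (ref_word != hyp_word)  # substitution or match
            ))
        prev = curr
    return prev[-1] / len(ref)


def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """RTF below 1 means the audio is processed faster than real time."""
    return processing_seconds / audio_seconds


if __name__ == "__main__":
    print(word_error_rate("tha mi gu math", "tha mi math"))
    print(real_time_factor(24.9, 30.0))
```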
Forced Alignment
| Model | RTF (CPU, 60 s) | Words aligned |
|---|---|---|
| nnet3_alignment_model | 0.07× | 187 / 187 |
Alignment uses sk-align with a Kaldi nnet3 chain model.
Punctuation Restoration
| Model | Punct F1 | Comma F1 | Period F1 | Question F1 | Case F1 |
|---|---|---|---|---|---|
| gaelic-punctuation-model-v2 | 0.63 | 0.61 | 0.67 | 0.57 | 0.83 |
Punctuation F1 scores averaged over two ~30 min BBC Radio nan Gàidheal broadcasts (~9 300 words total).
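Per-mark F1 scores like those in the table can be computed from two aligned label sequences, one label per word giving the punctuation mark that follows it. The sketch below shows the standard precision/recall/F1 calculation for a single mark. How the reported figures were actually aligned and averaged is not stated here, so this is an illustration; `f1_for_label` is a hypothetical helper.

```python
def f1_for_label(reference, predicted, label):
    """F1 for one punctuation label over aligned per-word label sequences."""
    if len(reference) != len(predicted):
        raise ValueError("sequences must be aligned word for word")
    pairs = list(zip(reference, predicted))
    tp = sum(r == label and p == label for r, p in pairs)  # correctly predicted
    fp = sum(r != label and p == label for r, p in pairs)  # predicted but wrong
    fn = sum(r == label and p != label for r, p in pairs)  # missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # "" means no punctuation after the word.
    reference = ["", ",", ".", "", "?"]
    predicted = ["", ",", "", ",", "?"]
    for mark in [",", ".", "?"]:
        print(repr(mark), round(f1_for_label(reference, predicted, mark), 2))
```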
Benchmark details
All benchmarks run on an Intel® Core™ i7 CPU with float32 precision, no GPU.
Audio from BBC Radio nan Gàidheal broadcasts, ~30 min each.
ASR evaluation capped to 30 s due to CPU speed; alignment capped to 60 s.
Full results in benchmarks/.
License
MIT