scribed
One façade over many speech-to-text (ASR) engines — plus a ledger to help you choose between them.
Transcription ("turn this audio into text") is solved a dozen ways: local engines
(Whisper, faster-whisper, whisper.cpp, Vosk), fast cheap cloud APIs (Groq, OpenAI),
and feature-rich premium services (Deepgram, AssemblyAI, Google, ElevenLabs) —
each with its own install, API, pricing, latency, language coverage, and
diarization quirks. scribed gives you a uniform call, a browsable catalog of
every option, and the tools to wrap any of them.
import scribed
text = scribed.transcribe_text("talk.mp3") # just the text, default backend
t = scribed.transcribe("talk.mp3") # full result: text + timed segments
print(t) # -> the transcript
print(t.srt) # -> SRT subtitles
for seg in t:
    print(seg.start, seg.speaker, seg.text)  # iterate timed (optionally diarized) segments
The same call, the same Transcript back, no matter which engine ran.
Install
import scribed is dependency-free. Install only the backends you use, via extras:
pip install "scribed[faster-whisper]" # local, free — recommended default
pip install "scribed[whispercpp]" # local, free, light (great on Apple Silicon)
pip install "scribed[vosk]" # local, free, streaming/offline
pip install "scribed[openai]" # cloud API (whisper-1 / gpt-4o-transcribe)
pip install "scribed[groq]" # cloud API — fastest & cheapest hosted Whisper
pip install "scribed[deepgram]" # cloud API — real-time + diarization
pip install "scribed[cli]" # the `scribed` command
Backends that ship today
| Backend | backend= id | Local / Remote | Cost | Diarize | Stream | Notable |
|---|---|---|---|---|---|---|
| faster-whisper | faster-whisper | local | free | – | – | Recommended local default: Whisper via CTranslate2, no system ffmpeg |
| OpenAI Whisper | whisper | local | free | – | – | The reference PyTorch Whisper (needs system ffmpeg) |
| whisper.cpp | whispercpp | local | free | – | ~ | Pure C/C++: light, excellent on Apple Silicon / edge |
| Vosk | vosk | local | free | ~ | ✓ | Fully offline, streaming, tiny models (Raspberry Pi / mobile) |
| OpenAI | openai | remote | paid | – | ✓ | Simple & ubiquitous; whisper-1 / gpt-4o-transcribe |
| Groq | groq | remote | free tier | – | – | Fastest & cheapest hosted Whisper (OpenAI-compatible) |
| Deepgram | deepgram | remote | free tier | ✓ | ✓ | Nova-3: real-time + diarization, cheap, feature-rich |
| AssemblyAI | assemblyai | remote | free tier | ✓ | ✓ | Audio intelligence (diarization, sentiment, topics, summary) |
| Google STT | google-speech | remote | free tier | ✓ | ✓ | Widest language coverage (125+); Chirp models |
| ElevenLabs | elevenlabs | remote | free tier | ✓ | ~ | Scribe: top accuracy, diarization, audio-event tags |
…plus more engines catalogued in the ledger (NVIDIA Parakeet/Canary, WhisperX, wav2vec2, Moonshine, sherpa-onnx, AWS Transcribe, Azure Speech, Speechmatics, Gladia, Rev, Fireworks, Cloudflare …) that you can turn into a working façade with one command (see Add a backend below).
Getting a backend running
Some backends need more than pip install (Whisper's system ffmpeg, GPU wheels,
first-run model weights, or an API key). scribed turns that into structured,
OS-aware guidance — handy for humans and AI agents alike:
scribed.doctor() # what's usable now vs what each missing one needs
scribed.check("faster-whisper") # -> True/False (usable right now?)
print(scribed.requirements("whisper").instructions()) # exact plan: system deps + pip + weights
scribed.install("faster-whisper", yes=True) # plan, or actually run the pip install
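To make the idea of a readiness check concrete, here is a minimal sketch of how "is this backend usable right now?" could be answered with the standard library alone. It is not scribed's implementation: the `Requirement` class, its fields, and the `missing` helper are hypothetical names invented for this illustration.

```python
import importlib.util
import os
import shutil
from dataclasses import dataclass, field
from typing import List


@dataclass
class Requirement:
    """Hypothetical description of what one backend needs (not scribed's schema)."""
    modules: List[str] = field(default_factory=list)      # top-level Python modules
    executables: List[str] = field(default_factory=list)  # programs expected on PATH
    env_vars: List[str] = field(default_factory=list)     # e.g. API keys


def missing(req: Requirement) -> List[str]:
    """Return human-readable reasons the backend is not usable yet."""
    problems = []
    for mod in req.modules:
        # find_spec locates a top-level module without importing it.
        if importlib.util.find_spec(mod) is None:
            problems.append(f"python module not installed: {mod}")
    for exe in req.executables:
        if shutil.which(exe) is None:
            problems.append(f"executable not on PATH: {exe}")
    for var in req.env_vars:
        if not os.environ.get(var):
            problems.append(f"environment variable not set: {var}")
    return problems


def check(req: Requirement) -> bool:
    """True when nothing is missing."""
    return not missing(req)


if __name__ == "__main__":
    # EXAMPLE_API_KEY is a made-up variable name used only for this demo.
    demo = Requirement(modules=["json"], env_vars=["EXAMPLE_API_KEY"])
    print(check(demo), missing(demo))
```

Returning the list of problems rather than a bare boolean is what lets a tool turn a failed check into an install plan.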
The ledger — choose with eyes open
The catalog describes every engine we researched, not only the ones with a
working façade. It lives in data (scribed/data/backends.json), not code, so you
can read, filter, diff, and extend it:
scribed.catalog # the whole ledger
scribed.find(is_local=True, open_source=True) # only local OSS engines
scribed.find(diarization="yes", is_remote=True) # speaker-labelling cloud APIs
scribed.find(implemented=True) # only what scribed can run today
scribed.catalog.supports_language("French") # engines that list French
scribed.catalog.to_dataframe() # browse as a pandas table
implemented is computed live from the registry, so the ledger can never lie
about what scribed can actually run.
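The combination of a data-only ledger and a live `implemented` flag can be sketched in a few lines. The entries, the schema, and the `load_catalog` helper below are assumptions made for illustration; only the field names `is_local`, `is_remote`, `open_source`, `diarization`, and `implemented` come from the calls above, and the real backends.json may be organised differently.

```python
import json

# Illustrative entries only; the real ledger's schema and values may differ.
LEDGER_JSON = """
[
  {"id": "faster-whisper", "is_local": true,  "open_source": true,  "diarization": "no"},
  {"id": "deepgram",       "is_local": false, "open_source": false, "diarization": "yes"},
  {"id": "speechmatics",   "is_local": false, "open_source": false, "diarization": "yes"}
]
"""

# Stand-in for the registry of backends that actually have an adapter.
REGISTRY = {"faster-whisper", "deepgram"}


def load_catalog(text, registry):
    """Parse the ledger and attach fields that are derived, never stored."""
    entries = json.loads(text)
    for entry in entries:
        entry["is_remote"] = not entry["is_local"]
        # Computed from the registry at load time, so it cannot drift from the code.
        entry["implemented"] = entry["id"] in registry
    return entries


def find(catalog, **criteria):
    """Keep entries whose fields equal every keyword criterion."""
    return [e for e in catalog
            if all(e.get(key) == value for key, value in criteria.items())]


if __name__ == "__main__":
    catalog = load_catalog(LEDGER_JSON, REGISTRY)
    print([e["id"] for e in find(catalog, diarization="yes", is_remote=True)])
    print([e["id"] for e in find(catalog, implemented=True)])
```

Because `implemented` is never written into the JSON, adding or removing an adapter changes the answer without anyone editing the ledger.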
The result model
Every backend returns the same Transcript — progressive disclosure from "just
the text" to full structure:
t = scribed.transcribe("interview.wav", backend="deepgram", diarize=True)
str(t) # the full transcript text
t.text # same string
t.language # detected language
t.duration # audio duration (seconds)
for seg in t: # Segment: .text .start .end .speaker .confidence .words
    ...
t.words # flattened word-level units (when the engine reports them)
t.speakers # ['speaker_0', 'speaker_1', ...] when diarized
t.srt # SRT subtitles
t.vtt # WebVTT subtitles
t.raw # the untouched backend response
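For readers curious how one result type can serve both "just the text" and subtitle export, the following is a simplified sketch. It mirrors the attribute names listed above (`text`, `speakers`, `srt`, and segments with `start`, `end`, `speaker`), but the classes are written for this example and are not scribed's own; the `srt_timestamp` helper is a hypothetical name.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Segment:
    text: str
    start: float                    # seconds from the beginning of the audio
    end: float
    speaker: Optional[str] = None   # filled in only when diarization ran


def srt_timestamp(seconds: float) -> str:
    """Format seconds as the HH:MM:SS,mmm timestamp used by SRT."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


@dataclass
class Transcript:
    segments: List[Segment]

    def __iter__(self):
        return iter(self.segments)

    def __str__(self):
        return " ".join(seg.text for seg in self.segments)

    @property
    def text(self) -> str:
        return str(self)

    @property
    def speakers(self) -> List[str]:
        """Distinct speaker labels in order of first appearance."""
        seen = []
        for seg in self.segments:
            if seg.speaker is not None and seg.speaker not in seen:
                seen.append(seg.speaker)
        return seen

    @property
    def srt(self) -> str:
        """Numbered cues separated by blank lines, one per segment."""
        cues = [
            f"{i}\n{srt_timestamp(seg.start)} --> {srt_timestamp(seg.end)}\n{seg.text}"
            for i, seg in enumerate(self.segments, start=1)
        ]
        return "\n\n".join(cues) + "\n"


if __name__ == "__main__":
    t = Transcript([Segment("Hello.", 0.0, 1.2, "speaker_0"),
                    Segment("Hi there.", 1.2, 2.5, "speaker_1")])
    print(t)
    print(t.srt)
```

Deriving every view from the same list of segments is what keeps the text, the speaker list, and the subtitles consistent with one another.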
Three tiers of access
scribed.transcribe(audio) # 1. facade, default backend
scribed.services.deepgram.transcribe(audio, diarize=True) # 2. pick a backend explicitly
scribed.services.deepgram.adapter # 3. the raw engine adapter
CLI
pip install "scribed[cli]"
scribed transcribe talk.mp3 --backend faster-whisper --output srt
scribed backends --capability diarize
scribed find --local --free --diarization
scribed status # readiness table: all ⊇ implemented ⊇ set-up ⊇ tested
scribed doctor # what's usable now, how to install the rest
scribed requirements whisper # exact install plan
Add a backend
The catalog is large; scribed ships a façade for a curated subset and gives you the machinery (and a SKILL) to wrap any other in minutes:
from scribed.make_backend import scaffold_backend, validate_adapter
scaffold_backend("speechmatics") # generate scribed/backends/speechmatics/ from the ledger entry
# ...fill in param_map (config.py) and implement adapter.py's _transcribe...
validate_adapter("speechmatics") # smoke-test it end to end
A backend is just a subpackage with a config.py (BACKEND_CONFIG) and an
adapter.py (Adapter(BaseTranscriberAdapter) implementing _transcribe). The
registry discovers it automatically; engine SDKs are imported lazily so
import scribed stays dependency-free.
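The adapter shape and the lazy-import idea described above can be illustrated with a self-contained sketch. The base class here is a hypothetical stand-in that borrows the name `BaseTranscriberAdapter`; the `sdk_module` attribute, the `sdk` property, and the `register` decorator are assumptions for this example. In particular, explicit registration through a decorator is swapped in for scribed's automatic discovery, whose mechanism is not shown in this README.

```python
import importlib
from abc import ABC, abstractmethod


class BaseTranscriberAdapter(ABC):
    """Hypothetical stand-in for the real base class; only the shape matters."""

    sdk_module = None  # name of the engine SDK, imported on first use

    def __init__(self):
        self._sdk = None  # nothing is imported when the adapter is created

    @property
    def sdk(self):
        # Import the engine SDK the first time it is needed, then cache it.
        if self._sdk is None:
            self._sdk = importlib.import_module(self.sdk_module)
        return self._sdk

    def transcribe(self, audio, **options):
        return self._transcribe(audio, **options)

    @abstractmethod
    def _transcribe(self, audio, **options):
        """Engine-specific work goes here."""


REGISTRY = {}


def register(name):
    """Explicit registration, used here in place of automatic discovery."""
    def decorator(cls):
        REGISTRY[name] = cls
        return cls
    return decorator


@register("echo")
class Adapter(BaseTranscriberAdapter):
    sdk_module = "json"  # a stdlib module standing in for a real engine SDK

    def _transcribe(self, audio, **options):
        # A real adapter would call the engine; this one only reports its input.
        return self.sdk.dumps({"bytes": len(audio), "options": options}, sort_keys=True)


if __name__ == "__main__":
    adapter = REGISTRY["echo"]()
    print(adapter.transcribe(b"abc", language="en"))
```

Deferring the import to first use means a missing SDK only fails for the backend that needs it, not for every user of the package.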
Design notes
- Dependency-free import. The base package declares no dependencies; every engine SDK is an optional extra, imported lazily inside its adapter.
- Data-driven ledger. Engine metadata is curated research in JSON, separate from code.
- Normalized everything. One input type (path / URL / bytes / file / numpy waveform), one result type (Transcript), one vocabulary of options translated per-engine via each backend's param_map.
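The last point, a single vocabulary of options translated per engine, might look roughly like the sketch below. The engine names `engine_a` and `engine_b`, their parameter names, and the `translate_options` helper are invented for illustration; the real param_map entries live in each backend's config.py and may have a different structure.

```python
# Hypothetical per-engine maps from the shared option names to each
# engine's own parameter names. Neither engine nor its parameters are real.
PARAM_MAPS = {
    "engine_a": {"language": "lang", "diarize": "speaker_labels"},
    "engine_b": {"language": "language_code"},  # no diarization support
}


def translate_options(options, param_map):
    """Rename shared options for one engine.

    Returns (translated, unsupported): the keyword arguments to pass to the
    engine, and the shared option names that engine has no equivalent for.
    """
    translated = {}
    unsupported = []
    for name, value in options.items():
        if name in param_map:
            translated[param_map[name]] = value
        else:
            unsupported.append(name)
    return translated, unsupported


if __name__ == "__main__":
    request = {"language": "fr", "diarize": True}
    for engine, param_map in PARAM_MAPS.items():
        print(engine, translate_options(request, param_map))
```

Reporting unsupported options separately, instead of silently dropping them, leaves the caller free to warn, raise, or ignore.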
scribed is the speech-to-text sibling of ocracy
(the same pattern for OCR).
License
MIT