scribeflow

Portable, resumable, multi-backend Whisper transcription — runs anywhere, resumes after crashes.

Maintainers

htahaozlu

Project description

ScribeFlow — Whisper transcription everywhere

Local CPU/GPU, Apple Silicon, or Google Colab. Input from a file, folder, Google Drive, or URL. ScribeFlow auto-selects the right model for the hardware it finds, writes durable checkpoints as it goes, and resumes cleanly after a crash — no duplicated or corrupted output.

Linux macOS Google Colab

Demo

ScribeFlow transcribing a lecture and resuming after a crash

Install

Base install is intentionally small — the pure-Python engine plus the default faster-whisper backend, which runs on CPU out of the box:

pip install scribeflow

ffmpeg is the one required system dependency:

# macOS (Homebrew)
brew install ffmpeg

# Debian / Ubuntu
sudo apt-get install -y ffmpeg

# Fedora
sudo dnf install -y ffmpeg

# Windows (winget)
winget install Gyan.FFmpeg

Optional extras layer in heavier backends, the web UI, and remote sources:

pip install 'scribeflow[gpu]'      # torch + CUDA (documented, not hard-pinned)
pip install 'scribeflow[cpp]'      # whisper.cpp via pywhispercpp — Apple-Silicon Metal path
pip install 'scribeflow[openai]'   # openai-whisper (PyTorch reference baseline)
pip install 'scribeflow[web]'      # FastAPI web UI: scribeflow web
pip install 'scribeflow[url]'      # yt-dlp — transcribe straight from a URL
pip install 'scribeflow[drive]'    # Google Drive API source
pip install 'scribeflow[dev]'      # pytest + ruff + mypy + nbformat

From a clone (editable, with the dev toolchain):

git clone https://github.com/htahaozlu/scribeflow
cd scribeflow
pip install -e '.[dev]'

Run without installing (the `npx` equivalent)

ScribeFlow is a Python CLI, so there is no npm/npx; the equivalents are pipx and uvx (from uv). With the release on PyPI:

pipx install scribeflow                 # isolated global install
uvx scribeflow transcribe lecture.mp4   # run once, no install (like npx)

You can also install it straight from a clone:

pipx install .            # from the cloned repo

Homebrew (macOS) — planned

After the first PyPI release, a tap will provide a one-liner:

brew install htahaozlu/tap/scribeflow   # planned (post-PyPI)

Note — the [gpu] extra documents the torch + CUDA path but does not hard-pin a CUDA build, so you stay in control of the wheel that matches your driver. See docs/CONFIG.md for the recommended install line.

Quickstart

pip install scribeflow                       # base install (CPU-capable)

scribeflow doctor                            # check ffmpeg / device / backends
scribeflow transcribe ./lecture.mp4          # auto-selects backend + model for this host
scribeflow transcribe ./lecture.mp4 --format srt,vtt

That's it. The .txt transcript is always written; --format adds subtitle and JSON outputs. If the run is interrupted, re-run the same command and ScribeFlow picks up from the last completed chunk.

What it does

ScribeFlow is a single tool that gives you the same transcription pipeline everywhere:

- Runs anywhere: local CPU or NVIDIA GPU, Apple Silicon (Metal via whisper.cpp), or Google Colab, adapting to the hardware it detects.
- Input from anywhere: a local file or folder, a URL (yt-dlp), a mounted Google Drive path, or an upload through the web UI.
- Auto-selects the model for the detected hardware, with sensible defaults tuned for quality (global default: large-v3-turbo).
- Crash-safe and resumable: durable per-chunk checkpoints mean an interrupted run resumes from where it stopped, with no duplicated or corrupted output.
- Honest scope: local models only (no cloud APIs in v1); transcripts are best-effort ASR, not verbatim legal records.

Backends & hardware

Every backend normalizes to one output shape, so you can swap them without changing your workflow. ScribeFlow auto-selects based on the host; you can always override with --backend, --model, --device, and --compute-type.

| Backend | Best for | Install | Notes |
| --- | --- | --- | --- |
| `faster-whisper` | CPU and NVIDIA CUDA (the default) | base install | CUDA → `float16` (≥8 GB VRAM) or `int8_float16`; CPU → `int8`. |
| `whispercpp` | Apple Silicon, Metal GPU | `pip install 'scribeflow[cpp]'` | Needs a `whisper-cli` binary + a ggml model (see env vars below). |
| `openai-whisper` | PyTorch reference baseline | `pip install 'scribeflow[openai]'` | The reference implementation; slower, useful for comparison. |

Hardware auto-select rules:

- Apple Silicon → whisper.cpp on Metal when its binary is available, otherwise faster-whisper on CPU with int8. ScribeFlow never offers cuda or mps to faster-whisper on macOS-arm64, because that path does not exist.
- CUDA → float16 for ≥8 GB VRAM, otherwise int8_float16.
- CPU → int8.
- Turkish → tiny, base, and distil models are never auto-selected; the global default is large-v3-turbo.
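The rules above can be restated as a small decision function. This is only a sketch of the documented behavior, not ScribeFlow's actual selection code: the `Pick` type, the `auto_pick` name, its parameters, and the `"metal"` / `"default"` labels for the whisper.cpp path are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Pick:
    """Hypothetical result type: which backend, device, and compute type to use."""
    backend: str
    device: str
    compute_type: str


def auto_pick(*, apple_silicon: bool, whispercpp_available: bool,
              cuda_vram_gb: Optional[float]) -> Pick:
    """Restate the documented auto-select rules (illustrative, not the real code)."""
    if apple_silicon:
        if whispercpp_available:
            # Metal path via whisper.cpp. The README does not document a compute
            # type here, so "default" is a placeholder label.
            return Pick("whispercpp", "metal", "default")
        # faster-whisper has no cuda/mps path on macOS-arm64: fall back to CPU int8.
        return Pick("faster-whisper", "cpu", "int8")
    if cuda_vram_gb is not None:
        # float16 needs at least 8 GB of VRAM; smaller cards get int8_float16.
        compute = "float16" if cuda_vram_gb >= 8 else "int8_float16"
        return Pick("faster-whisper", "cuda", compute)
    return Pick("faster-whisper", "cpu", "int8")
```

For the pick ScribeFlow actually makes on a given host, `scribeflow models` remains the source of truth.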

To enable the Apple-Silicon Metal path, point ScribeFlow at your whisper.cpp binary and ggml models:

export SCRIBEFLOW_WHISPERCPP_BIN=/path/to/whisper-cli
export SCRIBEFLOW_WHISPERCPP_MODELS=/path/to/ggml-models

See your host's pick at any time:

scribeflow models           # lists the catalog + this host's auto-pick
scribeflow doctor           # ffmpeg / device / VRAM / RAM / backends checklist

Usage

scribeflow transcribe <source> [options]
scribeflow models      [--want default|speed|quality] [--json] [--ui-lang en|tr]
scribeflow doctor      [--json] [--ui-lang en|tr]
scribeflow gen-notebook <source> -o nb.ipynb [options]
scribeflow web         [--host 127.0.0.1] [--port 8000]
scribeflow --version

`scribeflow transcribe`

The source is a local file/folder, an http(s):// URL, or a drive: path. The kind is inferred from the argument; override it with --source-kind.
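As a rough illustration of how such inference could work, the sketch below classifies an argument by its shape. The function name `infer_source_kind` is hypothetical and the real rules may differ; the `upload` kind is left out because it originates in the web UI rather than in a CLI argument.

```python
from urllib.parse import urlparse


def infer_source_kind(source: str) -> str:
    """Guess the source kind from the argument's shape (illustrative only)."""
    if source.startswith("drive:"):
        return "drive"
    # urlparse lowercases the scheme, so "HTTPS://..." is handled too.
    if urlparse(source).scheme in ("http", "https"):
        return "url"
    # Anything else is treated as a local file or folder path.
    return "local"
```

An argument that is ambiguous under rules like these is the case `--source-kind` exists for.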

Key flags:

| Flag | Purpose |
| --- | --- |
| `--backend` | `faster-whisper` · `whispercpp` · `openai-whisper` |
| `--model` | Override the auto-selected model (default: `large-v3-turbo`). |
| `--device` / `--compute-type` | Device (`cpu` or `cuda`) and compute type (e.g. `float16`, `int8`, `int8_float16`). |
| `--want default\|speed\|quality` | Bias the auto-pick toward speed or quality. |
| `--language` / `-l` | Audio language (Turkish `tr` by default; `auto` to detect). |
| `--chunk-minutes` | Chunk length for checkpointing (default 20). |
| `--beam-size` | Decoder beam width (Turkish default: 5). |
| `--format` | `txt,srt,vtt,json` (comma-separated; `txt` is always written). |
| `--out` | Durable output dir (transcripts + checkpoints). |
| `--workspace` | Scratch dir for heavy audio-chunk I/O. |
| `--cache-dir` | Model download cache. |
| `--runtime auto\|local\|colab` | Execution target (owns the scratch-vs-durable split). |
| `--source-kind local\|url\|drive\|upload` | Force the source kind instead of inferring it. |
| `--overwrite` | Discard any existing run and start fresh. |
| `--config FILE` | Path to a `scribeflow.toml`. |
| `--json` | Machine-readable JSON output. |
| `--ui-lang` / `--lang en\|tr` | Interface language (separate from `--language`). |

Examples:

# A whole folder, Turkish, with subtitles
scribeflow transcribe ./lectures/ --format srt,vtt

# A URL (needs the [url] extra), auto-detect language, quality bias
scribeflow transcribe "https://example.com/talk.mp4" -l auto --want quality

# Force a backend/model on capable hardware
scribeflow transcribe ./talk.wav --backend faster-whisper --model large-v3 --device cuda --compute-type float16

# Split scratch vs. durable storage explicitly
scribeflow transcribe ./lecture.mp4 --out ./out --workspace /tmp/scribeflow-scratch

Web UI

With the [web] extra installed, launch a small FastAPI app to upload media and transcribe from the browser:

pip install 'scribeflow[web]'
scribeflow web                                   # http://127.0.0.1:8000
scribeflow web --host 0.0.0.0 --port 8080 --out ./out --workspace /tmp/scribeflow-scratch

Colab

scribeflow gen-notebook emits a runnable .ipynb that mounts Drive, pip-installs ScribeFlow, transcribes, and resumes — top-to-bottom, no editing required:

scribeflow gen-notebook ./lecture.mp4 -o scribeflow_colab.ipynb
scribeflow gen-notebook "https://example.com/talk.mp4" -o talk.ipynb        # url extra auto-wired
scribeflow gen-notebook "drive:My Drive/lectures/week1.mp4" -o week1.ipynb   # drive extra auto-wired

Open the notebook in Colab and run the cells in order.

The Errno-107 split. On Colab, a Google Drive FUSE mount can drop mid-write and raise OSError: [Errno 107] Transport endpoint is not connected. ScribeFlow sidesteps this by keeping heavy, churny I/O (audio chunks, temp files) on local /content scratch (the --workspace), and writing only durable transcripts and checkpoints to Drive (the --out). If the mount blips, your committed transcripts are already safe and the run resumes.

Output formats

The .txt transcript is always written. Add more with --format (comma-separated):

| Format | Flag value | Description |
| --- | --- | --- |
| Text | `txt` | Plain transcript (always produced). |
| SubRip | `srt` | Subtitles with timecodes. |
| WebVTT | `vtt` | Web-native subtitles with timecodes. |
| JSON | `json` | Structured segments (text + timing) for downstream tooling. |

Subtitle timecodes are global: each chunk's local times are shifted by chunk_index * chunk_seconds, so timing stays correct across the whole file.

scribeflow transcribe ./lecture.mp4 --format txt,srt,vtt,json
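The global-timecode rule is simple arithmetic, and a worked example may make it concrete. The helper names below are hypothetical; only the `chunk_index * chunk_seconds` shift comes from the description above, and the SRT formatting follows the standard `HH:MM:SS,mmm` layout.

```python
def to_global_seconds(local_seconds: float, chunk_index: int, chunk_seconds: float) -> float:
    """Shift a chunk-local time by the total length of the chunks before it."""
    return chunk_index * chunk_seconds + local_seconds


def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timecode, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


# A segment starting 12.5 s into the third chunk (index 2) of 20-minute chunks.
chunk_seconds = 20 * 60
start = to_global_seconds(12.5, chunk_index=2, chunk_seconds=chunk_seconds)
print(srt_timestamp(start))  # 00:40:12,500
```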

How resume works

Resume isn't a bolt-on — it's how the engine runs.

- Chunk-by-chunk checkpoints. The media is split into chunks; each completed chunk is committed durably to progress.json + chunk_outputs/, using atomic temp-then-replace writes (never a half-written file).
- Just re-run. Kill the process and run the same command again, and ScribeFlow resumes from the last completed chunk. No duplicated work, no corrupted output.
- RunIdentity guard. A resume refuses to silently mix a different backend/model/chunking/options into an existing run; it raises CheckpointIdentityError instead. Want a clean slate with new settings? Pass --overwrite.
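The three ideas above (atomic commits, skip-what-is-done, and an identity check) fit in a short sketch. This is not ScribeFlow's implementation: the JSON layout, the `identity` and `done` keys, the chunk file naming, and the function names are assumptions, and `CheckpointIdentityError` here is a local stand-in for the real exception.

```python
import json
import os
import tempfile
from pathlib import Path


class CheckpointIdentityError(RuntimeError):
    """Local stand-in for the error named above."""


def atomic_write_json(path: Path, payload: dict) -> None:
    """Write to a temp file in the same directory, then swap it into place."""
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=str(path.parent), suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(payload, handle)
            handle.flush()
            os.fsync(handle.fileno())  # push bytes to disk before the rename
        os.replace(tmp_name, path)  # atomic when both paths share a filesystem
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


def load_progress(path: Path, identity: dict) -> set[int]:
    """Return completed chunk indices, refusing a run with different settings."""
    if not path.exists():
        return set()
    saved = json.loads(path.read_text(encoding="utf-8"))
    if saved["identity"] != identity:
        raise CheckpointIdentityError("settings differ from the existing run")
    return set(saved["done"])


def run(out_dir: Path, identity: dict, total_chunks: int, transcribe) -> None:
    """Process chunks in order, committing each one before moving on."""
    progress = out_dir / "progress.json"
    done = load_progress(progress, identity)
    for index in range(total_chunks):
        if index in done:
            continue  # already committed by an earlier run
        text = transcribe(index)
        atomic_write_json(out_dir / "chunk_outputs" / f"{index:05d}.json", {"text": text})
        done.add(index)
        atomic_write_json(progress, {"identity": identity, "done": sorted(done)})
```

Writing the chunk output before updating `progress.json` means a crash between the two writes costs at most one repeated chunk, never a recorded chunk with missing output.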

Determinism makes this safe: Turkish defaults use temperature=0.0, condition_on_previous_text=False (with a tail-prompt continuity hint), vad_filter=True, and beam_size=5, so re-running a chunk reproduces the same result.

Config

Configuration resolves from CLI flags → a scribeflow.toml → environment variables, with sensible defaults underneath. Full reference: docs/CONFIG.md.

Common environment variables:

| Variable | Purpose |
| --- | --- |
| `SCRIBEFLOW_LANG` | Default interface language (`en` / `tr`). |
| `SCRIBEFLOW_WHISPERCPP_BIN` | Path to the `whisper-cli` binary (Apple Silicon). |
| `SCRIBEFLOW_WHISPERCPP_MODELS` | Directory holding ggml models for whisper.cpp. |
| `NO_COLOR` | Disable ANSI colors (also auto-off when piped). |

A project-local scribeflow.toml lets you pin defaults:

# scribeflow.toml
backend = "faster-whisper"
model = "large-v3-turbo"
language = "tr"
chunk_minutes = 20
beam_size = 5
formats = ["txt", "srt"]

scribeflow transcribe ./lecture.mp4 --config scribeflow.toml

Contributing

Contributions are welcome — see CONTRIBUTING.md for the dev setup, test, and lint workflow:

pip install -e '.[dev]'
pytest
ruff check .
mypy

License

Licensed under the Apache License 2.0 — see LICENSE.

Project details

Release history

This version

0.1.0

Jun 3, 2026

Download files

Source Distribution

scribeflow-0.1.0.tar.gz (1.3 MB)

Uploaded Jun 3, 2026 Source

Built Distribution

scribeflow-0.1.0-py3-none-any.whl (256.5 kB)

Uploaded Jun 3, 2026 Python 3

File details

Details for the file scribeflow-0.1.0.tar.gz.

File metadata

Download URL: scribeflow-0.1.0.tar.gz
Upload date: Jun 3, 2026
Size: 1.3 MB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for scribeflow-0.1.0.tar.gz
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `b4086fd493bc5e9a3ca3c72d27c4edf976a6555bbe2fa3ef067b05287a00ce18` |
| MD5 | `7ef61fe5a7f0760779659da88557ac6c` |
| BLAKE2b-256 | `5f8f47d45371bc50cc0027f0605c7e5161c9136e9700cb723c24e468a9d3206d` |
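To check a downloaded sdist against the SHA256 digest listed above, something like the following should work. It is a generic sketch using Python's `hashlib`, not part of ScribeFlow; the function names and the local file path are placeholders.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed above for scribeflow-0.1.0.tar.gz.
EXPECTED_SHA256 = "b4086fd493bc5e9a3ca3c72d27c4edf976a6555bbe2fa3ef067b05287a00ce18"


def sha256_of(path: Path, block_size: int = 1 << 20) -> str:
    """Hash the file in blocks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for block in iter(lambda: handle.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def verify(path: Path, expected: str = EXPECTED_SHA256) -> bool:
    """Return True when the file's SHA256 matches the expected digest."""
    return sha256_of(path) == expected.lower()


# Usage (placeholder path): verify(Path("scribeflow-0.1.0.tar.gz"))
```

pip can enforce the same check at install time through its hash-checking mode (`--require-hashes` with pinned hashes in a requirements file).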

Provenance

The following attestation bundles were made for scribeflow-0.1.0.tar.gz:

Publisher: publish.yml on htahaozlu/scribeflow

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: scribeflow-0.1.0.tar.gz
- Subject digest: b4086fd493bc5e9a3ca3c72d27c4edf976a6555bbe2fa3ef067b05287a00ce18
- Sigstore transparency entry: 1708597366
- Sigstore integration time: Jun 3, 2026
Source repository:
- Permalink: htahaozlu/scribeflow@b079de60cb2eac322dc099eaf31e0dfbe64c89aa
- Branch / Tag: refs/heads/main
- Owner: https://github.com/htahaozlu
- Access: private
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@b079de60cb2eac322dc099eaf31e0dfbe64c89aa
- Trigger Event: workflow_dispatch

File details

Details for the file scribeflow-0.1.0-py3-none-any.whl.

File metadata

Download URL: scribeflow-0.1.0-py3-none-any.whl
Upload date: Jun 3, 2026
Size: 256.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for scribeflow-0.1.0-py3-none-any.whl
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `ec90a0e3d146e9dc8ce3837649339c46b4ba16c9926db73ac13ace1da2048ce4` |
| MD5 | `ddbee9325157c79a3e4a3e95745f6ba7` |
| BLAKE2b-256 | `49173958674a7165c6cbbd29900adb253a7025d789a28f35ac0fcf8c3453a15c` |

Provenance

The following attestation bundles were made for scribeflow-0.1.0-py3-none-any.whl:

Publisher: publish.yml on htahaozlu/scribeflow

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: scribeflow-0.1.0-py3-none-any.whl
- Subject digest: ec90a0e3d146e9dc8ce3837649339c46b4ba16c9926db73ac13ace1da2048ce4
- Sigstore transparency entry: 1708597375
- Sigstore integration time: Jun 3, 2026
Source repository:
- Permalink: htahaozlu/scribeflow@b079de60cb2eac322dc099eaf31e0dfbe64c89aa
- Branch / Tag: refs/heads/main
- Owner: https://github.com/htahaozlu
- Access: private
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@b079de60cb2eac322dc099eaf31e0dfbe64c89aa
- Trigger Event: workflow_dispatch
