Internal scribe service that wraps meetscribe for team-scale meeting capture, transcription, summarization, and speaker labeling

vezir

Self-hosted scribe service for team-scale meeting capture. Vezir wraps meetscribe and turns it into a multi-user, Tailscale-hosted service: a designated scribe records a meeting on their laptop, the audio uploads to a central GPU-equipped box, and the team gets back a diarized transcript, AI summary, and PDF — with speaker labels resolved to GitHub handles via a shared web UI.

Status

Alpha (0.1.3). Designed for small teams that want to keep meeting audio inside their own infrastructure: one Tailscale tailnet plus one server (Linux GPU box or Apple Silicon Mac). Currently dogfooded by the Blink team. Linux clients are fully supported. macOS on Apple Silicon is supported for both client and server roles; there the server auto-selects MLX Whisper ASR plus PyTorch MPS when available, and otherwise falls back to split mode (CPU ASR plus PyTorch MPS).

Requires meetscribe-offline >= 0.6.0 (pinned via the [server] extra). That release adds matching auto-defaults for --device and --torch-device, so the VEZIR_MEET_DEVICE and VEZIR_MEET_TORCH_DEVICE env vars are optional even on Macs; when set, they are still respected as explicit overrides.

Architecture

[Scribe laptop]                       [GPU server]
  vezir scribe / gui / upload ──▶     vezir serve (FastAPI)
   (wraps meet record)                  │
                                        ├── sqlite job queue
                                        │
                                        ▼
                                      worker
                                        │ shells out via HOME-shim
                                        ▼
                                      meet transcribe (unmodified)
                                      meet label --auto
                                      meet sync     ──▶ private git repo
                                        │
                                        ▼
                                      web UI (labeling, dashboard)
                                       ◀── scribe browser

Meetscribe is invoked as an unmodified subprocess. Vezir owns its own job queue, voiceprint database, team roster, and browser auth.
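
The README does not spell out what the HOME-shim does. A minimal sketch, assuming it means "run the unmodified meet binary with HOME pointed at a vezir-owned directory so it reads and writes its state there", could look like this. The function name `run_with_home_shim` and the probe command are hypothetical and only stand in for a real `meet transcribe` call.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path


def run_with_home_shim(argv: list[str], shim_home: Path) -> subprocess.CompletedProcess:
    """Run argv with HOME set to shim_home; the parent's environment is not modified."""
    shim_home.mkdir(parents=True, exist_ok=True)
    env = dict(os.environ)        # copy, so the server process keeps its own HOME
    env["HOME"] = str(shim_home)  # the child resolves ~ against the shim directory
    return subprocess.run(argv, env=env, capture_output=True, text=True, check=False)


if __name__ == "__main__":
    # Stand-in for `meet transcribe ...`: print the HOME value the child sees.
    probe = [sys.executable, "-c", "import os; print(os.environ['HOME'])"]
    shim = Path(tempfile.mkdtemp()) / "home-shim"
    print(run_with_home_shim(probe, shim).stdout.strip())
```

Copying the environment instead of mutating `os.environ` keeps concurrent jobs from interfering with each other or with the server process.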

Repo layout

vezir/
  vezir/                    # python package
    cli.py                  # serve, scribe, upload, token issue
    config.py               # paths, env
    server/                 # FastAPI app, queue, worker, meet_runner
    client/                 # vezir scribe (wraps meet record + uploads)
    web/                    # templates + static
  data/
    team.json.example
  infra/
    systemd/vezir.service
  tests/

Runtime data lives outside the repo at ~/vezir-data/.

Install profiles

| Role | Install command | Footprint |
|---|---|---|
| Scribe client only (record + upload, GUI optional) | pip install --user vezir (or pip install --user 'vezir[gui]' for the GUI widget, which also needs the system Tk package, e.g. apt install python3-tk) | ~30 MB |
| Server (FastAPI + worker + dashboard + labeling UI) | pip install --user 'vezir[server]' | ~3 GB on Linux/CUDA (meetscribe-offline = whisperx + torch + pyannote); on Apple Silicon also pulls mlx-whisper for the MLX ASR backend (a few hundred MB extra) |

The split is enforced by pyproject.toml's [project.optional-dependencies]: the base install uses meetscribe-record (capture only). The [server] extra adds meetscribe-offline for the heavy transcription/diarization/summarization pipeline.

On Apple Silicon, the same [server] extra additionally installs mlx-whisper via a PEP 508 environment marker so the MLX ASR backend is available out of the box. Auto-detection selects it at runtime; see the env-var table below for overrides (VEZIR_MEET_ASR_BACKEND, VEZIR_MEET_MLX_MODEL).

Quick start (server, on a GPU box reachable over Tailscale)

git clone https://github.com/pretyflaco/vezir.git
cd vezir
pip install --user -e '.[server]'

# Seed voiceprints from existing meetscribe profile DB
mkdir -p ~/vezir-data
vezir voiceprints seed --from ~/.config/meet/speaker_profiles.json

# Sync target — sandbox repo for development.
# vezir's worker invokes `meet sync --force --meeting-type sandbox-<HHMMSSZ>-<rand>`
# which bypasses meetscribe's schedule and team-presence gates and
# guarantees a unique per-session folder. Every successful job lands in
# meetings/<date>_sandbox-<HHMMSSZ>-<rand>/ on the configured repo
# (e.g. meetings/2026-04-25_sandbox-194051Z-VZJJ3P/).
cat > ~/vezir-data/sync_config.json <<'EOF'
{
  "repo_url": "https://github.com/pretyflaco/vezir-meetings.git",
  "meetings": [],
  "team_members": [],
  "min_team_members": 0
}
EOF

# Initialize team roster (used by labeling UI autocomplete)
cp data/team.json.example ~/vezir-data/team.json
$EDITOR ~/vezir-data/team.json

# Issue a token for yourself
vezir token issue --github kasita

# Start the service
vezir serve

# Or, to skip git sync (artifacts stay only in ~/vezir-data/sessions/<id>/)
VEZIR_SKIP_SYNC=1 vezir serve
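
The per-session folder naming described in the sync-target comment above can be sketched as follows. This is an illustration, not vezir's actual code: the helper names are hypothetical, and the choices of UTC for both date and time and of six uppercase alphanumeric characters for the random suffix are assumptions inferred from the example `2026-04-25_sandbox-194051Z-VZJJ3P`.

```python
import secrets
import string
from datetime import datetime, timezone

# Assumption: the suffix alphabet matches the look of "VZJJ3P".
ALPHABET = string.ascii_uppercase + string.digits


def sandbox_meeting_type(now: datetime, prefix: str = "sandbox", rand_len: int = 6) -> str:
    """Build '<prefix>-<HHMMSSZ>-<rand>' from a timezone-aware datetime."""
    stamp = now.astimezone(timezone.utc).strftime("%H%M%SZ")
    rand = "".join(secrets.choice(ALPHABET) for _ in range(rand_len))
    return f"{prefix}-{stamp}-{rand}"


def session_folder(now: datetime, meeting_type: str) -> str:
    """Build 'meetings/<date>_<meeting_type>' using the UTC date."""
    date = now.astimezone(timezone.utc).strftime("%Y-%m-%d")
    return f"meetings/{date}_{meeting_type}"


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(session_folder(now, sandbox_meeting_type(now)))
```

The time stamp alone would collide if two jobs finished in the same second, which is presumably why a random suffix is appended.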

Sync target governance

During the pilot, the sync target is intentionally pointed at a private dev sandbox repo (pretyflaco/vezir-meetings). Two reasons:

  • production meeting-archive repos (e.g. blinkbitcoin/blink-wip) get schedule + team-presence gating from meetscribe; vezir uses --force to override that, which is appropriate for a dev sandbox but not for production
  • vezir may rewrite history or recreate the repo while the pipeline is being shaken down

To graduate to production: change repo_url in ~/vezir-data/sync_config.json, drop --force (planned: env var VEZIR_SYNC_FORCE=0), and let meetscribe's existing schedule/team-gate decide what to push.

Quick start (scribe client)

# Install vezir + meetscribe-record (lightweight; ~30 MB).
pip install --user vezir

# Optional: GUI widget (Tkinter); on Debian/Ubuntu:
sudo apt install python3-tk

# Configure (one-time): server URL = Tailscale name of your vezir server.
# If MagicDNS is unavailable, use the server's Tailscale IP instead.
export VEZIR_URL=http://your-vezir-server:8000
export VEZIR_TOKEN=<token-issued-on-server>

# CLI scribe
vezir scribe --title "what this meeting is about"
# Talk; Ctrl+C when done.
# By default, the recorded WAV is compressed to OGG/Opus before upload.
# Use --no-compress to upload the raw WAV instead.

# Or GUI scribe (always-on-top widget)
vezir gui

# Or upload an existing recording (WAV/OGG)
vezir upload ./previous-meeting.wav --title "previous meeting"

# Compress an existing WAV before uploading it
vezir upload ./previous-meeting.wav --compress --title "previous meeting"

When the recording is uploaded, vezir prints a dashboard URL. Open it in your browser; the GUI's "Open dashboard" button does this for you. The URL flows through /login?token=... so the browser is signed in via HttpOnly cookie before it lands on the session page; subsequent access from the same browser does not require re-passing the token.

Live client recordings remain on the scribe machine under ~/meet-recordings/ by default. vezir status is a server-side/local diagnostic command; on a thin client it inspects that machine's local ~/vezir-data and does not query the remote server.

Standalone uploads currently accept .wav and .ogg, matching what the server-side meetscribe pipeline consumes from session folders. Use vezir upload --compress file.wav to compress a WAV to OGG/Opus before uploading. Other formats such as .mp3, .m4a, and .webm should be transcoded to WAV/OGG first until server-side transcoding is added.

The client reports upload progress, retries from byte 0 after transient connection failures, and sends the expected audio byte count so the server can reject incomplete uploads instead of processing partial meetings.
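
The retry and completeness behavior described above can be sketched as follows. This is a simplified illustration, not vezir's client code: `upload_with_retries`, `check_complete`, the injected `send` callable, and the linear backoff are all hypothetical, and a real client would stream from disk instead of holding the payload in memory.

```python
import time
from typing import Callable


def upload_with_retries(send: Callable[[bytes, int], None], payload: bytes,
                        attempts: int = 3, backoff_s: float = 1.0) -> int:
    """Call send(payload, expected_bytes), restarting from byte 0 on ConnectionError.

    Returns the number of attempts used; re-raises after the last failure.
    """
    for attempt in range(1, attempts + 1):
        try:
            send(payload, len(payload))  # always the full payload: no resume offset
            return attempt
        except ConnectionError:
            if attempt == attempts:
                raise
            time.sleep(backoff_s * attempt)  # linear backoff (an assumption)
    raise ValueError("attempts must be >= 1")


def check_complete(received: bytes, expected_bytes: int) -> None:
    """Server-side guard: refuse a body whose size differs from the announced count."""
    if len(received) != expected_bytes:
        raise ValueError(
            f"incomplete upload: got {len(received)} of {expected_bytes} bytes"
        )
```

Restarting from byte 0 is simpler than resumable uploads, and the announced byte count lets the server tell a truncated body apart from a short meeting.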

Environment variables

| Variable | Default | Effect |
|---|---|---|
| VEZIR_DATA | ~/vezir-data | All runtime state: sessions, voiceprints, queue, tokens, sync_config |
| VEZIR_HOST | 0.0.0.0 | Bind address for vezir serve |
| VEZIR_PORT | 8000 | Port for vezir serve |
| VEZIR_URL | http://localhost:8000 | Server URL for vezir scribe clients |
| VEZIR_TOKEN | (none) | Bearer token for vezir scribe clients |
| VEZIR_LOG_LEVEL | INFO | Logging level |
| VEZIR_MEET_BIN | $(which meet) | Path to meetscribe meet binary |
| VEZIR_MEET_DEVICE | mps on Apple Silicon when supported by the installed meetscribe stack, cuda when CUDA is available elsewhere, otherwise cpu | Device passed to meet transcribe |
| VEZIR_MEET_COMPUTE_TYPE | int8 on CPU, float16 on CUDA, float32 on MPS | Compute type passed to meet transcribe |
| VEZIR_MEET_TORCH_DEVICE | auto | PyTorch device passed to meet transcribe --torch-device when the installed meetscribe supports split ASR/PyTorch devices |
| VEZIR_MEET_ASR_BACKEND | mlx on Apple Silicon when available | ASR backend passed to meet transcribe --asr-backend when supported |
| VEZIR_MEET_MLX_MODEL | meetscribe default | MLX Whisper model path/repo passed to meet transcribe --mlx-model |
| VEZIR_SKIP_SYNC | unset | Set to 1 to skip the meet sync step entirely |
| VEZIR_DELETE_AUDIO | unset | Set to 1 to delete audio after artifacts are produced (storage policy). Default OFF during pilot. |
| VEZIR_SYNC_MEETING_TYPE | sandbox | Subfolder name (under meetings/) used by meet sync --force. Will be removed once vezir respects schedules. |
| VEZIR_MAX_UPLOAD_BYTES | 2147483648 | Maximum accepted upload size (default 2 GiB). Oversized uploads return HTTP 413. |
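
As an illustration of how a few of these variables and their documented defaults could be read, here is a minimal sketch. It is not vezir's config.py: the `Settings` class and `load_settings` function are hypothetical, and only a subset of the variables is covered. Note that the 2 GiB default is 2 * 1024**3 = 2147483648 bytes.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    data_dir: Path
    host: str
    port: int
    url: str
    token: Optional[str]
    skip_sync: bool
    delete_audio: bool
    max_upload_bytes: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read a subset of the VEZIR_* variables, falling back to the documented defaults."""
    return Settings(
        data_dir=Path(env.get("VEZIR_DATA", "~/vezir-data")).expanduser(),
        host=env.get("VEZIR_HOST", "0.0.0.0"),
        port=int(env.get("VEZIR_PORT", "8000")),
        url=env.get("VEZIR_URL", "http://localhost:8000"),
        token=env.get("VEZIR_TOKEN"),                       # no default
        skip_sync=env.get("VEZIR_SKIP_SYNC") == "1",        # only "1" enables it
        delete_audio=env.get("VEZIR_DELETE_AUDIO") == "1",  # only "1" enables it
        max_upload_bytes=int(env.get("VEZIR_MAX_UPLOAD_BYTES", str(2 * 1024**3))),
    )


if __name__ == "__main__":
    print(load_settings())
```

Passing the environment as a mapping makes the defaults easy to check without touching the real process environment.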

On Apple Silicon, vezir prefers meetscribe's MLX Whisper ASR backend when mlx-whisper is installed and the installed meet transcribe supports --asr-backend. Alignment and diarization still use PyTorch, so vezir also passes --torch-device mps when that option is available. If MLX ASR is not available, the fallback Apple Silicon route is CPU ASR via CTranslate2 plus PyTorch MPS for alignment/diarization. VEZIR_MEET_ASR_BACKEND, VEZIR_MEET_MLX_MODEL, and VEZIR_MEET_TORCH_DEVICE override the automatic selection.
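
The selection order described above can be summarized in a small decision function. This is a sketch under stated assumptions, not vezir's implementation: `choose_route` is a hypothetical name, the capability flags are passed in as plain booleans instead of being probed, and the "asr" values other than "mlx" are descriptive labels, not real `--asr-backend` values.

```python
from typing import Dict, Mapping


def choose_route(apple_silicon: bool, mlx_usable: bool, cuda_available: bool,
                 env: Mapping[str, str]) -> Dict[str, str]:
    """Pick an ASR route and a PyTorch device, then apply explicit overrides.

    mlx_usable means: mlx-whisper is installed AND `meet transcribe`
    supports --asr-backend.
    """
    if apple_silicon and mlx_usable:
        route = {"asr": "mlx", "torch_device": "mps"}
    elif apple_silicon:
        # Fallback: CPU ASR via CTranslate2, PyTorch MPS for alignment/diarization.
        route = {"asr": "cpu-ctranslate2", "torch_device": "mps"}
    elif cuda_available:
        route = {"asr": "cuda", "torch_device": "cuda"}
    else:
        route = {"asr": "cpu", "torch_device": "cpu"}

    # Explicit environment overrides win over auto-detection.
    backend = env.get("VEZIR_MEET_ASR_BACKEND")
    if backend:
        route["asr"] = backend
    torch_device = env.get("VEZIR_MEET_TORCH_DEVICE", "auto")
    if torch_device != "auto":
        route["torch_device"] = torch_device
    model = env.get("VEZIR_MEET_MLX_MODEL")
    if model:
        route["mlx_model"] = model
    return route
```

For example, `choose_route(True, False, False, {})` yields the split-mode route, while setting `VEZIR_MEET_TORCH_DEVICE=cpu` forces PyTorch onto the CPU regardless of what was detected.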

Performance expectations

End-to-end processing time depends on audio quality, model size, language detection, diarization, summary generation, and whether alignment models are already cached. For a one-hour recording with the default large-v3-turbo-style pipeline, use these as rough operator estimates:

| Runtime | ASR path | PyTorch alignment/diarization path | Expected time for 1h audio |
|---|---|---|---|
| NVIDIA CUDA GPU | CUDA, float16 | CUDA | ~5-20 min end-to-end |
| Apple Silicon MLX mode | MLX Whisper | MPS | ~10-30 min end-to-end |
| Apple Silicon split mode | CPU, int8 via CTranslate2 | MPS | ~20-45 min end-to-end |
| CPU only | CPU, int8 | CPU | ~1.5-10 hours end-to-end |
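
To turn the one-hour figures into a rough estimate for other recording lengths, the sketch below scales them linearly with audio duration. Linear scaling is an assumption: fixed costs such as model loading and summary generation do not grow with length, so short recordings will likely take proportionally longer. The runtime keys and the function name are hypothetical.

```python
from typing import Dict, Tuple

# (low, high) minutes of processing per 60 minutes of audio, from the table above.
ESTIMATES_MIN_PER_HOUR: Dict[str, Tuple[float, float]] = {
    "cuda": (5, 20),
    "apple-mlx": (10, 30),
    "apple-split": (20, 45),
    "cpu": (90, 600),  # 1.5 to 10 hours
}


def estimate_minutes(runtime: str, audio_minutes: float) -> Tuple[float, float]:
    """Scale the one-hour estimate linearly to the given audio length."""
    low, high = ESTIMATES_MIN_PER_HOUR[runtime]
    scale = audio_minutes / 60
    return low * scale, high * scale


if __name__ == "__main__":
    low, high = estimate_minutes("cuda", 90)
    print(f"90 min of audio on CUDA: roughly {low:g}-{high:g} min")
```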

ASR is automatic speech recognition: the stage that turns audio into text. In Apple Silicon MLX mode, ASR uses MLX Whisper on the Apple GPU while alignment and diarization use PyTorch MPS. The first run downloads the selected MLX model; subsequent runs use the local Hugging Face cache.

The most useful future improvements are:

  • Add per-stage timing to worker logs so real deployments can compare ASR, alignment, diarization, summary, and sync costs instead of relying on broad estimates.
  • Benchmark mlx-community/whisper-large-v3-turbo, -q4, and -4bit variants on representative meeting audio to choose the best speed/quality default.
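
The per-stage timing suggested in the first item could be added with a small context manager. This is a sketch of one possible approach, not existing vezir code; the logger name and the `timed_stage` helper are hypothetical.

```python
import logging
import time
from contextlib import contextmanager
from typing import Dict, Iterator

log = logging.getLogger("vezir.worker.timing")  # hypothetical logger name


@contextmanager
def timed_stage(name: str, timings: Dict[str, float]) -> Iterator[None]:
    """Record wall-clock seconds for one pipeline stage, even if it raises."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        timings[name] = elapsed
        log.info("stage=%s seconds=%.2f", name, elapsed)


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    timings: Dict[str, float] = {}
    with timed_stage("asr", timings):
        time.sleep(0.05)  # stand-in for the real transcription call
    with timed_stage("sync", timings):
        time.sleep(0.01)  # stand-in for the real sync call
    print(sorted(timings))
```

Because the worker shells out to one `meet transcribe` subprocess, splitting ASR from alignment and diarization would need timing information from meetscribe itself; a wrapper like this can only time the steps the worker invokes separately.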

Runtime directories are created private (0700) and sensitive runtime files are written private (0600). The systemd unit also sets UMask=0077 so artifacts created by subprocesses inherit private defaults.
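
A minimal sketch of how private directories and files can be created on Linux or macOS is shown below. The helper names are hypothetical and this is not vezir's code; it illustrates the 0700/0600 modes named above.

```python
import os
from pathlib import Path


def make_private_dir(path: Path) -> None:
    """Create path with mode 0700, tightening it if it already exists."""
    # Note: with parents=True, missing parent directories get default permissions.
    path.mkdir(mode=0o700, parents=True, exist_ok=True)
    # mkdir's mode is masked by the umask and ignored for existing directories.
    os.chmod(path, 0o700)


def write_private_file(path: Path, data: bytes) -> None:
    """Write data to path with mode 0600, tightening a pre-existing file too."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "wb") as handle:
        os.fchmod(handle.fileno(), 0o600)  # Unix-only; covers files that already existed
        handle.write(data)
```

Creating the file with `os.open` and an explicit mode avoids the window in which a file written with default permissions is briefly readable by other users before a later chmod.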

License

MIT — see LICENSE.
