Internal scribe service that wraps meetscribe for team-scale meeting capture, transcription, summarization, and speaker labeling

vezir

Self-hosted scribe service for team-scale meeting capture. Vezir wraps meetscribe and turns it into a multi-user, Tailscale-hosted service: a designated scribe records a meeting on their laptop, the audio uploads to a central GPU-equipped box, and the team gets back a diarized transcript, AI summary, and PDF — with speaker labels resolved to GitHub handles via a shared web UI.

Status

Alpha (0.1.0). Designed for small teams that want to keep meeting audio inside their own infrastructure: one Tailscale tailnet plus one GPU-equipped box. Currently dogfooded by the Blink team. Linux clients are fully supported; the macOS thin client is deferred.

Architecture

[Scribe laptop]                       [GPU server]
  vezir scribe / gui  ──upload──▶     vezir serve (FastAPI)
   (wraps meet record)                  │
                                        ├── sqlite job queue
                                        │
                                        ▼
                                      worker
                                        │ shells out via HOME-shim
                                        ▼
                                      meet transcribe (unmodified)
                                      meet label --auto
                                      meet sync     ──▶ private git repo
                                        │
                                        ▼
                                      web UI (labeling, dashboard)
                                       ◀── scribe browser

Meetscribe is invoked as an unmodified subprocess. Vezir owns its own job queue, voiceprint database, team roster, and browser auth.
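
One plausible reading of the "HOME-shim" in the diagram is that the worker runs the stock meet binary with HOME pointed at a vezir-controlled directory, so meetscribe reads and writes its per-user files there rather than in the service account's real home. The sketch below illustrates that idea only; it is not vezir's worker code. The names shim_env and run_with_home_shim and the shim directory are hypothetical, and the child process is a stand-in rather than a real meet invocation.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import Dict, Mapping, Optional, Sequence


def shim_env(shim_home: Path, base_env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with HOME pointing at shim_home."""
    env = dict(os.environ if base_env is None else base_env)
    env["HOME"] = str(shim_home)
    return env


def run_with_home_shim(argv: Sequence[str], shim_home: Path) -> subprocess.CompletedProcess:
    """Run an unmodified CLI as a subprocess under a substituted HOME."""
    shim_home.mkdir(parents=True, exist_ok=True)
    return subprocess.run(
        list(argv),
        env=shim_env(shim_home),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the child exits non-zero
    )


if __name__ == "__main__":
    # Stand-in child that just reports the HOME it sees. A worker built this
    # way would pass a meet command such as ["meet", "label", "--auto"].
    with tempfile.TemporaryDirectory() as tmp:
        child = [sys.executable, "-c", "import os; print(os.environ['HOME'])"]
        result = run_with_home_shim(child, Path(tmp) / "shim-home")
        print(result.stdout.strip())
```

Passing a full env mapping to subprocess.run replaces the child's environment, which is why shim_env starts from a copy of the parent environment and overrides only HOME.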

Repo layout

vezir/
  vezir/                    # python package
    cli.py                  # serve, scribe, token issue
    config.py               # paths, env
    server/                 # FastAPI app, queue, worker, meet_runner
    client/                 # vezir scribe (wraps meet record + uploads)
    web/                    # templates + static
  data/
    team.json.example
  infra/
    systemd/vezir.service
  tests/

Runtime data lives outside the repo at ~/vezir-data/.

Install profiles

Role | Install command | Footprint
Scribe client only (record + upload, GUI optional) | pip install --user vezir (or pip install --user 'vezir[gui]' for the GUI widget; Tkinter itself comes from apt install python3-tk on Debian/Ubuntu) | ~30 MB
Server (FastAPI + worker + dashboard + labeling UI) | pip install --user 'vezir[server]' | ~3 GB (pulls meetscribe-offline = whisperx + torch + pyannote)

The split is enforced by pyproject.toml's [project.optional-dependencies]: the base install uses meetscribe-record (capture only). The [server] extra adds meetscribe-offline for the heavy transcription/diarization/summarization pipeline.
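
To make the split concrete, a script can check which meetscribe distribution is present and infer the installed profile from that. This is a rough sketch, not part of vezir: installed_profile is a hypothetical helper, and it assumes the distributions are registered under exactly the names meetscribe-record and meetscribe-offline.

```python
from importlib import metadata
from typing import Callable


def installed_profile(version_of: Callable[[str], str] = metadata.version) -> str:
    """Classify the local install as 'server', 'client', or 'none'."""

    def has(dist: str) -> bool:
        try:
            version_of(dist)
            return True
        except metadata.PackageNotFoundError:
            return False

    # Check the heavy pipeline first: a server install may also carry the
    # capture-only package, so its presence alone does not mean "client".
    if has("meetscribe-offline"):
        return "server"
    if has("meetscribe-record"):
        return "client"
    return "none"


if __name__ == "__main__":
    print(installed_profile())
```

The version_of parameter exists only so the lookup can be replaced in tests; by default the function queries the real environment through importlib.metadata.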

Quick start (server, on a GPU box reachable over Tailscale)

git clone https://github.com/pretyflaco/vezir.git
cd vezir
pip install --user -e '.[server]'

# Seed voiceprints from existing meetscribe profile DB
mkdir -p ~/vezir-data
vezir voiceprints seed --from ~/.config/meet/speaker_profiles.json

# Sync target — sandbox repo for development.
# vezir's worker invokes `meet sync --force --meeting-type sandbox-<HHMMSSZ>-<rand>`
# which bypasses meetscribe's schedule and team-presence gates and
# guarantees a unique per-session folder. Every successful job lands in
# meetings/<date>_sandbox-<HHMMSSZ>-<rand>/ on the configured repo
# (e.g. meetings/2026-04-25_sandbox-194051Z-VZJJ3P/).
cat > ~/vezir-data/sync_config.json <<'EOF'
{
  "repo_url": "https://github.com/pretyflaco/vezir-meetings.git",
  "meetings": [],
  "team_members": [],
  "min_team_members": 0
}
EOF

# Initialize team roster (used by labeling UI autocomplete)
cp data/team.json.example ~/vezir-data/team.json
$EDITOR ~/vezir-data/team.json

# Issue a token for yourself
vezir token issue --github kasita

# Start the service
vezir serve

# Or, to skip git sync (artifacts stay only in ~/vezir-data/sessions/<id>/)
VEZIR_SKIP_SYNC=1 vezir serve
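
The per-session folder name described in the comments above can be reproduced with a few lines of Python. This is a sketch inferred from the single example path, not vezir's own code: the six-character uppercase alphanumeric suffix, the use of the UTC date for <date>, and the function names are all assumptions.

```python
import secrets
import string
from datetime import datetime, timezone
from typing import Optional

# Assumed alphabet and length, inferred from the example suffix "VZJJ3P".
ALPHABET = string.ascii_uppercase + string.digits


def sandbox_meeting_type(now: datetime, suffix: Optional[str] = None) -> str:
    """Build 'sandbox-<HHMMSSZ>-<rand>' from a timezone-aware timestamp."""
    if suffix is None:
        suffix = "".join(secrets.choice(ALPHABET) for _ in range(6))
    utc = now.astimezone(timezone.utc)
    return f"sandbox-{utc:%H%M%S}Z-{suffix}"


def session_folder(now: datetime, suffix: Optional[str] = None) -> str:
    """Build 'meetings/<date>_sandbox-<HHMMSSZ>-<rand>' (date taken in UTC)."""
    utc = now.astimezone(timezone.utc)
    return f"meetings/{utc:%Y-%m-%d}_{sandbox_meeting_type(utc, suffix)}"


if __name__ == "__main__":
    when = datetime(2026, 4, 25, 19, 40, 51, tzinfo=timezone.utc)
    print(session_folder(when, "VZJJ3P"))
```

Combining a second-resolution timestamp with a random suffix is what keeps two jobs that finish in the same second from landing in the same folder.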

Sync target governance

The sync target is intentionally pointed at a private dev sandbox repo (pretyflaco/vezir-meetings) during the pilot. Two reasons:

  • production meeting-archive repos (e.g. blinkbitcoin/blink-wip) get schedule + team-presence gating from meetscribe; vezir uses --force to override that, which is appropriate for a dev sandbox but not for production
  • vezir may rewrite history or recreate the repo while the pipeline is being shaken down

To graduate to production: change repo_url in ~/vezir-data/sync_config.json, drop --force (planned: env var VEZIR_SYNC_FORCE=0), and let meetscribe's existing schedule/team-gate decide what to push.
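
The decision described above could be expressed as a small function that builds the meet sync command from the environment. Treat this as a sketch under stated assumptions: VEZIR_SYNC_FORCE is only planned, so its handling here is hypothetical, and the code assumes that a non-forced sync is a bare meet sync with no --meeting-type argument. The function name sync_argv is also made up for illustration.

```python
import os
from typing import List, Mapping, Optional


def sync_argv(unique_suffix: str, env: Mapping[str, str] = os.environ) -> Optional[List[str]]:
    """Return the sync command to run, or None when sync is skipped."""
    if env.get("VEZIR_SKIP_SYNC") == "1":
        return None  # artifacts stay in the local sessions directory only

    meet = env.get("VEZIR_MEET_BIN", "meet")

    # Hypothetical: VEZIR_SYNC_FORCE=0 is a planned switch, not a shipped one.
    if env.get("VEZIR_SYNC_FORCE", "1") == "0":
        # Assumption: without --force, meetscribe's own schedule and
        # team-presence gates decide what gets pushed.
        return [meet, "sync"]

    meeting_type = env.get("VEZIR_SYNC_MEETING_TYPE", "sandbox")
    return [meet, "sync", "--force", "--meeting-type", f"{meeting_type}-{unique_suffix}"]


if __name__ == "__main__":
    print(sync_argv("194051Z-VZJJ3P", {}))
    print(sync_argv("194051Z-VZJJ3P", {"VEZIR_SYNC_FORCE": "0"}))
```

Returning an argument list rather than a shell string keeps the command safe to hand to subprocess.run without shell quoting.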

Quick start (scribe client)

# Install vezir + meetscribe-record (lightweight; ~30 MB).
pip install --user vezir

# Optional: GUI widget (Tkinter); on Debian/Ubuntu:
sudo apt install python3-tk

# Configure (one-time): server URL = Tailscale name of your vezir server
export VEZIR_URL=http://your-vezir-server:8000
export VEZIR_TOKEN=<token-issued-on-server>

# CLI scribe
vezir scribe --title "what this meeting is about"
# Talk; Ctrl+C when done.

# Or GUI scribe (always-on-top widget)
vezir gui

When the recording is uploaded, vezir prints a dashboard URL. Open it in your browser; the GUI's "Open dashboard" button does this for you. The URL flows through /login?token=... so the browser is signed in via HttpOnly cookie before it lands on the session page; subsequent access from the same browser does not require re-passing the token.
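
The login hand-off can be illustrated with the standard library alone. The sketch below is not vezir's FastAPI handler: the next query parameter, the /sessions/<id> path, the cookie name vezir_session, and both function names are hypothetical, and token validation is left out entirely. A real server would verify the token first and would likely store a separate session value instead of the raw token.

```python
from http.cookies import SimpleCookie
from typing import List, Tuple
from urllib.parse import parse_qs, urlencode, urlsplit


def dashboard_url(base_url: str, token: str, session_id: str) -> str:
    """Build the URL the client prints after an upload (paths are assumed)."""
    query = urlencode({"token": token, "next": f"/sessions/{session_id}"})
    return f"{base_url.rstrip('/')}/login?{query}"


def login_response(url: str, cookie_name: str = "vezir_session") -> Tuple[int, List[Tuple[str, str]]]:
    """Turn a /login?token=... request into a cookie plus a redirect."""
    params = parse_qs(urlsplit(url).query)
    token = params["token"][0]
    next_path = params.get("next", ["/"])[0]
    # Only allow same-site paths so the redirect cannot leave the server.
    if not next_path.startswith("/") or next_path.startswith("//"):
        next_path = "/"

    cookie = SimpleCookie()
    cookie[cookie_name] = token
    cookie[cookie_name]["httponly"] = True  # hidden from page JavaScript
    cookie[cookie_name]["path"] = "/"
    return 303, [
        ("Set-Cookie", cookie[cookie_name].OutputString()),
        ("Location", next_path),
    ]


if __name__ == "__main__":
    url = dashboard_url("http://your-vezir-server:8000", "abc123", "42")
    print(url)
    print(login_response(url))
```

Redirecting after setting the cookie also removes the token from the address bar, so later visits from the same browser rely on the cookie alone.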

Environment variables

Variable | Default | Effect
VEZIR_DATA | ~/vezir-data | All runtime state: sessions, voiceprints, queue, tokens, sync_config
VEZIR_HOST | 0.0.0.0 | Bind address for vezir serve
VEZIR_PORT | 8000 | Port for vezir serve
VEZIR_URL | http://localhost:8000 | Server URL for vezir scribe clients
VEZIR_TOKEN | (none) | Bearer token for vezir scribe clients
VEZIR_LOG_LEVEL | INFO | Logging level
VEZIR_MEET_BIN | $(which meet) | Path to meetscribe meet binary
VEZIR_SKIP_SYNC | unset | Set to 1 to skip the meet sync step entirely
VEZIR_DELETE_AUDIO | unset | Set to 1 to delete audio after artifacts are produced (storage policy). Default OFF during pilot.
VEZIR_SYNC_MEETING_TYPE | sandbox | Subfolder name (under meetings/) used by meet sync --force. Will be removed once vezir respects schedules.
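
For reference, the defaults listed above could be read in Python roughly as follows. This is an illustrative sketch rather than the contents of vezir's config.py; VezirConfig and load_config are hypothetical names, and treating only the literal value 1 as "on" for the two flags is an assumption based on the table's wording.

```python
import os
import shutil
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping, Optional


@dataclass(frozen=True)
class VezirConfig:
    data_dir: Path
    host: str
    port: int
    url: str
    token: Optional[str]
    log_level: str
    meet_bin: Optional[str]
    skip_sync: bool
    delete_audio: bool
    sync_meeting_type: str


def _flag(env: Mapping[str, str], name: str) -> bool:
    """Treat a variable as enabled only when it is set to exactly '1'."""
    return env.get(name) == "1"


def load_config(env: Mapping[str, str] = os.environ) -> VezirConfig:
    """Resolve every documented variable, falling back to its default."""
    return VezirConfig(
        data_dir=Path(env.get("VEZIR_DATA", "~/vezir-data")).expanduser(),
        host=env.get("VEZIR_HOST", "0.0.0.0"),
        port=int(env.get("VEZIR_PORT", "8000")),
        url=env.get("VEZIR_URL", "http://localhost:8000"),
        token=env.get("VEZIR_TOKEN"),  # no default: None when unset
        log_level=env.get("VEZIR_LOG_LEVEL", "INFO"),
        # Mirrors the $(which meet) default; None if meet is not on PATH.
        meet_bin=env.get("VEZIR_MEET_BIN") or shutil.which("meet"),
        skip_sync=_flag(env, "VEZIR_SKIP_SYNC"),
        delete_audio=_flag(env, "VEZIR_DELETE_AUDIO"),
        sync_meeting_type=env.get("VEZIR_SYNC_MEETING_TYPE", "sandbox"),
    )


if __name__ == "__main__":
    print(load_config())
```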

License

MIT — see LICENSE.
