Lightweight CLI for ingesting, enriching, and storing meeting transcripts

transcribe-it

A lightweight CLI for ingesting meeting transcripts (Gmail or Slack), enriching them with an LLM, and storing the results as local files.

Source -> Extract -> LLM Enrich -> Local Files

Prerequisites

  • Python 3.12+
  • A Google Cloud project with Gmail API + Google Drive API enabled (for the Gmail source), or a Slack bot token (for the Slack source)
  • An API key for one of the supported LLM providers (Anthropic, OpenAI, or Groq) — only needed if you want LLM enrichment

Install

uv tool install transcribe-it

Or with pipx:

pipx install transcribe-it

Setup

Setup is split into two steps: a one-off global step for credentials, and a per-project step for what to ingest.

Step 1: Configure credentials (once per machine)

transcribe-it setup

Pick which credentials to set up: Google OAuth (for Gmail), a Slack bot token, an LLM provider, or any combination. The values are written to ~/.config/transcript/env. Re-run at any time to add or rotate values; existing values are preserved unless you confirm the overwrite (or pass --force).
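
To check which credentials are already configured without printing any secrets, something like the following sketch could be used. The path comes from this README; the KEY=VALUE line format of the env file is an assumption, and parse_env_text and configured_keys are hypothetical helpers, not part of transcribe-it.

```python
from pathlib import Path

# Path taken from the README. The KEY=VALUE line format is an assumption.
ENV_PATH = Path.home() / ".config" / "transcript" / "env"


def parse_env_text(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def configured_keys(path: Path = ENV_PATH) -> list[str]:
    """Return the names (never the values) of keys set in the env file."""
    if not path.exists():
        return []
    return sorted(parse_env_text(path.read_text(encoding="utf-8")))


if __name__ == "__main__":
    print(configured_keys())
```

Only key names are printed, so the output is safe to paste into a bug report.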

Step 2: Initialise a project (per directory)

From the directory where you want transcripts to land:

transcribe-it init

This asks which sources to enable, their source-specific settings (sender filter, channel ID, etc.), the output path, and the lookback window, then writes .transcripts/config.yaml. It does not prompt for secrets, but it warns if the credentials a chosen source needs aren't set yet.

Gmail credentials

setup asks for GOOGLE_OAUTH_CLIENT_ID and GOOGLE_OAUTH_CLIENT_SECRET. Two options:

  1. Reuse someone else's OAuth client — ask a teammate for the values and have them add your Google account as a Test user on their OAuth consent screen.
  2. Create your own — in Google Cloud Console, create an OAuth 2.0 Client ID of type Desktop app, then copy the client ID and secret from the resulting credentials.

After setup and init, authenticate:

transcribe-it auth gmail

Slack credentials

setup asks for SLACK_BOT_TOKEN (xoxb-...). The bot needs to be a member of the channels you want to ingest from. The channel ID itself is configured per-project in init.
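
The credential requirements described above can be summarised as a small pre-flight check. This is a minimal sketch, not the tool's own logic: the variable names come from this README, while the REQUIRED mapping and both helper functions are hypothetical. The values are assumed to have already been loaded into a plain dict.

```python
from collections.abc import Mapping

# Variable names are taken from the README. Grouping them per source like
# this is an illustrative assumption, not transcribe-it's internal logic.
REQUIRED = {
    "gmail": ("GOOGLE_OAUTH_CLIENT_ID", "GOOGLE_OAUTH_CLIENT_SECRET"),
    "slack": ("SLACK_BOT_TOKEN",),
}


def missing_credentials(source: str, values: Mapping[str, str]) -> list[str]:
    """Return the credential names a source needs that are unset or empty."""
    return [name for name in REQUIRED[source] if not values.get(name)]


def looks_like_bot_token(token: str) -> bool:
    """Cheap sanity check: Slack bot tokens start with 'xoxb-'."""
    return token.startswith("xoxb-")


if __name__ == "__main__":
    loaded = {"GOOGLE_OAUTH_CLIENT_ID": "example-id"}
    print(missing_credentials("gmail", loaded))
    print(missing_credentials("slack", loaded))
```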

Usage

By default, ingestion only extracts the raw transcript — no LLM call, no API key required. Pass --enrich to also generate a summary, topics, and participants via LLM.

# Last N days, raw extraction only (default)
transcribe-it ingest gmail --days 7

# With LLM enrichment
transcribe-it ingest gmail --days 7 --enrich

# Enrichment + cleaned transcript variant (--clean implies --enrich)
transcribe-it ingest gmail --days 7 --clean

# Specific date range
transcribe-it ingest gmail --from 2026-04-01 --to 2026-04-05

# Preview matching emails without fetching or writing
transcribe-it ingest gmail --days 1 --dry-run

# Ingest a single transcript file directly
transcribe-it ingest file path/to/transcript.txt
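
If you want to drive ingestion from a scheduled job or another script, the commands above can be assembled programmatically. The sketch below only uses flags documented in this README; build_ingest_command and run_ingest are hypothetical wrapper names. It assumes transcribe-it is on PATH and does not check whether particular flag combinations are accepted by the CLI.

```python
import subprocess


def build_ingest_command(days=None, date_from=None, date_to=None,
                         enrich=False, clean=False, dry_run=False):
    """Build the argument list for `transcribe-it ingest gmail`."""
    cmd = ["transcribe-it", "ingest", "gmail"]
    if days is not None:
        cmd += ["--days", str(days)]
    if date_from:
        cmd += ["--from", date_from]  # YYYY-MM-DD
    if date_to:
        cmd += ["--to", date_to]      # YYYY-MM-DD
    if clean:
        # Per the README, --clean implies --enrich, so one flag is enough.
        cmd.append("--clean")
    elif enrich:
        cmd.append("--enrich")
    if dry_run:
        cmd.append("--dry-run")
    return cmd


def run_ingest(**options):
    """Run the CLI and raise CalledProcessError on a non-zero exit code."""
    return subprocess.run(build_ingest_command(**options), check=True)


if __name__ == "__main__":
    # Preview only: print the command instead of running it.
    print(" ".join(build_ingest_command(days=1, dry_run=True)))
```

Passing the arguments as a list avoids shell quoting issues; run_ingest(days=7, enrich=True) would execute the second example above.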

Output

Raw mode (default) writes a single .txt file per transcript:

.transcripts/
  2026-04-09-ai-labs-daily.txt

With --enrich, each transcript becomes a folder:

.transcripts/
  2026-04-09-ai-labs-daily/
    raw.txt          # Original transcript (immutable)
    metadata.json    # Source, date, participants, topics, summary

With --clean, an additional clean.md is written (structured: title, summary, topics, cleaned transcript).
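
Downstream scripts can walk this layout directly. The following sketch lists what is in an output directory and loads metadata.json where present. list_transcripts is a hypothetical helper, the placement of clean.md inside the transcript folder is an assumption, and no particular key names inside metadata.json are assumed.

```python
import json
from pathlib import Path


def list_transcripts(root=".transcripts"):
    """Return (name, kind, metadata) tuples for each transcript under root.

    kind is "raw" for a single .txt file, "enriched" for a folder with
    raw.txt, and "clean" when that folder also holds clean.md (assumed
    location). Other files, such as config.yaml, are ignored.
    """
    results = []
    for entry in sorted(Path(root).iterdir()):
        if entry.is_file() and entry.suffix == ".txt":
            results.append((entry.stem, "raw", None))
        elif entry.is_dir() and (entry / "raw.txt").exists():
            meta_path = entry / "metadata.json"
            metadata = None
            if meta_path.exists():
                metadata = json.loads(meta_path.read_text(encoding="utf-8"))
            kind = "clean" if (entry / "clean.md").exists() else "enriched"
            results.append((entry.name, kind, metadata))
    return results


if __name__ == "__main__":
    if Path(".transcripts").is_dir():
        for name, kind, metadata in list_transcripts():
            print(name, kind, sorted(metadata) if metadata else "")
```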

Prompts

LLM prompts are bundled with the package under transcribe_it/prompts/. To customise, fork the repo and edit prompts/enrich.md.

Commands

Command                          Description
transcribe-it setup              Configure global credentials (OAuth, LLM, Slack token)
transcribe-it init               Initialise project config (sources, output path, lookback)
transcribe-it auth gmail         Authenticate with Gmail (OAuth)
transcribe-it ingest gmail       Ingest transcripts from Gmail
transcribe-it ingest file PATH   Ingest a single transcript file

Ingest options (Gmail)

Flag                 Description
--days N             How many days back to search
--from YYYY-MM-DD    Start date
--to YYYY-MM-DD      End date
--profile NAME       Gmail auth profile
--dry-run            List matching emails without processing
--enrich             Run LLM enrichment (summary, topics, participants)
--clean              Also generate a cleaned version of the transcript (implies --enrich)

Configuration files

Path                        Purpose
.transcripts/config.yaml    Per-project: sources, lookback, output destinations
~/.config/transcript/env    Global: API keys and OAuth credentials
