transcribe-it
A lightweight CLI for ingesting meeting transcripts (Gmail or Slack), enriching them with an LLM, and storing the results as local files.
Source -> Extract -> LLM Enrich -> Local Files
Prerequisites
- Python 3.12+
- A Google Cloud project with Gmail API + Google Drive API enabled (for the Gmail source), or a Slack bot token (for the Slack source)
- An API key for one of the supported LLM providers (Anthropic, OpenAI, or Groq) — only needed if you want LLM enrichment
Install
uv tool install transcribe-it
Or with pipx:
pipx install transcribe-it
Setup
Setup is split into two steps: a one-off global step for credentials, and a per-project step for what to ingest.
Step 1: Configure credentials (once per machine)
transcribe-it setup
Pick which credentials to set up — Google OAuth (for Gmail), Slack bot token, and/or LLM provider — and the values are written to ~/.config/transcript/env. Re-run any time to add or rotate values; existing values are preserved unless you confirm overwrite (or pass --force).
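To confirm which credentials are already set before running init, a short script can read the env file. This is a minimal sketch, not part of transcribe-it: it assumes the file holds plain KEY=VALUE lines (the format is not documented here), and parse_env and missing_keys are hypothetical helper names.

```python
from pathlib import Path

# Path taken from the setup step above; the KEY=VALUE line format is an assumption.
ENV_PATH = Path.home() / ".config" / "transcript" / "env"


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values: dict[str, str], required: list[str]) -> list[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in required if not values.get(key)]


if __name__ == "__main__":
    text = ENV_PATH.read_text(encoding="utf-8") if ENV_PATH.exists() else ""
    gmail_keys = ["GOOGLE_OAUTH_CLIENT_ID", "GOOGLE_OAUTH_CLIENT_SECRET"]
    print("Missing Gmail keys:", missing_keys(parse_env(text), gmail_keys))
```

If the real file uses a different syntax (for example export prefixes), the parser would need adjusting.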
Step 2: Initialise a project (per directory)
From the directory where you want transcripts to land:
transcribe-it init
This asks which sources to enable, then for source-specific config (sender filter, channel ID, etc.), the output path, and the lookback window, and writes the answers to .transcripts/config.yaml. It never prompts for secrets, but it warns if the credentials a chosen source needs aren't set yet.
Gmail credentials
setup asks for GOOGLE_OAUTH_CLIENT_ID and GOOGLE_OAUTH_CLIENT_SECRET. Two options:
- Reuse someone else's OAuth client — ask a teammate for the values and have them add your Google account as a Test user on their OAuth consent screen.
- Create your own — in Google Cloud Console, create an OAuth 2.0 Client ID of type Desktop app, then copy the client ID and secret from the resulting credentials.
After setup and init, authenticate:
transcribe-it auth gmail
Slack credentials
setup asks for SLACK_BOT_TOKEN (xoxb-...). The bot needs to be a member of the channels you want to ingest from. The channel ID itself is configured per-project in init.
Usage
By default, ingestion only extracts the raw transcript — no LLM call, no API key required. Pass --enrich to also generate a summary, topics, and participants via LLM.
# Last N days, raw extraction only (default)
transcribe-it ingest gmail --days 7
# With LLM enrichment
transcribe-it ingest gmail --days 7 --enrich
# Enrichment + cleaned transcript variant (--clean implies --enrich)
transcribe-it ingest gmail --days 7 --clean
# Specific date range
transcribe-it ingest gmail --from 2026-04-01 --to 2026-04-05
# Preview matching emails without fetching or writing
transcribe-it ingest gmail --days 1 --dry-run
# Ingest a single transcript file directly
transcribe-it ingest file path/to/transcript.txt
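For scheduled runs, the date-range form can be assembled from a script. The sketch below is illustrative only: build_ingest_command and previous_week are hypothetical helpers, it uses just the flags shown above, and it assumes those flags can be combined and that --to is inclusive, neither of which is confirmed here.

```python
import shlex
from datetime import date, timedelta


def build_ingest_command(start: date, end: date, enrich: bool = False,
                         dry_run: bool = False) -> list[str]:
    """Build an argv list for `transcribe-it ingest gmail` over a date range."""
    if end < start:
        raise ValueError("end date must not precede start date")
    cmd = ["transcribe-it", "ingest", "gmail",
           "--from", start.isoformat(), "--to", end.isoformat()]
    if enrich:
        cmd.append("--enrich")
    if dry_run:
        cmd.append("--dry-run")
    return cmd


def previous_week(today: date) -> tuple[date, date]:
    """Return the Monday and Sunday of the week before `today`."""
    this_monday = today - timedelta(days=today.weekday())
    start = this_monday - timedelta(days=7)
    return start, start + timedelta(days=6)


if __name__ == "__main__":
    start, end = previous_week(date.today())
    cmd = build_ingest_command(start, end, dry_run=True)
    # Print the command for review; to run it, pass the list to
    # subprocess.run(cmd, check=True).
    print(shlex.join(cmd))
```

Building the command as a list rather than a string avoids shell quoting issues when it is later handed to subprocess.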
Output
Raw mode (default) writes a single .txt file per transcript:
.transcripts/
  2026-04-09-ai-labs-daily.txt
With --enrich, each transcript becomes a folder:
.transcripts/
  2026-04-09-ai-labs-daily/
    raw.txt         # Original transcript (immutable)
    metadata.json   # Source, date, participants, topics, summary
With --clean, an additional clean.md is written (structured: title, summary, topics, cleaned transcript).
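Because the enriched output is plain files, downstream tooling can read it directly. The following is a rough sketch of walking the output folder; load_enriched is a hypothetical helper, and the metadata.json key names ("summary", "topics") are assumptions inferred from the field list above, not a documented schema.

```python
import json
from pathlib import Path


def load_enriched(root: Path) -> list[dict]:
    """Collect metadata.json from each transcript folder under `root`."""
    records = []
    for folder in sorted(p for p in root.iterdir() if p.is_dir()):
        meta_path = folder / "metadata.json"
        if not meta_path.exists():
            continue  # not an enriched transcript folder
        meta = json.loads(meta_path.read_text(encoding="utf-8"))
        records.append({
            "name": folder.name,
            # Key names below are assumptions about the metadata schema.
            "summary": meta.get("summary", ""),
            "topics": meta.get("topics", []),
            # clean.md is only present when --clean was used.
            "has_clean": (folder / "clean.md").exists(),
        })
    return records


if __name__ == "__main__":
    root = Path(".transcripts")
    if root.is_dir():
        for rec in load_enriched(root):
            print(rec["name"], "-", ", ".join(map(str, rec["topics"])))
```

Raw-mode .txt files sit next to these folders and are skipped here, since they carry no metadata.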
Prompts
LLM prompts are bundled with the package under transcribe_it/prompts/. To customise, fork the repo and edit prompts/enrich.md.
Commands
| Command | Description |
|---|---|
| transcribe-it setup | Configure global credentials (OAuth, LLM, Slack token) |
| transcribe-it init | Initialise project config (sources, output path, lookback) |
| transcribe-it auth gmail | Authenticate with Gmail (OAuth) |
| transcribe-it ingest gmail | Ingest transcripts from Gmail |
| transcribe-it ingest file PATH | Ingest a single transcript file |
Ingest options (Gmail)
| Flag | Description |
|---|---|
| --days N | How many days back to search |
| --from YYYY-MM-DD | Start date |
| --to YYYY-MM-DD | End date |
| --profile NAME | Gmail auth profile |
| --dry-run | List matching emails without processing |
| --enrich | Run LLM enrichment (summary, topics, participants) |
| --clean | Also generate a cleaned version of the transcript (implies --enrich) |
Configuration files
| Path | Purpose |
|---|---|
| .transcripts/config.yaml | Per-project: sources, lookback, output destinations |
| ~/.config/transcript/env | Global: API keys and OAuth credentials |