Extract structured knowledge from conversation transcripts

These details have been verified by PyPI

Maintainers

danieliser

These details have not been verified by PyPI

Development Status
- 4 - Beta
Environment
- Console
Intended Audience
- Developers
Programming Language
- Python :: 3
Topic
- Text Processing

Project description

minutes

Distill any conversation into structured knowledge.

Local-first CLI for extracting decisions, ideas, questions, action items, concepts, and key terms from any conversation transcript. Works with Claude Code sessions, meeting transcripts, plain text, and markdown. When pointed at a local LLM it runs entirely offline, so nothing leaves your machine.

Quick Start

# Using uv (recommended)
uv tool install take-minutes

# Or with pip
pip install take-minutes

# Run it
minutes process my-session.jsonl

See output in ./output/:

Markdown file with structured knowledge
SQLite index for future searches
Optionally: semantic embeddings for cross-session discovery

What It Extracts

Category	Description	Example
Decisions	What was decided and why	"Use SQLite instead of PostgreSQL for MVP (reason: no schema migrations needed)"
Ideas	Concepts, suggestions, opportunities	"Implement quiet hours to prevent 2am notifications"
Questions	Open issues needing resolution	"What's the deployment target—Raspberry Pi or cloud?"
Action Items	Tasks assigned with owners	"Write health endpoint monitor (owner: ops-pragmatist, deadline: Phase 1)"
Concepts	Key technical or business ideas	"3-tiered autonomy model: tier 1 (audit), tier 2 (approval), tier 3 (confirmation)"
Terms	Abbreviations, jargon, domain terms	"EDA = Event-Driven Architecture"
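
To make the six categories concrete, here is a minimal sketch of how one extraction result could be modelled in Python. The class and field names (`Extraction`, `Decision`, `ActionItem`, and so on) are assumptions for illustration only; the package's real schema may look different.

```python
from dataclasses import dataclass, field, asdict

# Hypothetical shapes for illustration; not the package's actual schema.
@dataclass
class Decision:
    summary: str
    reason: str = ""

@dataclass
class ActionItem:
    task: str
    owner: str = ""
    deadline: str = ""

@dataclass
class Extraction:
    decisions: list[Decision] = field(default_factory=list)
    ideas: list[str] = field(default_factory=list)
    questions: list[str] = field(default_factory=list)
    action_items: list[ActionItem] = field(default_factory=list)
    concepts: list[str] = field(default_factory=list)
    terms: dict[str, str] = field(default_factory=dict)

# Populated with examples taken from the table above.
example = Extraction(
    decisions=[Decision("Use SQLite instead of PostgreSQL for MVP",
                        "no schema migrations needed")],
    action_items=[ActionItem("Write health endpoint monitor",
                             "ops-pragmatist", "Phase 1")],
    terms={"EDA": "Event-Driven Architecture"},
)
```

`asdict(example)` turns the record into plain dictionaries and lists, which is convenient for JSON output.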

Prerequisites

minutes uses a local LLM for extraction. You need an OpenAI-compatible inference endpoint:

Ollama: ollama.com — simple local LLM runner
LM Studio: lmstudio.ai — GUI-based local inference
vLLM: docs.vllm.ai — high-throughput serving engine
OpenAI API or any OpenAI-compatible provider (set GATEWAY_URL env var)

For best results: use a 4B–8B model (e.g., Qwen 2.5 7B, Llama 3 8B).
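
Before running an extraction it can help to confirm that the endpoint is reachable. The sketch below relies on the `GET /models` route that OpenAI-compatible servers generally expose; the helper names are hypothetical, and it is not part of the `minutes` CLI.

```python
import json
import urllib.request

def models_url(gateway_url: str) -> str:
    """Build the model-listing URL for an OpenAI-compatible base URL."""
    return gateway_url.rstrip("/") + "/models"

def parse_model_ids(payload: dict) -> list[str]:
    """Pull model ids out of a `GET /models` response body."""
    return [m["id"] for m in payload.get("data", []) if "id" in m]

def list_models(gateway_url: str, timeout: float = 5.0) -> list[str]:
    """Query the endpoint; raises OSError (e.g. URLError) if nothing answers."""
    with urllib.request.urlopen(models_url(gateway_url), timeout=timeout) as resp:
        return parse_model_ids(json.load(resp))
```

Calling `list_models("http://localhost:8800/v1")` should return the model names the server offers, or raise an `OSError` subclass if nothing is listening on that port.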

Installation

Core (extraction only)

# Using uv (recommended)
uv tool install take-minutes

# Or with pip
pip install take-minutes

With semantic search

# Using uv
uv tool install "take-minutes[search]"

# Or with pip
pip install "take-minutes[search]"

One-line setup (downloads embedding model)

minutes setup

The setup command pre-downloads the embedding model (~420MB) so you aren't surprised by a mid-run download.

Usage

Process a single file

# Extract from Claude Code session
minutes process session.jsonl

# Extract from meeting transcript
minutes process meeting.txt -o ./my-minutes

# Skip deduplication check (force reprocess)
minutes process session.jsonl --no-dedup

# Verbose output for debugging
minutes process session.jsonl -v

Output:

output/YYYY-MM-DD-HH-MM-SS.md — structured knowledge in markdown
output/minutes.db — SQLite index with full-text search
output/sessions.json — metadata log for easy inspection
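
The SQLite index can in principle be queried from Python as well. The layout of minutes.db is not documented here, so the sketch below builds its own in-memory FTS5 table with a hypothetical `items(category, content)` schema, purely to show what a full-text query looks like. It assumes your Python's SQLite build includes FTS5, which most do.

```python
import sqlite3

# In-memory stand-in; table and column names are assumptions, not the real schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE items USING fts5(category, content)")
conn.executemany(
    "INSERT INTO items (category, content) VALUES (?, ?)",
    [
        ("decision", "Use SQLite instead of PostgreSQL for MVP"),
        ("idea", "Implement quiet hours to prevent 2am notifications"),
        ("term", "EDA = Event-Driven Architecture"),
    ],
)

def search(conn, query, category=None, limit=5):
    """Full-text search, optionally restricted to one category."""
    sql = "SELECT category, content FROM items WHERE items MATCH ?"
    params = [query]
    if category is not None:
        sql += " AND category = ?"
        params.append(category)
    sql += " ORDER BY rank LIMIT ?"  # FTS5's built-in relevance ordering
    params.append(limit)
    return conn.execute(sql, params).fetchall()
```

To explore the real database, inspect its tables first (for example with `.schema` in the `sqlite3` shell) and adapt the query accordingly.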

Batch process Claude Code sessions

Scan ~/.claude/projects/ and extract from all main-thread sessions:

# Process sessions from last 2 weeks, sorted by date (newest first)
minutes batch

# Filter by project key (substring match)
minutes batch --project persistence

# Change time range (ISO date or relative: 7d, 2w, 1m)
minutes batch --since 2w --sort size

# Dry run: show what would be processed
minutes batch --dry-run --min-size 100KB

# Skip embedding generation
minutes batch --no-embed

Output:

~/.claude/minutes/{project_key}/ — project-specific minutes
~/.claude/minutes/{project_key}/minutes.db — indexed extractions
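
The `--since` and `--min-size` values shown above follow simple shorthand. As a rough sketch of how such values could be interpreted (the function names are hypothetical, and treating `m` as 30 days and `KB` as 1024 bytes are assumptions, not documented behaviour):

```python
import re
from datetime import datetime, timedelta

_DAYS = {"d": 1, "w": 7, "m": 30}  # assumption: "m" means a 30-day month

def parse_since(value: str, now: datetime) -> datetime:
    """Turn '7d', '2w', '1m' or an ISO date into a cutoff datetime."""
    match = re.fullmatch(r"(\d+)([dwm])", value.strip().lower())
    if match:
        count, unit = int(match.group(1)), match.group(2)
        return now - timedelta(days=count * _DAYS[unit])
    return datetime.fromisoformat(value)  # raises ValueError on bad input

def parse_size(value: str) -> int:
    """Turn '100KB' or '2MB' into bytes (assumes 1 KB = 1024 bytes)."""
    match = re.fullmatch(r"(\d+)\s*(B|KB|MB|GB)?", value.strip().upper())
    if not match:
        raise ValueError(f"unrecognised size: {value!r}")
    factor = {None: 1, "B": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3}
    return int(match.group(1)) * factor[match.group(2)]
```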

Search across all processed sessions

# Keyword search
minutes search "budget decision"

# Filter by category (decision, idea, question, action_item, concept, term)
minutes search "authentication" --category decision

# Vector (semantic) search
minutes search "how do we handle failures?" --mode vector

# Hybrid (keyword + vector) search
minutes search "persistence strategy" --mode hybrid --limit 5

# Search specific project
minutes search "deployment" --project persistence

Returns ranked results with scores, context, and source session.
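
How hybrid mode merges keyword and vector results is not described here. One common technique for combining two ranked lists is reciprocal rank fusion (RRF), sketched below as an illustration; it is not necessarily what `minutes` does, and the item ids are made up.

```python
def reciprocal_rank_fusion(rankings, k=60, limit=5):
    """Merge ranked id lists; each appearance adds 1 / (k + rank) to the score."""
    scores = {}
    for ranking in rankings:
        for rank, item_id in enumerate(ranking, start=1):
            scores[item_id] = scores.get(item_id, 0.0) + 1.0 / (k + rank)
    # Highest score first; ties broken by id so the order is deterministic.
    ordered = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ordered[:limit]

keyword_hits = ["d12", "d7", "d3"]   # hypothetical ids from keyword search
vector_hits = ["d7", "d9", "d12"]    # hypothetical ids from vector search
fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
```

Items that appear near the top of both lists (`d7` and `d12` here) end up ahead of items found by only one search.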

View configuration

minutes config
minutes config --env

Shows active settings: gateway URL, model, output directory, chunking parameters.

Supported Formats

Claude Code JSONL (native) — ~/.claude/projects/*/session.jsonl
Plain text / Markdown — conversation transcripts, meeting notes
Coming soon: ChatGPT export, Codex CLI, Cline, Cursor

Auto-detection: the process command detects the format from the file extension or, failing that, from the content. Override it with --format:

minutes process transcript.srt --format text
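
A detection step of this kind can be sketched as follows. This is an assumed approach, not the tool's actual logic: check the extension first, then test whether the first line parses as a JSON object. The function name and the `jsonl` label are hypothetical (`text` is the value used with --format above).

```python
import json
from pathlib import Path

def detect_format(path, sample=None):
    """Guess 'jsonl' or 'text' from the extension, then from the first line."""
    suffix = Path(path).suffix.lower()
    if suffix == ".jsonl":
        return "jsonl"
    if suffix in {".txt", ".md"}:
        return "text"
    # Unknown extension: peek at the content instead.
    if sample is None:
        with open(path, encoding="utf-8", errors="replace") as fh:
            sample = fh.readline()
    try:
        first = json.loads(sample.strip().splitlines()[0])
    except (ValueError, IndexError):
        return "text"
    return "jsonl" if isinstance(first, dict) else "text"
```

An .srt file falls through to the content check and comes back as `text`, because its first line is a bare cue number, not a JSON object.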

Configuration

Create a .env file in your working directory:

# LLM backend
GATEWAY_URL=http://localhost:8800/v1
GATEWAY_MODEL=qwen3-4b

# Output
OUTPUT_DIR=./minutes-output/

# Chunking (for long transcripts)
MAX_CHUNK_SIZE=12000      # tokens per chunk
CHUNK_OVERLAP=200         # token overlap between chunks

# Retry logic
MAX_RETRIES=3

# Glossary (optional YAML file for cross-referencing)
GLOSSARY_PATH=./glossary.yaml

# Prompts (optional file paths for custom extraction prompts)
SYSTEM_PROMPT=./prompts/system.txt
EXTRACTION_PROMPT=./prompts/extraction.txt

# Debug
VERBOSE=true

View active config:

minutes config
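
If you want to read the same .env file from your own scripts, a small parser is enough. The sketch below is an assumption about how such a file can be read, not the loader `minutes` uses; it strips trailing `# ...` comments like the ones in the example above, which also means a value containing ` #` would be cut short.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blanks and comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.split(" #", 1)[0].strip()  # drop a trailing "  # comment"
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values

sample = """\
# LLM backend
GATEWAY_URL=http://localhost:8800/v1
GATEWAY_MODEL=qwen3-4b

# Chunking (for long transcripts)
MAX_CHUNK_SIZE=12000      # tokens per chunk
CHUNK_OVERLAP=200         # token overlap between chunks
"""

env = parse_env(sample)
max_chunk_size = int(env.get("MAX_CHUNK_SIZE", "12000"))
```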

Architecture

Input (JSONL / TXT / MD)
         ↓
    Parser (extract messages/text)
         ↓
   Chunker (split into LLM-friendly chunks)
         ↓
  LLM Extraction (local model extracts structured data)
         ↓
Deduplication (fuzzy match across chunks)
         ↓
  SQLite Index (FTS5 full-text search)
         ↓
Embeddings (sentence-transformers, optional)
         ↓
Semantic Search (FAISS + embeddings)
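
The Chunker stage can be illustrated with a sliding window over a token list. This is a minimal sketch under stated assumptions: the function name is hypothetical, the defaults mirror MAX_CHUNK_SIZE and CHUNK_OVERLAP from the configuration example, and how the real tool tokenizes text is not specified here.

```python
def chunk_tokens(tokens, max_chunk_size=12000, overlap=200):
    """Split a token list into windows that share `overlap` tokens."""
    if overlap >= max_chunk_size:
        raise ValueError("overlap must be smaller than max_chunk_size")
    step = max_chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + max_chunk_size])
        if start + max_chunk_size >= len(tokens):
            break  # this window already reaches the end of the input
    return chunks

# Whitespace splitting stands in for a real tokenizer in this example.
words = "alpha beta gamma delta epsilon zeta eta theta iota kappa".split()
windows = chunk_tokens(words, max_chunk_size=4, overlap=1)
```

Each window repeats the last token of the previous one, so a statement that straddles a boundary is still seen whole by at least one extraction call when the overlap is large enough.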

How It Works

Parse: Read input file (JSONL, plaintext, markdown) and extract dialogue or transcript text
Chunk: Split text into overlapping chunks to fit LLM context window (~12K tokens by default)
Extract: Send each chunk to local LLM with structured schema; collect decisions, ideas, questions, action items, concepts, terms
Deduplicate: Fuzzy-match extracted items across chunks; keep unique items
Index: Store in SQLite with FTS5 indexing for keyword search; optionally generate embeddings for semantic/hybrid search

Output is a single markdown file with all extractions, plus a queryable SQLite database.
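
The Deduplicate step can be approximated with the standard library. The sketch below uses `difflib.SequenceMatcher` with an arbitrary 0.85 similarity threshold; both the matcher and the threshold are assumptions, since the fuzzy-matching method the tool uses is not stated.

```python
from difflib import SequenceMatcher

def _normalize(text):
    """Lowercase and collapse whitespace before comparing."""
    return " ".join(text.lower().split())

def dedupe(items, threshold=0.85):
    """Keep the first occurrence of each group of near-identical strings."""
    kept = []
    for item in items:
        is_duplicate = any(
            SequenceMatcher(None, _normalize(item), _normalize(k)).ratio() >= threshold
            for k in kept
        )
        if not is_duplicate:
            kept.append(item)
    return kept

items = [
    "Use SQLite instead of PostgreSQL for MVP",
    "Use SQLite instead of PostgreSQL for the MVP",  # same decision, reworded
    "Implement quiet hours to prevent 2am notifications",
]
unique = dedupe(items)
```

Overlapping chunks tend to produce the same item twice with slightly different wording, which is why an exact string comparison would not be enough.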

Example Workflow

Scenario: Strategic planning sessions

Extract from session:

minutes process 2026-02-22-strategy.jsonl -o ./strategy-minutes

Review markdown:

cat ./strategy-minutes/2026-02-22-07-32-38.md

Cross-reference with glossary (if provided):

# Your GLOSSARY_PATH contains definitions for SRE, NIST AI RMF, EU AI Act
# Output markdown shows which extracted concepts are in glossary vs unknown

Batch process all recent sessions:

minutes batch --since 1m --project strategy --sort date

Search for related decisions across all sessions:

minutes search "idempotency" --category decision --mode hybrid

Requirements

Python 3.10+
Local LLM running on GATEWAY_URL (default: http://localhost:8800/v1)
Optional: Sentence Transformers for semantic search (pip install "take-minutes[search]")

Tips

Cold start: The first extraction takes longer because the model has to load. Subsequent runs are faster.
Large transcripts: Long inputs are chunked automatically; adjust MAX_CHUNK_SIZE if needed.
Private data: All processing is local; nothing is sent to external APIs unless you point GATEWAY_URL at an external provider.
Incremental indexing: Reprocessing the same file is skipped unless you pass --no-dedup.
Searching: Use --mode hybrid for best results (it combines keyword and semantic search).

License

MIT

Release history

0.4.0: Feb 25, 2026
0.3.0: Feb 25, 2026
0.2.0: Feb 24, 2026 (this version)
0.1.1: Feb 24, 2026
0.1.0: Feb 24, 2026

Download files

Source Distribution

take_minutes-0.2.0.tar.gz (42.8 kB)

Uploaded Feb 24, 2026 Source

Built Distribution

take_minutes-0.2.0-py3-none-any.whl (33.0 kB)

Uploaded Feb 24, 2026 Python 3

File details

Details for the file take_minutes-0.2.0.tar.gz.

File metadata

Download URL: take_minutes-0.2.0.tar.gz
Upload date: Feb 24, 2026
Size: 42.8 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for take_minutes-0.2.0.tar.gz
Algorithm	Hash digest
SHA256	`35052b6dd8fc3e41e05fbe6dcc65371e6d9f4129d5248b07dde5b62e1cc6ba1d`
MD5	`fa1e0117f7a847e9b27ba230448d51b2`
BLAKE2b-256	`28d53060c3e859324670e144a0c4c6d54df040fb1425a5bb0aefec7bb72cf89b`
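
To check a downloaded file against the SHA256 digest listed above, you can hash it locally with `hashlib`. The helper names below are hypothetical; the sketch reads the file in blocks so large archives do not have to fit in memory.

```python
import hashlib
import hmac

def sha256_of(path, block_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in 1 MiB blocks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for block in iter(lambda: fh.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()

def matches(path, expected_hex):
    """Compare the file's digest with the published one."""
    return hmac.compare_digest(sha256_of(path), expected_hex.strip().lower())
```

For the source distribution, `matches("take_minutes-0.2.0.tar.gz", "35052b6dd8fc3e41e05fbe6dcc65371e6d9f4129d5248b07dde5b62e1cc6ba1d")` should return `True` for an intact download.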

Provenance

The following attestation bundles were made for take_minutes-0.2.0.tar.gz:

Publisher: publish.yml on danieliser/take-minutes

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: take_minutes-0.2.0.tar.gz
- Subject digest: 35052b6dd8fc3e41e05fbe6dcc65371e6d9f4129d5248b07dde5b62e1cc6ba1d
- Sigstore transparency entry: 984986545
- Sigstore integration time: Feb 24, 2026
Source repository:
- Permalink: danieliser/take-minutes@5dcec9c9a5ddaf89ffaa2b90f34dc8b700f9a072
- Branch / Tag: refs/tags/v0.2.0
- Owner: https://github.com/danieliser
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@5dcec9c9a5ddaf89ffaa2b90f34dc8b700f9a072
- Trigger Event: push

File details

Details for the file take_minutes-0.2.0-py3-none-any.whl.

File metadata

Download URL: take_minutes-0.2.0-py3-none-any.whl
Upload date: Feb 24, 2026
Size: 33.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for take_minutes-0.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`90a1a6c6e2e8bab5c15de23910e9657348f4aa02e10bb0f488b58c3258ff25eb`
MD5	`9247466a71b808910a8bff2a471f2770`
BLAKE2b-256	`192641c86e53cebf85b66b4f0347ea1209df5ef4015854a7ec7c23fc6ef917e1`

Provenance

The following attestation bundles were made for take_minutes-0.2.0-py3-none-any.whl:

Publisher: publish.yml on danieliser/take-minutes

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: take_minutes-0.2.0-py3-none-any.whl
- Subject digest: 90a1a6c6e2e8bab5c15de23910e9657348f4aa02e10bb0f488b58c3258ff25eb
- Sigstore transparency entry: 984986547
- Sigstore integration time: Feb 24, 2026
Source repository:
- Permalink: danieliser/take-minutes@5dcec9c9a5ddaf89ffaa2b90f34dc8b700f9a072
- Branch / Tag: refs/tags/v0.2.0
- Owner: https://github.com/danieliser
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@5dcec9c9a5ddaf89ffaa2b90f34dc8b700f9a072
- Trigger Event: push

take-minutes 0.2.0
