Skip to main content

Log Claude Code sessions and shared issues to DuckDB + MotherDuck

Project description

claude-session-logger

Captures every Claude Code session into a local DuckDB file and syncs it to a cloud DuckDB (MotherDuck), so you can trace sessions over time — cost, tokens, duration, project, session type — plus on-demand structured log entries (findings, tasks, decisions, issues, notes) you ask Claude to record mid-session.

Local DuckDB is the source of truth. The cloud copy is a synced mirror. Sessions are logged automatically at session end; log entries are written on demand.

How it works

Claude Code session ends
        │
        ▼
SessionEnd hook (~/.claude/settings.json)
        │  passes session_id + transcript_path + cwd on stdin (JSON)
        ▼
log_session.py record --no-sync
        │  1. parse transcript JSONL → aggregate tokens by model, derive metadata
        │  2. compute cost from prices.json (incl. cache-write/read rates)
        │  3. UPSERT one row into LOCAL sessions table (synced = FALSE)
        ▼
launchctl kickstart …flush           (fires asynchronously, off the hot path)
        ▼
LaunchAgent → log_session.py flush
        │  push all synced = FALSE rows → MotherDuck, mark synced = TRUE
        ▼
sessions.duckdb  ──sync──▶  md:claude_sessions  (cloud)

Session end writes locally and returns fast (--no-sync); the cloud push runs out-of-band via a launchd LaunchAgent. This keeps session exit instant and avoids the script blocking on a slow/offline network. On the next SessionStart, the flush agent is kicked again to drain anything still queued, and a banner of recent sessions is printed.

Concurrency

DuckDB allows one read-write process or many read-only processes per file. Concurrent Claude sessions plus the flush LaunchAgent contend for the write lock, so db.connect() retries with exponential backoff on a transient Conflicting lock is held error. The startup banner (recent) connects read-only so it coexists with a writer and degrades silently if the file is missing, locked past retries, or has no tables yet.

Files

Code is a pip package under src/claude_session_logger/. Installing exposes three console commands: claude-session-logger (cli.py), claude-session-logger-setup (_setup.py), and claude-memory-sync (memory.py, the separate .md memory system).

File (src/claude_session_logger/) Purpose
cli.py CLI entry point. Subcommands: init, record, log, resolve, list, show, recent, inbox, flush.
db.py DuckDB connect (with lock retry), schema DDL + idempotent migrations, CRUD helpers, token resolution + self-heal, bidirectional sync(), inbox/meta helpers.
transcript.py Parse a session transcript .jsonl → tokens by model, models, timestamps, skills, message/tool counts, metadata.
cost.py Compute USD cost from per-model token usage and prices.json.
classify.py Infer session_type from skills used + project/cwd.
identity.py Resolve the current username ($CLAUDE_LOGGER_USERgit config user.name) for created_by/updated_by.
resolve.py Resolve the active session_id from the newest .jsonl in the cwd's project dir.
memory.py .md file-memory ↔ MotherDuck (md:claude_memory); push/pull/sync/search/get/hook.
_setup.py Post-install wiring: LaunchAgent + idempotent hook merge + token check.
prices.json Editable model → {input, output, cache_write, cache_read} per-1M-token rates.
sessions.duckdb Local buffer / source of truth (created at ~/.claude/claude-session-logger/, gitignored).
CLAUDE.md / DESIGN.md Contributor guide (incl. security guardrails) / as-built architecture.
tests/ pytest suite for each module + the CLI + setup.

Legacy note: examples below use claude-session-logger <cmd>; the equivalent python3 log_session.py <cmd> still works from a source checkout.

Setup

  1. Install (pulls in DuckDB, pinned to 1.5.3 — MotherDuck's max):

    pip install .          # or `pip install -e .` for a dev/editable install
    

    This puts three commands on your PATH: claude-session-logger, claude-session-logger-setup, claude-memory-sync.

    Quick path: after the token step below, run claude-session-logger-setup — it creates the local schema, writes the flush LaunchAgent, and merges the hooks into ~/.claude/settings.json (idempotent; backs up settings.json.bak; migrates old script-path hooks). Steps 3–5 below are what it automates, for reference.

  2. MotherDuck token — cloud sync resolves a token from, in order: the motherduck_token env var, MOTHERDUCK_TOKEN, then the file ~/.config/motherduck/token (chmod 600). That path is MotherDuck's own credential store — the motherduck CLI / device-auth writes it, so the scripts, the flush LaunchAgent, and official tooling all read one file (no second copy to drift out of sync on rotation). The file matters because launchd does not inherit your shell exports — the flush LaunchAgent relies on it, so without it sync would fall back to interactive browser device-auth (and hang). If a stale shell/uppercase token outranks a freshly rotated file token, the ATTACH self-heals by retrying once with the file token. Without any token, local writes still work and sync raises a clean error (logged to sync.log, retried later).

  3. Create the schema:

    claude-session-logger init
    
  4. Wire the hooks into ~/.claude/settings.json (merge — don't replace existing hooks). claude-session-logger-setup writes these for you; shown here for reference (commands invoke python -m claude_session_logger.cli … so launchd's minimal PATH still resolves the package):

    "SessionStart": [
      { "hooks": [
        { "type": "command", "command": "python3 -m claude_session_logger.cli recent" },
        { "type": "command", "command": "python3 -m claude_session_logger.cli inbox --count" },
        { "type": "command", "command": "launchctl kickstart gui/$(id -u)/com.keithfajardo.claude-session-logger.flush" },
        { "type": "command", "command": "python3 -m claude_session_logger.memory pull 2>&1 || true" }
      ]}
    ],
    "PostToolUse": [
      { "matcher": "Write|Edit", "hooks": [
        { "type": "command", "command": "python3 -m claude_session_logger.memory hook 2>/dev/null || true", "async": true }
      ]}
    ],
    "SessionEnd": [
      { "hooks": [
        { "type": "command", "command": "python3 -m claude_session_logger.cli record --no-sync; launchctl kickstart gui/$(id -u)/com.keithfajardo.claude-session-logger.flush" }
      ]}
    ]
    

    The inbox --count line prints the shared-issue badge (silent when nothing is unseen).

  5. LaunchAgent (~/Library/LaunchAgents/com.keithfajardo.claude-session-logger.flush.plist) runs claude-session-logger flush when kickstarted, pushing queued rows to the cloud off the session-end hot path. Since flush does a bidirectional log_entries merge, this is also what pulls other users' resolves/reopens down to this device. The plist embeds no token — the code resolves it from the token file in-process.

  6. Username (optional) — set CLAUDE_LOGGER_USER in the hook environment to control the name recorded in created_by/updated_by. If unset, it falls back to git config user.name. Needed only if you share the issue table.

Usage

# Create local + cloud schema (idempotent)
python3 log_session.py init

# Record a session (reads hook JSON on stdin). --no-sync = local write only.
python3 log_session.py record [--no-sync]

# Flush queued (synced = FALSE) rows to the cloud
python3 log_session.py flush

# Print the most recent sessions banner (read-only)
python3 log_session.py recent [--limit 3]

# Inbox of shared-issue changes by other users
python3 log_session.py inbox            # list changes, clear the badge
python3 log_session.py inbox --count    # badge only (read-only; used by SessionStart)

On-demand log entries

Tell Claude in plain language and it runs the right command. session_id auto-resolves from the active transcript, so this works from any repo.

What to say → what runs:

Say Runs
"log an issue: <title> — <details>" log --category issue --status open …
"log a finding: <title>" log --category finding --status open …
"log a task: <title>" log --category task --status open …
"log a decision: <title>" log --category decision --status info …
"log a note: <title>" log --category note --status info …
"resolve the <words> issue" resolve --title-match "<words>" --status resolved
"resolve <id> [because …]" resolve --id <id> --status resolved [--note …]
"set <id> to blocked / in progress" resolve --id <id> --status <status>
"list open issues" / "what did I log" list [--category …] [--open] …
"check my inbox" / "what changed" inbox (lists others' changes, clears the badge)

Default status: issue/finding/task → open; decision/note → info. Add "… status blocked" (etc.) to override. The headline goes in --title, extra context in --body.

# Add an entry. session_id auto-resolved from the active transcript if omitted.
# Prints the new id: "logged issue (open) id=<uuid>" — keep it to resolve later.
python3 log_session.py log \
  --category issue --status open \
  --title "staging model double-counts refunds" \
  --body "details…" [--session-id <id>]

# Resolve / update an existing entry (appends a note to the body if given).
# By words — no UUID needed; must match exactly one entry, else it lists candidates.
python3 log_session.py resolve --title-match "double-counts refunds" --status resolved \
  [--note "fixed in PR #42"]
# Or by exact id
python3 log_session.py resolve --id <entry-id> --status resolved [--note "fixed in PR #42"]

# List entries, newest first (by updated_at, falling back to logged_at).
# Use this to grab an id when a title match is ambiguous.
python3 log_session.py list [--category issue] [--status open] [--open] [--limit N]

# Show ONE entry's full body (read-only). By id / id prefix, or title words.
# Use this instead of a hand-written duckdb query — `list` omits the body.
python3 log_session.py show <id-or-prefix>
python3 log_session.py show --title-match "double-counts refunds"

--open excludes status = 'resolved' entries. show accepts a short id prefix (e.g. b045dd70) and prints the id/type/status/title/timestamps/body; if a title match is ambiguous it lists the candidates instead. resolve takes either --title-match (resolve by words; errors if it matches 0 or >1 entries) or --id (exact).

Categories: finding | task | decision | issue | note Statuses: open | in_progress | blocked | resolved | info

Sharing the issue table (multi-user)

The log_entries table is shared and bidirectional: install the logger on a second device/account and both people see and resolve the same issues. sessions stay private (push-only) — only log_entries is pulled back down.

Identity. Every entry records who created/changed it. Username resolves from $CLAUDE_LOGGER_USERgit config user.name"unknown":

  • created_by — who logged the entry.
  • updated_by — who last changed its status (e.g. who resolved it).

Sync model. flush/record run a last-write-wins merge (newest updated_at wins) of log_entries between local and md:claude_sessions. So Jane can resolve an issue Keith opened; on Keith's next sync the resolve lands locally with updated_by = "Jane". Concurrent edits to the same entry: latest timestamp wins (low-contention assumption — like the rest of the system).

Inbox — change notifications

When someone else resolves or reopens an issue, you get a count badge at session start, then read the details on demand.

# Badge (wired into the SessionStart hook): read-only, prints e.g.
#   📬 2 shared issue update(s) — run: …log_session.py inbox
# Silent when nothing is unseen. Never blocks; reads local state only.
python3 log_session.py inbox --count

# Inbox view: lists changes made by others since you last looked,
# then advances the "seen" watermark (clears the badge).
python3 log_session.py inbox

A change counts as unseen if COALESCE(updated_at, logged_at) is newer than the local watermark and the actor (updated_by, else created_by) isn't you. The watermark lives in a local-only meta table — it is not synced, so "seen" is per-device. The badge reads whatever the last sync pulled down (kept current by the flush LaunchAgent), so it never makes a network call on the session-start hot path.

Real-time push (email/Slack) is intentionally out of scope — the badge + inbox is a pull-and-diff, no extra infrastructure.

Pulling memory (local & remote)

"Memory" spans two separate systems. Know which one you want before pulling.

System A — Session memory (this repo) System B — File-based memory (sibling)
What Sessions + on-demand log entries Markdown memory files Claude writes
Local store ~/.claude/claude-session-logger/sessions.duckdb ~/.claude/projects/<slug>/memory/*.md
Remote store MotherDuck md:claude_sessions MotherDuck md:claude_memory
Tool log_session.py (this repo) ~/.claude/scripts/memory_sync.py

System A — session memory (this repo)

Pull local — read straight from sessions.duckdb:

# On-demand log entries (findings/tasks/decisions/issues/notes), newest first
python3 log_session.py list [--category finding] [--status open] [--open] [--limit N]

# Recent sessions banner
python3 log_session.py recent [--limit 5]

# Or query the local file directly
duckdb ~/.claude/claude-session-logger/sessions.duckdb \
  "SELECT session_date, project, cost_usd, total_tokens FROM sessions ORDER BY ended_at DESC LIMIT 10"

Pull remote — query the MotherDuck mirror (needs MOTHERDUCK_TOKEN):

duckdb "md:?motherduck_token=$MOTHERDUCK_TOKEN" \
  "SELECT category, status, title, logged_at
   FROM claude_sessions.log_entries
   WHERE title ILIKE '%refund%' ORDER BY logged_at DESC"

Local is the source of truth; remote is the synced mirror, useful from another device. Both hold the same sessions + log_entries tables. sessions syncs up only (private); log_entries syncs both ways (shared — see Sharing the issue table).

System B — file-based memory (memory_sync.py)

Separate system: the .md memory files Claude maintains per project, mirrored to MotherDuck md:claude_memory for cross-device access. It lives in ~/.claude/scripts/, not in this repo. Always use the script — never hand-write inline duckdb queries (token-wasteful, fragile).

# Pull remote — grep memory by content (current project). Matching lines only.
python3 ~/.claude/scripts/memory_sync.py search "<thing>"
python3 ~/.claude/scripts/memory_sync.py search "<thing>" --project-key <repo-name>
python3 ~/.claude/scripts/memory_sync.py search "<thing>" --all-projects   # every project
python3 ~/.claude/scripts/memory_sync.py search "<thing>" --full           # whole files

# Pull remote — print one file's full content (name substring match)
python3 ~/.claude/scripts/memory_sync.py get <name-fragment> [--project-key <repo-name>]

# Refresh local .md files from remote (newer wins), or reconcile both ways
python3 ~/.claude/scripts/memory_sync.py pull
python3 ~/.claude/scripts/memory_sync.py sync

search/get are read-only (no local write). Workflow: search to find the file

  • matching lines, then get only if the full file is needed. Local .md files are read directly by the Claude Code harness; pull/sync keep them current across devices.

Schema

Two synced tables plus a local-only meta table. The synced column is local only (tracks what still needs pushing); it is carried into the cloud copy but only meaningful locally.

sessions (one row per session, auto at session end)

session_id (PK), project, cwd, session_date, started_at, ended_at, duration_min, models, input_tokens, output_tokens, cache_write_tokens, cache_read_tokens, total_tokens, cost_usd, message_count, tool_call_count, skills_used, git_branch, cc_version, title, session_type, created_by, synced.

log_entries (on-demand, shared)

id (PK, UUID), session_id, logged_at, category, status, title, body, synced, updated_at (NULL until first resolve/update), created_by, updated_by. create_tables() runs idempotent ADD COLUMN IF NOT EXISTS migrations (updated_at, created_by, updated_by) for older DBs.

meta (local only — not synced)

key (PK), value. Holds the per-device inbox_watermark (last time you cleared the inbox). Never created cloud-side.

All timestamps are stored in UTC.

Column order note: sync uses explicit column lists, never SELECT *. ADD COLUMN appends physically at the end, so a migrated table and a freshly created one can have different column positions — positional SELECT * across local↔cloud would silently misalign (e.g. map created_by onto synced).

Cost model

Per assistant message, token usage is accumulated per model (input, output, cache_creationcache_write, cache_read), de-duplicated by message uuid/id. Cost:

cost_usd = Σ_model ( input·p.input + output·p.output
                   + cache_write·p.cache_write + cache_read·p.cache_read ) / 1e6

Rates live in prices.json (per 1M tokens). Seeded values:

model input output cache_write cache_read
claude-opus-4-8 5.00 25.00 6.25 0.50
claude-sonnet-4-6 3.00 15.00 3.75 0.30
claude-haiku-4-5 1.00 5.00 1.25 0.10

An unknown model contributes 0 cost and is logged to sync.log so it can be added.

session_type inference (customizable)

Built-in defaults (classify.py), first match wins:

  1. skill mentions dbt/kimball, or project/cwd mentions dbt → dbt
  2. skill systematic-debuggingdebugging
  3. skill deep-research/brainstorming/researchresearch
  4. skill writing-plans/executing-plansplanning
  5. else → general

Add your own without touching code: drop a YAML file at ~/.claude/claude-session-logger/session_types.yaml (copy session_types.example.yaml). Your rules are checked before the defaults, so they can add a new type or override a built-in one:

- type: infra
  skills: [terraform, k8s]      # case-insensitive substrings of the session's skills
  path: [ops/, infra/]          # …or of "<project> <cwd>"

Loading is hook-safe — a missing file, malformed YAML, or absent PyYAML silently falls back to the defaults, so session-end never breaks.

Reliability

  • Cloud unreachable / token missing → local write succeeds, sync fails silently, logged to sync.log, retried on the next flush.
  • Lock contentionconnect() retries with backoff; the read-only banner degrades silently.
  • Malformed transcript line → skipped (counted, not fatal).

Tests

python3 -m pytest

Limitations

  • Single active session per directory assumption: resolve.py picks the newest-mtime .jsonl in the cwd's project dir. Concurrent sessions in the same directory can mis-attribute an on-demand log entry.
  • No per-message conversation storage (by design — avoids dumping file contents / secrets into the cloud).
  • Hooks and the LaunchAgent path are macOS / launchd specific.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

claude_session_logger-0.1.0.tar.gz (43.2 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

claude_session_logger-0.1.0-py3-none-any.whl (30.9 kB view details)

Uploaded Python 3

File details

Details for the file claude_session_logger-0.1.0.tar.gz.

File metadata

  • Download URL: claude_session_logger-0.1.0.tar.gz
  • Upload date:
  • Size: 43.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.8

File hashes

Hashes for claude_session_logger-0.1.0.tar.gz
Algorithm Hash digest
SHA256 d5e02422e77e017c3b5b0f1f3732320cfac25ccf2c1cae29aaec254b3854da72
MD5 dfbebaf0b41d45acc91b59972894c26c
BLAKE2b-256 fb19dc876ac3e78166524365834c5efe4b6ef4f7504cc9b45487e263e642fcca

See more details on using hashes here.

File details

Details for the file claude_session_logger-0.1.0-py3-none-any.whl.

File metadata

File hashes

Hashes for claude_session_logger-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 6b89e35f5d70302d1bc5cac9ae10869a01d59cc34c11f3ae9fcc96b84576de10
MD5 300b499947f8bf82d0c1d6a3ea957789
BLAKE2b-256 b3f5a1bf46895bbf7179480e5d589294255e4c51b2786c434132014cfafc1deb

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page