Log Claude Code sessions and shared issues to DuckDB + MotherDuck
Project description
claude-session-logger
Captures every Claude Code session into a local DuckDB file and syncs it to a cloud DuckDB (MotherDuck), so you can trace sessions over time — cost, tokens, duration, project, session type — plus on-demand structured log entries (findings, tasks, decisions, issues, notes) you ask Claude to record mid-session.
Local DuckDB is the source of truth. The cloud copy is a synced mirror. Sessions are logged automatically at session end; log entries are written on demand.
How it works
Claude Code session ends
│
▼
SessionEnd hook (~/.claude/settings.json)
│ passes session_id + transcript_path + cwd on stdin (JSON)
▼
log_session.py record --no-sync
│ 1. parse transcript JSONL → aggregate tokens by model, derive metadata
│ 2. compute cost from prices.json (incl. cache-write/read rates)
│ 3. UPSERT one row into LOCAL sessions table (synced = FALSE)
▼
launchctl kickstart …flush (fires asynchronously, off the hot path)
▼
LaunchAgent → log_session.py flush
│ push all synced = FALSE rows → MotherDuck, mark synced = TRUE
▼
sessions.duckdb ──sync──▶ md:claude_sessions (cloud)
Session end writes locally and returns fast (--no-sync); the cloud push runs
out-of-band via a launchd LaunchAgent. This keeps session exit instant and avoids
the script blocking on a slow/offline network. On the next SessionStart, the
flush agent is kicked again to drain anything still queued, and a banner of recent
sessions is printed.
Concurrency
DuckDB allows one read-write process or many read-only processes per file.
Concurrent Claude sessions plus the flush LaunchAgent contend for the write lock,
so db.connect() retries with exponential backoff on a transient
Conflicting lock is held error. The startup banner (recent) connects
read-only so it coexists with a writer and degrades silently if the file is
missing, locked past retries, or has no tables yet.
Files
Code is a pip package under src/claude_session_logger/. Installing exposes three
console commands: claude-session-logger (cli.py),
claude-session-logger-setup (_setup.py), and claude-memory-sync
(memory.py, the separate .md memory system).
File (src/claude_session_logger/) |
Purpose |
|---|---|
cli.py |
CLI entry point. Subcommands: init, record, log, resolve, list, show, recent, inbox, flush. |
db.py |
DuckDB connect (with lock retry), schema DDL + idempotent migrations, CRUD helpers, token resolution + self-heal, bidirectional sync(), inbox/meta helpers. |
transcript.py |
Parse a session transcript .jsonl → tokens by model, models, timestamps, skills, message/tool counts, metadata. |
cost.py |
Compute USD cost from per-model token usage and prices.json. |
classify.py |
Infer session_type from skills used + project/cwd. |
identity.py |
Resolve the current username ($CLAUDE_LOGGER_USER → git config user.name) for created_by/updated_by. |
resolve.py |
Resolve the active session_id from the newest .jsonl in the cwd's project dir. |
memory.py |
.md file-memory ↔ MotherDuck (md:claude_memory); push/pull/sync/search/get/hook. |
_setup.py |
Post-install wiring: LaunchAgent + idempotent hook merge + token check. |
prices.json |
Editable model → {input, output, cache_write, cache_read} per-1M-token rates. |
sessions.duckdb |
Local buffer / source of truth (created at ~/.claude/claude-session-logger/, gitignored). |
CLAUDE.md / DESIGN.md |
Contributor guide (incl. security guardrails) / as-built architecture. |
tests/ |
pytest suite for each module + the CLI + setup. |
Legacy note: examples below use
claude-session-logger <cmd>; the equivalentpython3 log_session.py <cmd>still works from a source checkout.
Setup
-
Install (pulls in DuckDB, pinned to 1.5.3 — MotherDuck's max):
pip install . # or `pip install -e .` for a dev/editable install
This puts three commands on your PATH:
claude-session-logger,claude-session-logger-setup,claude-memory-sync.Quick path: after the token step below, run
claude-session-logger-setup— it creates the local schema, writes the flush LaunchAgent, and merges the hooks into~/.claude/settings.json(idempotent; backs upsettings.json.bak; migrates old script-path hooks). Steps 3–5 below are what it automates, for reference. -
MotherDuck token — cloud sync resolves a token from, in order: the
motherduck_tokenenv var,MOTHERDUCK_TOKEN, then the file~/.config/motherduck/token(chmod 600). That path is MotherDuck's own credential store — themotherduckCLI / device-auth writes it, so the scripts, the flush LaunchAgent, and official tooling all read one file (no second copy to drift out of sync on rotation). The file matters because launchd does not inherit your shell exports — the flush LaunchAgent relies on it, so without it sync would fall back to interactive browser device-auth (and hang). If a stale shell/uppercase token outranks a freshly rotated file token, the ATTACH self-heals by retrying once with the file token. Without any token, local writes still work and sync raises a clean error (logged tosync.log, retried later). -
Create the schema:
claude-session-logger init -
Wire the hooks into
~/.claude/settings.json(merge — don't replace existing hooks).claude-session-logger-setupwrites these for you; shown here for reference (commands invokepython -m claude_session_logger.cli …so launchd's minimal PATH still resolves the package):"SessionStart": [ { "hooks": [ { "type": "command", "command": "python3 -m claude_session_logger.cli recent" }, { "type": "command", "command": "python3 -m claude_session_logger.cli inbox --count" }, { "type": "command", "command": "launchctl kickstart gui/$(id -u)/com.keithfajardo.claude-session-logger.flush" }, { "type": "command", "command": "python3 -m claude_session_logger.memory pull 2>&1 || true" } ]} ], "PostToolUse": [ { "matcher": "Write|Edit", "hooks": [ { "type": "command", "command": "python3 -m claude_session_logger.memory hook 2>/dev/null || true", "async": true } ]} ], "SessionEnd": [ { "hooks": [ { "type": "command", "command": "python3 -m claude_session_logger.cli record --no-sync; launchctl kickstart gui/$(id -u)/com.keithfajardo.claude-session-logger.flush" } ]} ]The
inbox --countline prints the shared-issue badge (silent when nothing is unseen). -
LaunchAgent (
~/Library/LaunchAgents/com.keithfajardo.claude-session-logger.flush.plist) runsclaude-session-logger flushwhen kickstarted, pushing queued rows to the cloud off the session-end hot path. Sinceflushdoes a bidirectionallog_entriesmerge, this is also what pulls other users' resolves/reopens down to this device. The plist embeds no token — the code resolves it from the token file in-process. -
Username (optional) — set
CLAUDE_LOGGER_USERin the hook environment to control the name recorded increated_by/updated_by. If unset, it falls back togit config user.name. Needed only if you share the issue table.
Usage
# Create local + cloud schema (idempotent)
python3 log_session.py init
# Record a session (reads hook JSON on stdin). --no-sync = local write only.
python3 log_session.py record [--no-sync]
# Flush queued (synced = FALSE) rows to the cloud
python3 log_session.py flush
# Print the most recent sessions banner (read-only)
python3 log_session.py recent [--limit 3]
# Inbox of shared-issue changes by other users
python3 log_session.py inbox # list changes, clear the badge
python3 log_session.py inbox --count # badge only (read-only; used by SessionStart)
On-demand log entries
Tell Claude in plain language and it runs the right command. session_id auto-resolves from the active transcript, so this works from any repo.
What to say → what runs:
| Say | Runs |
|---|---|
| "log an issue: <title> — <details>" | log --category issue --status open … |
| "log a finding: <title>" | log --category finding --status open … |
| "log a task: <title>" | log --category task --status open … |
| "log a decision: <title>" | log --category decision --status info … |
| "log a note: <title>" | log --category note --status info … |
| "resolve the <words> issue" | resolve --title-match "<words>" --status resolved |
| "resolve <id> [because …]" | resolve --id <id> --status resolved [--note …] |
| "set <id> to blocked / in progress" | resolve --id <id> --status <status> |
| "list open issues" / "what did I log" | list [--category …] [--open] … |
| "check my inbox" / "what changed" | inbox (lists others' changes, clears the badge) |
Default status: issue/finding/task → open; decision/note → info. Add "… status
blocked" (etc.) to override. The headline goes in --title, extra context in
--body.
# Add an entry. session_id auto-resolved from the active transcript if omitted.
# Prints the new id: "logged issue (open) id=<uuid>" — keep it to resolve later.
python3 log_session.py log \
--category issue --status open \
--title "staging model double-counts refunds" \
--body "details…" [--session-id <id>]
# Resolve / update an existing entry (appends a note to the body if given).
# By words — no UUID needed; must match exactly one entry, else it lists candidates.
python3 log_session.py resolve --title-match "double-counts refunds" --status resolved \
[--note "fixed in PR #42"]
# Or by exact id
python3 log_session.py resolve --id <entry-id> --status resolved [--note "fixed in PR #42"]
# List entries, newest first (by updated_at, falling back to logged_at).
# Use this to grab an id when a title match is ambiguous.
python3 log_session.py list [--category issue] [--status open] [--open] [--limit N]
# Show ONE entry's full body (read-only). By id / id prefix, or title words.
# Use this instead of a hand-written duckdb query — `list` omits the body.
python3 log_session.py show <id-or-prefix>
python3 log_session.py show --title-match "double-counts refunds"
--open excludes status = 'resolved' entries. show accepts a short id prefix
(e.g. b045dd70) and prints the id/type/status/title/timestamps/body; if a title
match is ambiguous it lists the candidates instead. resolve takes either
--title-match (resolve by words; errors if it matches 0 or >1 entries) or
--id (exact).
Categories: finding | task | decision | issue | note
Statuses: open | in_progress | blocked | resolved | info
Sharing the issue table (multi-user)
The log_entries table is shared and bidirectional: install the logger on a
second device/account and both people see and resolve the same issues. sessions
stay private (push-only) — only log_entries is pulled back down.
Identity. Every entry records who created/changed it. Username resolves from
$CLAUDE_LOGGER_USER → git config user.name → "unknown":
created_by— who logged the entry.updated_by— who last changed its status (e.g. who resolved it).
Sync model. flush/record run a last-write-wins merge (newest
updated_at wins) of log_entries between local and md:claude_sessions. So
Jane can resolve an issue Keith opened; on Keith's next sync the resolve lands
locally with updated_by = "Jane". Concurrent edits to the same entry: latest
timestamp wins (low-contention assumption — like the rest of the system).
Inbox — change notifications
When someone else resolves or reopens an issue, you get a count badge at session start, then read the details on demand.
# Badge (wired into the SessionStart hook): read-only, prints e.g.
# 📬 2 shared issue update(s) — run: …log_session.py inbox
# Silent when nothing is unseen. Never blocks; reads local state only.
python3 log_session.py inbox --count
# Inbox view: lists changes made by others since you last looked,
# then advances the "seen" watermark (clears the badge).
python3 log_session.py inbox
A change counts as unseen if COALESCE(updated_at, logged_at) is newer than the
local watermark and the actor (updated_by, else created_by) isn't you. The
watermark lives in a local-only meta table — it is not synced, so "seen" is
per-device. The badge reads whatever the last sync pulled down (kept current by
the flush LaunchAgent), so it never makes a network call on the session-start hot
path.
Real-time push (email/Slack) is intentionally out of scope — the badge + inbox is a pull-and-diff, no extra infrastructure.
Pulling memory (local & remote)
"Memory" spans two separate systems. Know which one you want before pulling.
| System A — Session memory (this repo) | System B — File-based memory (sibling) | |
|---|---|---|
| What | Sessions + on-demand log entries | Markdown memory files Claude writes |
| Local store | ~/.claude/claude-session-logger/sessions.duckdb |
~/.claude/projects/<slug>/memory/*.md |
| Remote store | MotherDuck md:claude_sessions |
MotherDuck md:claude_memory |
| Tool | log_session.py (this repo) |
~/.claude/scripts/memory_sync.py |
System A — session memory (this repo)
Pull local — read straight from sessions.duckdb:
# On-demand log entries (findings/tasks/decisions/issues/notes), newest first
python3 log_session.py list [--category finding] [--status open] [--open] [--limit N]
# Recent sessions banner
python3 log_session.py recent [--limit 5]
# Or query the local file directly
duckdb ~/.claude/claude-session-logger/sessions.duckdb \
"SELECT session_date, project, cost_usd, total_tokens FROM sessions ORDER BY ended_at DESC LIMIT 10"
Pull remote — query the MotherDuck mirror (needs MOTHERDUCK_TOKEN):
duckdb "md:?motherduck_token=$MOTHERDUCK_TOKEN" \
"SELECT category, status, title, logged_at
FROM claude_sessions.log_entries
WHERE title ILIKE '%refund%' ORDER BY logged_at DESC"
Local is the source of truth; remote is the synced mirror, useful from another
device. Both hold the same sessions + log_entries tables. sessions syncs
up only (private); log_entries syncs both ways (shared — see
Sharing the issue table).
System B — file-based memory (memory_sync.py)
Separate system: the .md memory files Claude maintains per project, mirrored to
MotherDuck md:claude_memory for cross-device access. It lives in
~/.claude/scripts/, not in this repo. Always use the script — never
hand-write inline duckdb queries (token-wasteful, fragile).
# Pull remote — grep memory by content (current project). Matching lines only.
python3 ~/.claude/scripts/memory_sync.py search "<thing>"
python3 ~/.claude/scripts/memory_sync.py search "<thing>" --project-key <repo-name>
python3 ~/.claude/scripts/memory_sync.py search "<thing>" --all-projects # every project
python3 ~/.claude/scripts/memory_sync.py search "<thing>" --full # whole files
# Pull remote — print one file's full content (name substring match)
python3 ~/.claude/scripts/memory_sync.py get <name-fragment> [--project-key <repo-name>]
# Refresh local .md files from remote (newer wins), or reconcile both ways
python3 ~/.claude/scripts/memory_sync.py pull
python3 ~/.claude/scripts/memory_sync.py sync
search/get are read-only (no local write). Workflow: search to find the file
- matching lines, then
getonly if the full file is needed. Local.mdfiles are read directly by the Claude Code harness;pull/synckeep them current across devices.
Schema
Two synced tables plus a local-only meta table. The synced column is local
only (tracks what still needs pushing); it is carried into the cloud copy but
only meaningful locally.
sessions (one row per session, auto at session end)
session_id (PK), project, cwd, session_date, started_at, ended_at,
duration_min, models, input_tokens, output_tokens, cache_write_tokens,
cache_read_tokens, total_tokens, cost_usd, message_count,
tool_call_count, skills_used, git_branch, cc_version, title,
session_type, created_by, synced.
log_entries (on-demand, shared)
id (PK, UUID), session_id, logged_at, category, status, title, body,
synced, updated_at (NULL until first resolve/update), created_by,
updated_by. create_tables() runs idempotent ADD COLUMN IF NOT EXISTS
migrations (updated_at, created_by, updated_by) for older DBs.
meta (local only — not synced)
key (PK), value. Holds the per-device inbox_watermark (last time you cleared
the inbox). Never created cloud-side.
All timestamps are stored in UTC.
Column order note: sync uses explicit column lists, never
SELECT *.ADD COLUMNappends physically at the end, so a migrated table and a freshly created one can have different column positions — positionalSELECT *across local↔cloud would silently misalign (e.g. mapcreated_byontosynced).
Cost model
Per assistant message, token usage is accumulated per model
(input, output, cache_creation → cache_write, cache_read), de-duplicated
by message uuid/id. Cost:
cost_usd = Σ_model ( input·p.input + output·p.output
+ cache_write·p.cache_write + cache_read·p.cache_read ) / 1e6
Rates live in prices.json (per 1M tokens). Seeded values:
| model | input | output | cache_write | cache_read |
|---|---|---|---|---|
| claude-opus-4-8 | 5.00 | 25.00 | 6.25 | 0.50 |
| claude-sonnet-4-6 | 3.00 | 15.00 | 3.75 | 0.30 |
| claude-haiku-4-5 | 1.00 | 5.00 | 1.25 | 0.10 |
An unknown model contributes 0 cost and is logged to sync.log so it can be added.
session_type inference (customizable)
Built-in defaults (classify.py), first match wins:
- skill mentions
dbt/kimball, or project/cwd mentions dbt →dbt - skill
systematic-debugging→debugging - skill
deep-research/brainstorming/research→research - skill
writing-plans/executing-plans→planning - else →
general
Add your own without touching code: drop a YAML file at
~/.claude/claude-session-logger/session_types.yaml (copy
session_types.example.yaml). Your rules are checked before the defaults, so they
can add a new type or override a built-in one:
- type: infra
skills: [terraform, k8s] # case-insensitive substrings of the session's skills
path: [ops/, infra/] # …or of "<project> <cwd>"
Loading is hook-safe — a missing file, malformed YAML, or absent PyYAML silently falls back to the defaults, so session-end never breaks.
Reliability
- Cloud unreachable / token missing → local write succeeds, sync fails silently,
logged to
sync.log, retried on the nextflush. - Lock contention →
connect()retries with backoff; the read-only banner degrades silently. - Malformed transcript line → skipped (counted, not fatal).
Tests
python3 -m pytest
Limitations
- Single active session per directory assumption:
resolve.pypicks the newest-mtime.jsonlin the cwd's project dir. Concurrent sessions in the same directory can mis-attribute an on-demand log entry. - No per-message conversation storage (by design — avoids dumping file contents / secrets into the cloud).
- Hooks and the LaunchAgent path are macOS / launchd specific.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file claude_session_logger-0.1.0.tar.gz.
File metadata
- Download URL: claude_session_logger-0.1.0.tar.gz
- Upload date:
- Size: 43.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.8
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
d5e02422e77e017c3b5b0f1f3732320cfac25ccf2c1cae29aaec254b3854da72
|
|
| MD5 |
dfbebaf0b41d45acc91b59972894c26c
|
|
| BLAKE2b-256 |
fb19dc876ac3e78166524365834c5efe4b6ef4f7504cc9b45487e263e642fcca
|
File details
Details for the file claude_session_logger-0.1.0-py3-none-any.whl.
File metadata
- Download URL: claude_session_logger-0.1.0-py3-none-any.whl
- Upload date:
- Size: 30.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.8
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
6b89e35f5d70302d1bc5cac9ae10869a01d59cc34c11f3ae9fcc96b84576de10
|
|
| MD5 |
300b499947f8bf82d0c1d6a3ea957789
|
|
| BLAKE2b-256 |
b3f5a1bf46895bbf7179480e5d589294255e4c51b2786c434132014cfafc1deb
|