Measure agentic coding performance using the Thread-Based Engineering framework

Oh My Agentic Score

Measure and visualize your agentic coding performance

Based on IndyDevDan's Thread-Based Engineering framework


"If you can't measure it, you can't improve it."

In the age of AI-assisted development, agentic coding (the ability to collaborate effectively with AI agents) is becoming an essential skill for every developer. But how do you know whether you're actually getting better at it?

Oh My Agentic Score (OMAS) was born from this belief. Inspired by IndyDevDan's brilliant Thread-Based Engineering framework, this project provides a concrete, data-driven way to measure and visualize your agentic coding performance.

OMAS analyzes your Claude Code session logs and scores your performance across four dimensions: parallelism, autonomy, density, and trust. Track your growth from basic conversations to fully autonomous Z-threads — and push yourself to become a better agentic developer.

Installation

One-line install (recommended)

curl -fsSL https://raw.githubusercontent.com/HwangTaehyun/oh-my-agentic-score/main/install.sh | bash

Homebrew (macOS)

brew install HwangTaehyun/tap/oh-my-agentic-score

pip / uv

pip install oh-my-agentic-score

# or with uv (faster)
uv tool install oh-my-agentic-score

Quick Start

# Scan all Claude Code sessions
omas scan

# View your report
omas report

# Launch interactive dashboard
omas dashboard

Features

Four-Dimension Scoring (0-10 scale)

Dimension  Thread    What It Measures
More       P-thread  Concurrent sessions running simultaneously (cross-session parallelism)
Longer     L-thread  Autonomous work duration without human intervention (idle gaps capped at 30 min)
Thicker    B-thread  Work density (sub-agent depth, orchestration breadth, tool calls per minute, AI-written lines bonus)
Fewer      Z-thread  Human checkpoint reduction (ratio-only, trivial delegations excluded, Plan Mode AskUser exempt)

Seven Thread Types

Sessions are classified into one of seven thread types (highest priority first):

Z-thread  Zero-touch    Minimal human input, maximum autonomous work
B-thread  Big           Nested sub-agents (agents spawning agents)
L-thread  Long          30+ minute autonomous stretch, 50+ tool calls
F-thread  Fusion        Similar tasks distributed to multiple agents
P-thread  Parallel      2+ concurrent agent execution paths
C-thread  Chained       Multiple human checkpoints with work between each
Base      Default       Standard conversation
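
The priority order above can be read as a first-match rule chain. The following is a minimal Python sketch, not OMAS's actual classifier: the `SessionStats` record and its field names are hypothetical, and the Z-thread, B-thread, F-thread, and C-thread conditions are illustrative guesses. Only the L-thread thresholds (30+ minutes, 50+ tool calls) and the P-thread threshold (2+ paths) come from the table.

```python
from dataclasses import dataclass


@dataclass
class SessionStats:
    """Hypothetical per-session summary; field names are illustrative."""
    human_messages: int = 1
    tool_calls: int = 0
    longest_autonomous_min: float = 0.0
    max_agent_depth: int = 0      # assumed: 2+ means agents spawned agents
    concurrent_paths: int = 1
    similar_task_agents: int = 0  # agents that received near-identical tasks


def classify(s: SessionStats) -> str:
    """Return the first matching thread type, highest priority first."""
    rules = [
        # Assumed Z-thread rule: a single human message and substantial work.
        ("Z-thread", s.human_messages <= 1 and s.tool_calls >= 50),
        ("B-thread", s.max_agent_depth >= 2),
        ("L-thread", s.longest_autonomous_min >= 30 and s.tool_calls >= 50),
        ("F-thread", s.similar_task_agents >= 2),  # assumed threshold
        ("P-thread", s.concurrent_paths >= 2),
        ("C-thread", s.human_messages >= 3),       # assumed threshold
    ]
    for name, matched in rules:
        if matched:
            return name
    return "Base"
```

Because the rules are checked in order, a session that satisfies both the L-thread and P-thread conditions is reported as an L-thread.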

CLI Commands

omas scan                    # Scan all sessions, build metrics DB
omas analyze <session-id>    # Analyze a single session
omas report                  # Full report with comparison metrics
omas trend                   # Score trends over time
omas export                  # Export JSON for dashboard
omas dashboard               # Launch Next.js dashboard
omas list                    # List discovered sessions
omas tui                     # Interactive TUI (via Trogon)
omas auth login              # OAuth login (GitHub/Google)
omas auth status             # Check auth status
omas upload --dry-run        # Preview cloud upload

Next.js Dashboard

Interactive web dashboard with:

  • Radar chart across all 4 dimensions
  • Thread type distribution pie chart
  • Score trends over time
  • Per-project breakdown
  • Session detail views

Fair Comparison System

To prevent short test sessions from skewing scores:

  • Minimum Thresholds: Sessions must last 5+ minutes and include 10+ tool calls and 1+ human message
  • Weighted Scoring: Longer, more complex sessions get proportionally more weight
  • Consistency Score: Measures score stability across recent sessions
  • Composite Rank: weighted_score * 0.8 + consistency * 0.2
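
As a rough illustration of the thresholds and the composite formula above, here is a small Python sketch. The function and constant names are hypothetical, and it assumes `weighted_score` and `consistency` are on the same scale; how OMAS computes those two inputs is not shown here.

```python
MIN_DURATION_MIN = 5
MIN_TOOL_CALLS = 10
MIN_HUMAN_MESSAGES = 1


def is_eligible(duration_min: float, tool_calls: int, human_messages: int) -> bool:
    """Apply the minimum thresholds listed above."""
    return (
        duration_min >= MIN_DURATION_MIN
        and tool_calls >= MIN_TOOL_CALLS
        and human_messages >= MIN_HUMAN_MESSAGES
    )


def composite_rank(weighted_score: float, consistency: float) -> float:
    """Composite Rank = weighted_score * 0.8 + consistency * 0.2."""
    return weighted_score * 0.8 + consistency * 0.2
```

For example, a weighted score of 7.0 with a consistency of 5.0 gives a composite rank of about 6.6, so consistency can shift the rank but cannot dominate it.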

Textual TUI

Interactive terminal UI powered by Textual and Trogon:

omas tui    # Opens form-based CLI interface

Architecture

Claude Code JSONL logs (~/.claude/projects/)
        │
        ▼
   omas scan          Parse & analyze all sessions
        │
        ├─► SQLite DB (~/.omas/metrics.db)     Local storage (always)
        │
        ├─► metrics.json                        Dashboard data
        │
        └─► Cloud upload (optional)             Background sync
             └─► upload_queue.json              Retry queue on failure

Offline-First Design

  • Analysis results always save to local SQLite first
  • Cloud upload is automatic but optional
  • Network failures queue data for retry (max 5 attempts)
  • Dashboard works entirely from local data
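
The retry behavior can be sketched as follows. This is a hedged illustration only: the queue item layout (`payload`, `attempts`), the `flush_queue` name, and the choice to drop an item after its fifth failed attempt are assumptions, and the real `upload_queue.json` format may differ. The `upload` callable stands in for whatever network call the tool makes.

```python
from typing import Callable

MAX_ATTEMPTS = 5  # "max 5 attempts" from the list above


def flush_queue(queue: list[dict], upload: Callable[[dict], None]) -> list[dict]:
    """Try each queued item once and return the items to retry later."""
    remaining = []
    for item in queue:
        try:
            upload(item["payload"])
        except OSError:  # network errors such as ConnectionError derive from OSError
            attempts = item["attempts"] + 1
            if attempts < MAX_ATTEMPTS:
                remaining.append({"payload": item["payload"], "attempts": attempts})
            # Assumption: after the fifth failure the item leaves the queue.
            # The locally stored results are not touched by this function.
    return remaining
```

Since the local database is written before any upload is attempted, losing an item from the queue affects only the cloud copy.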

Privacy

  • Project paths are hashed before cloud upload (no directory names exposed)
  • Session IDs retained for deduplication only
  • No source code or file contents are ever transmitted
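
To make the first point concrete, here is a minimal sketch of one-way path hashing. The source only says paths are "hashed"; the use of SHA-256 and the `hash_project_path` name are assumptions for illustration.

```python
import hashlib


def hash_project_path(path: str) -> str:
    """Return a hex digest that identifies a project without exposing its path."""
    return hashlib.sha256(path.encode("utf-8")).hexdigest()
```

The same path always maps to the same digest, so per-project grouping still works after upload, while the directory names themselves are not sent. Note that an unsalted hash of an easily guessed path can still be matched by someone who guesses the path.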

How It Works

OMAS parses Claude Code's JSONL session logs from ~/.claude/projects/. For each session:

  1. Parser extracts tool calls, user messages, sub-agent events, and token usage
  2. Metrics compute four independent dimension scores using log-normalized scaling
  3. Classifier determines the thread type using priority-based rules
  4. Storage persists to SQLite for historical tracking
  5. Display renders via Rich terminal UI or Next.js dashboard

Key Algorithms

  • Cross-session sweep-line for peak concurrent session detection (parallelism) — avoids over-counting from pairwise overlap
  • Linear agent counting for density scoring (total agents capped at 10, plus AI-written lines bonus)
  • Activity-based autonomy measurement (each autonomous stretch is measured up to Claude's last activity, not up to the next human message)
  • Idle gap capping at 30 minutes to prevent inflated autonomy scores from idle periods
  • Jaccard similarity for fusion thread detection
  • Log normalization (log1p(x) * k) for unbounded metrics (0-10 scale; k=2.0 for autonomy/trust)
  • Trivial delegation filter excludes simple human commands (≤5 tool calls) from trust ratio
  • Plan Mode awareness exempts AskUserQuestion during planning from penalty
  • Human message filtering with 24 automated patterns + minimum length (3 chars)
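
Several of the techniques named above are standard and can be sketched in a few lines of Python. These are illustrative versions, not the OMAS source: the function names are hypothetical, times are assumed to be in minutes, clamping the log-normalized score to 10 is an assumption based on the stated 0-10 scale, and what OMAS feeds into the Jaccard comparison is not specified (sets of strings are used here).

```python
import math


def peak_concurrency(intervals: list[tuple[float, float]]) -> int:
    """Sweep-line: peak number of (start, end) intervals open at the same time."""
    events = []
    for start, end in intervals:
        events.append((start, 1))
        events.append((end, -1))
    # Tuple sorting puts -1 before +1 at equal times, so a session that
    # ends exactly when another starts is not counted as overlapping.
    events.sort()
    open_now = peak = 0
    for _, delta in events:
        open_now += delta
        peak = max(peak, open_now)
    return peak


def capped_duration(timestamps: list[float], cap: float = 30.0) -> float:
    """Sum gaps between consecutive activity times, capping each gap at `cap`."""
    ordered = sorted(timestamps)
    return sum(min(b - a, cap) for a, b in zip(ordered, ordered[1:]))


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity |a & b| / |a | b|; two empty sets score 0.0."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def log_score(x: float, k: float = 2.0) -> float:
    """Log normalization log1p(x) * k, clamped to the 0-10 scale (assumed)."""
    return min(10.0, math.log1p(x) * k)
```

For three sessions spanning (0, 10), (5, 15), and (12, 20), there are two overlapping pairs but never more than two sessions open at once, so `peak_concurrency` returns 2. Likewise, activity at minutes 0, 10, 100, and 105 yields a capped duration of 45 rather than 105, because the 90-minute idle gap counts as 30.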

Improving Your Score

Base → C-thread    Use 3+ conversation turns with work between each
C    → P-thread    Request parallel tasks ("analyze these 3 files simultaneously")
P    → L-thread    Give detailed instructions, let Claude work 30+ minutes
L    → B-thread    Use teams/worktrees for deep sub-agent hierarchies
B    → Z-thread    One command, full feature implementation, auto-approve

Key tips:

  • Write a detailed CLAUDE.md with project conventions
  • Give complete requirements upfront instead of incremental instructions
  • Enable auto-approve for permissions to avoid interruptions
  • Use the Agent tool for independent parallel work

Development

git clone https://github.com/HwangTaehyun/oh-my-agentic-score.git
cd oh-my-agentic-score
uv sync
uv run omas --help

See CONTRIBUTING.md for full development setup.

Documentation

Full documentation is available at hwangtaehyun.github.io/oh-my-agentic-score.

Credits

Be with us!

Issues  Found a bug or have a feature idea? Open an issue — all feedback is welcome.

Email  Reach out directly for questions, suggestions, or collaboration.

GitHub Follow  Follow @HwangTaehyun on GitHub for more projects.

License

MIT - Taehyun Hwang
