Static cross-harness rule coherence auditor for AI coding agents (Claude Code, Codex, ...)

ssoty


Static cross-harness rule DIVERGENCE auditor for AI coding agents. Two models, one "shared" rule set — but do they actually operate under the same rules? Usually not.

ssoty reads the effective rule surfaces of multiple agent harnesses (Claude Code, Codex, Cursor, Copilot, Gemini, Cline, Windsurf, Continue) and shows, deterministically and with no LLM and no network, where two models diverge: which rules one model applies that the other never sees, which shared rules load under a different guarantee (always-on vs skill-gated), and which cross-references break across the boundary. It also quantifies the per-turn token cost ("Context Tax") as a secondary metric.


The problem

You point Claude Code, Codex, and Cursor at one "shared" rule set and expect identical behavior. They don't behave identically — because each harness resolves a different effective rule set. The same canonical file can:

  • load always-on in one harness (injected every turn) but skill-gated in another (loaded only when a skill triggers) — same file, unequal guarantee;
  • reference a sibling rule that exists in one harness but was never distributed to the other — a broken pointer across the boundary;
  • be duplicated across files, paying token rent every turn.

The result: the same prompt, the same repo, but different effective rules per model — so they behave inconsistently, and it's invisible until one model quietly ignores a rule you "share."

Rule divergence (the headline)

$ uvx ssoty diff examples/messy-setup --a claude-code --b codex

  claude-code  vs  codex
      only in claude-code (1): team-rules.md
      same rule, different load (1):
          shared-style.md  claude-code=always-on  |  codex=skill-gated
      broken cross-references across the boundary (1):
          codex:shared-style.md -> 'team-rules.md'  (loads only in claude-code, NOT in codex)
      VERDICT: claude-code and codex do NOT operate under the same rules
               (1 rule only in claude-code, 1 loads differently, 1 broken cross-ref)

ssoty diff answers the one question that matters: do these two models operate under the same rules? Omit --a/--b to run it across every pair of harnesses present, or pass both to compare two named harnesses. --json and --redact are supported, and the command is strictly read-only.
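The comparison that ssoty diff reports can be pictured as set arithmetic over two rule surfaces. The sketch below is an illustration only, not ssoty's implementation: the Rule dataclass, its fields, and diff_surfaces are hypothetical names, and the sample data simply mirrors the messy-setup output shown above.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical model of one loaded rule file (not ssoty's real type)."""
    name: str                    # e.g. "shared-style.md"
    load: str                    # "always-on" or "skill-gated"
    refs: tuple[str, ...] = ()   # sibling rule names this rule points at


def diff_surfaces(name_a: str, a: dict[str, Rule],
                  name_b: str, b: dict[str, Rule]) -> dict:
    """Compare two harness surfaces keyed by rule name."""
    only_a = sorted(a.keys() - b.keys())
    only_b = sorted(b.keys() - a.keys())
    # Same rule name on both sides, but a different load guarantee.
    load_diff = [
        (n, a[n].load, b[n].load)
        for n in sorted(a.keys() & b.keys())
        if a[n].load != b[n].load
    ]
    # A pointer is broken across the boundary when its target loads in the
    # other harness but not in the harness that holds the pointer.
    broken = []
    for holder, here, there in ((name_a, a, b), (name_b, b, a)):
        for rule_name in sorted(here):
            for target in here[rule_name].refs:
                if target not in here and target in there:
                    broken.append((holder, rule_name, target))
    return {
        "only_a": only_a,
        "only_b": only_b,
        "load_diff": load_diff,
        "broken_refs": broken,
        "same_rules": not (only_a or only_b or load_diff or broken),
    }


# Sample data modeled on the messy-setup output (assumed, not read from disk).
claude = {
    "team-rules.md": Rule("team-rules.md", "always-on"),
    "shared-style.md": Rule("shared-style.md", "always-on", ("team-rules.md",)),
}
codex = {
    "shared-style.md": Rule("shared-style.md", "skill-gated", ("team-rules.md",)),
}
result = diff_surfaces("claude-code", claude, "codex", codex)
```

Sorting every collection before iterating keeps the result independent of dictionary insertion order, which is one simple way to get the "same input, same output" property.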

What ssoty does

$ uvx ssoty audit examples/messy-setup
ssoty audit — 2 Critical, 3 Warning, 6 FYI

  [Critical] broken_symlink (claude-code)
      .../.claude/rules/broken-link.md
      symlink target does not resolve: ./nope.md

  [Critical] dangling_cross_ref (codex)
      .../.codex/skills/global-agent-rules/references/shared-style.md
      references 'team-rules.md', which exists in another harness but is NOT
      loaded by 'codex' — broken pointer across the harness boundary

  [Warning] load_asymmetry (claude-code+codex)
      shared-style.md
      same rule loads differently per harness (claude-code=always-on,
      codex=skill-gated) — shared file, unequal guarantee
  ...
  [FYI] dangling_cross_ref (codex)
      references 'meta-layout.md' (absent here, intentional per .ssotyignore)

It distinguishes a genuine broken cross-reference (Critical) from intentional non-sharing you declared in .ssotyignore (FYI) — precision over noise.
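The Critical-versus-FYI split described above can be sketched as a small decision function. This is a guess at the shape of the logic, not ssoty's code: the function names are hypothetical, and the .ssotyignore format assumed here (one rule name per line, with # comments) is an assumption that the source does not confirm.

```python
from typing import Optional


def parse_ignore(text: str) -> set:
    """Parse an ignore file, ASSUMING one rule name per line and # comments."""
    names = set()
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line:
            names.add(line)
    return names


def cross_ref_severity(target: str, loaded_here: set,
                       declared_intentional: set) -> Optional[str]:
    """Return None when the reference resolves, else a severity label."""
    if target in loaded_here:
        return None            # the pointer resolves inside this harness
    if target in declared_intentional:
        return "FYI"           # absent on purpose, as declared by the user
    return "Critical"          # a genuinely broken pointer


ignored = parse_ignore("# layout notes are not shared\nmeta-layout.md\n")
codex_rules = {"shared-style.md"}
```

With these inputs, a reference to team-rules.md is Critical while a reference to meta-layout.md is only FYI, matching the two dangling_cross_ref findings in the audit output.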

Also measures: Context Tax (token rent)

A secondary metric: the per-turn token cost of each surface, plus the duplicate content you pay for on every turn. It is useful for before/after cleanup comparisons, but the pitch is the divergence report above, not token rent.

$ uvx ssoty metrics examples/messy-setup     $ uvx ssoty metrics examples/clean-setup
  claude-code:                                  claude-code:
      always-on  : 206 tokens                       always-on  : 149 tokens   (-27.7%)
  codex:                                        codex:
      skill-gated: 106 tokens                       skill-gated:   0 tokens

Numbers are reported per harness and never summed across harnesses: always-on (actual, every turn) and skill-gated (potential, only when a skill fires) are different load guarantees. Compare within one harness, before vs after a cleanup. Token counts are a deterministic char/4 heuristic by default (portable — same numbers on any machine); set SSOTY_EXACT_TOKENS=1 to opt into tiktoken.

Reproduce: uvx ssoty metrics examples/messy-setup (see benchmarks/REPORT.md).
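For readers who want to sanity-check the numbers, here is a minimal sketch of a char/4 estimate and of the before/after percentage. The function names are hypothetical, and the rounding mode (floor division) is an assumption, since the source only says "char/4".

```python
def estimate_tokens(text: str) -> int:
    """Deterministic char/4 estimate. Floor rounding is an ASSUMPTION."""
    return len(text) // 4


def percent_change(before: int, after: int) -> float:
    """Relative change within ONE harness, before vs after a cleanup."""
    if before == 0:
        raise ValueError("percent change is undefined for a zero baseline")
    return (after - before) / before * 100.0


# The claude-code always-on figures from the example above.
change = round(percent_change(206, 149), 1)
```

Because the estimate depends only on character count, it yields the same number on any machine, which is the portability property the default heuristic is chosen for.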

Checks

| Check | Severity | What it catches |
| --- | --- | --- |
| broken_symlink | Critical | symlinked rule whose target is gone |
| dangling_cross_ref | Critical / FYI | a rule references a sibling absent in this harness (FYI if declared intentional) |
| load_asymmetry | Warning | same rule, different load basis per harness |
| duplicate_content | Warning | identical blocks duplicated across files (token rent) |
| non_shared_surface | FYI | a rule present in one harness only |
| skill_integrity | Warning | skill dir without a SKILL.md |
| weak_directive | FYI | a weak modal (should, try to, …) hedges a hard-requirement signal (never, security, …) on the same line in an always-on rule |
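As an illustration of the weak_directive idea, the sketch below flags lines where a weak modal and a hard-requirement signal appear together. It is not ssoty's implementation: the word lists contain only the examples named in the table (the real lists are elided there), and the function name is hypothetical.

```python
import re

# Only the examples given in the table; the full lists are not in the source.
WEAK = re.compile(r"\b(should|try to)\b", re.IGNORECASE)
HARD = re.compile(r"\b(never|security)\b", re.IGNORECASE)


def weak_directive_lines(text: str) -> list:
    """Return (line_number, line) pairs where a weak modal hedges a hard signal."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if WEAK.search(line) and HARD.search(line):
            hits.append((lineno, line.strip()))
    return hits


sample = (
    "You should never commit secrets.\n"
    "Use tabs for indentation.\n"
    "Try to keep functions short.\n"
)
hits = weak_directive_lines(sample)
```

Only the first sample line is flagged: the third contains a weak modal but no hard-requirement signal, so there is nothing for it to hedge.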

Install

# zero-install run
uvx ssoty diff                  # cross-model rule divergence (the headline; all present pairs)
uvx ssoty audit                 # audits $HOME (~/.claude, ~/.codex)
# or install
pipx install ssoty
ssoty diff --a claude-code --b codex  # compare two named harnesses (read-only)
ssoty audit --redact            # mask home paths + emails in output
ssoty audit --ci                # exit non-zero on any Critical (for CI)
ssoty audit --format sarif      # SARIF 2.1.0 (for github/codeql-action/upload-sarif)

--format {text,json,sarif} selects the audit output (default text); --json is a back-compat alias for --format json.

Fix (dry-run + backup first)

ssoty fix                       # DRY-RUN: prints what WOULD change, writes nothing
ssoty fix --apply               # perform safe fixes; backs every touched file up first
ssoty fix --apply --scaffold-ignore   # also append non-shared rule names to .ssotyignore

ssoty fix is dry-run by default: it prints exactly what it would do and changes nothing. Only --apply writes, and even then it first copies every file it will touch into a timestamped backup directory under the audited root (.ssoty-backup/<timestamp>/, path-preserving) and prints that location.

It performs only safe remediations: removing a broken symlink (its target does not resolve, so no real content is lost) and, with --scaffold-ignore, recording intentionally non-shared rule names in .ssotyignore. It never edits your real rule files, never touches a valid symlink, and is idempotent (running it again does nothing). Add .ssoty-backup/ to your .gitignore so backups are never committed.
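A path-preserving, timestamped backup of the kind described above might look roughly like this. It is a sketch under stated assumptions rather than ssoty's code: backup_files is a hypothetical helper, and the timestamp format is invented for illustration because the source does not specify one.

```python
import os
import shutil
import time
from pathlib import Path
from typing import List, Optional


def backup_files(root: Path, files: List[Path],
                 stamp: Optional[str] = None) -> Path:
    """Copy each file under root into .ssoty-backup/<stamp>/, keeping its relative path."""
    # ASSUMED timestamp format; the real tool's format is not documented here.
    stamp = stamp or time.strftime("%Y%m%dT%H%M%S")
    backup_dir = root / ".ssoty-backup" / stamp
    for src in files:
        dest = backup_dir / src.relative_to(root)
        dest.parent.mkdir(parents=True, exist_ok=True)
        if src.is_symlink():
            # Preserve the link itself, even when its target does not resolve.
            os.symlink(os.readlink(src), dest)
        else:
            shutil.copy2(src, dest)
    return backup_dir
```

Keeping the relative path inside the backup directory means a restore is a plain copy back onto the audited root.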

CI (GitHub Action)

- uses: snowlaxc/ssoty@v0
  with: { path: . }             # runs `ssoty audit --ci`
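Outside GitHub Actions, the same gate can be wired into any pipeline by checking the exit status of ssoty audit --ci. The wrapper below is a minimal sketch: run_gate is a hypothetical helper, and it relies only on the documented behavior that --ci exits non-zero on any Critical finding.

```python
import subprocess
import sys


def run_gate(cmd: list) -> bool:
    """Run a command and return True only when it exits with status 0."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        # Surface the findings so the CI log shows why the gate failed.
        sys.stderr.write(result.stdout)
        sys.stderr.write(result.stderr)
    return result.returncode == 0


# Intended use (requires ssoty on PATH):
#     ok = run_gate(["ssoty", "audit", "--ci"])
#     sys.exit(0 if ok else 1)
```

Since audit output can quote rules verbatim, consider where the CI log is visible before echoing findings into it.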

Harness adapters (optional)

Thin wrappers so you can run ssoty from inside an agent:

  • Claude Code: copy adapters/claude-code/skills/ssoty into ~/.claude/skills/
  • Codex: copy adapters/codex/skills/ssoty into ~/.codex/skills/

The CLI is the product; adapters just shell out to it.

How it works

ssoty resolves each harness's effective rule surface from disk (which files load, and whether always-on or skill-gated), then runs deterministic checks. No model calls, no network — same input, same output. It is harness-agnostic by design: a cross-harness tool shouldn't live inside one harness.

Supported harnesses

Claude Code (~/.claude/rules, CLAUDE.md), Codex (AGENTS.md, global-agent-rules), Cursor (.cursor/rules/*.mdc with alwaysApply frontmatter, legacy .cursorrules), GitHub Copilot (.github/copilot-instructions.md), Gemini CLI (GEMINI.md, ~/.gemini/GEMINI.md), Cline (.clinerules/ directory, legacy .clinerules, AGENTS.md), Windsurf (.windsurf/rules/*.md, legacy .windsurfrules), and Continue (.continue/rules/*.md). Empty harnesses are skipped. Point ssoty at $HOME or a project root.
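To make the "alwaysApply frontmatter" point concrete, the sketch below reads that key from a Cursor .mdc file and decides whether the rule is always-on. It is an illustration, not ssoty's resolver: cursor_load_basis is a hypothetical name, the "conditional" label is invented here for anything that is not always-on, and the parser handles only simple key: value frontmatter.

```python
def cursor_load_basis(mdc_text: str) -> str:
    """Classify a Cursor rule as "always-on" or "conditional" from its frontmatter."""
    lines = mdc_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return "conditional"          # no frontmatter block at all
    for line in lines[1:]:
        if line.strip() == "---":
            break                     # end of the frontmatter block
        key, sep, value = line.partition(":")
        if sep and key.strip() == "alwaysApply":
            return "always-on" if value.strip().lower() == "true" else "conditional"
    return "conditional"              # key missing: not guaranteed every turn


always = cursor_load_basis("---\ndescription: Style\nalwaysApply: true\n---\nUse tabs.\n")
gated = cursor_load_basis("---\nalwaysApply: false\n---\nUse tabs.\n")
```

How ssoty maps non-always-on Cursor rules onto its own always-on and skill-gated categories is not stated in this README, so the second label should be read as a placeholder.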

Privacy

ssoty audits your config; its output can quote your rules verbatim. It runs entirely locally (no hosted service). This repo ships synthetic fixtures only. See SECURITY.md. Never commit ssoty output to a public repo.

Roadmap (phase 2)

Auto-dedup in ssoty fix, an opt-in live "canary" runtime probe, LLM semantic conflict detection, and marketplace packaging.

Background

The design rationale lives in docs/RFC.md.

License

MIT
