
Sound behavior-equivalence verification for refactors, using your own tests for inputs


Selfsame

CI PyPI Python License: MIT

Know whether your code still behaves the same — before, after, and across every AI edit.

Selfsame is a sound behavior checker for Python. It captures the real arguments your tests (or app) feed your code, replays two versions in isolated subprocesses, and compares the results structurally. Use it to prove a refactor didn't change behavior — or to catch the silent regressions that creep in when an AI agent ships features all day and "a new feature works, but the old ones quietly broke."
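The capture, replay, compare loop described above can be sketched in a few lines. This is only an illustration under stated assumptions: `record`, `outcome`, `replay`, and the two `mean_*` functions are hypothetical names, not Selfsame's API, and the sketch runs both versions in-process rather than in isolated subprocesses.

```python
import copy
import functools

# Hypothetical names throughout: an illustration, not Selfsame's API.
CAPTURED = []  # (args, kwargs) pairs seen while the "tests" ran


def record(func):
    """Wrap func so each call's arguments are stored before it runs."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Deep-copy so later mutation by the callee cannot alter the recording.
        CAPTURED.append((copy.deepcopy(args), copy.deepcopy(kwargs)))
        return func(*args, **kwargs)
    return wrapper


def outcome(func, args, kwargs):
    """Run one call; a raised exception counts as behavior too."""
    try:
        return ("returned", func(*copy.deepcopy(args), **copy.deepcopy(kwargs)))
    except Exception as exc:
        return ("raised", type(exc).__name__)


def replay(base, head, calls):
    """Return the recorded calls on which base and head disagree."""
    return [
        (args, kwargs)
        for args, kwargs in calls
        if outcome(base, args, kwargs) != outcome(head, args, kwargs)
    ]


def mean_base(xs):
    return sum(xs) / len(xs)


def mean_head(xs):  # "refactored" version that changes the empty-list case
    return sum(xs) / len(xs) if xs else 0.0


# Stand-in for a test suite: calling the recorded function supplies the inputs.
recorded_mean = record(mean_base)
recorded_mean([1, 2, 3])
try:
    recorded_mean([])
except ZeroDivisionError:
    pass

diverging = replay(mean_base, mean_head, CAPTURED)
print(diverging)  # [(([],), {})]
```

Only the empty-list call is reported: the base version raises `ZeroDivisionError` there while the head version returns `0.0`, and the two agree on `[1, 2, 3]`.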

The one promise: zero false confidence

Selfsame never says equivalent when behavior actually differs, and never says divergent when it doesn't. When it can't be sure, it refuses (unverifiable) instead of guessing. A green result means behavior was unchanged at every input it replayed.

  • 🧪 Inputs are real, not generated — recorded from your own test suite or app run. No type hints required; methods, packages, and relative imports just work.
  • 🔒 Sound by construction — uncontrolled I/O, threads, nondeterminism, and opaque values are refused, never certified.
  • 🤖 Built for AI-driven development — freeze an accepted build, then measure how far each generated change drifts from it. No second git branch needed.
  • 📄 Agent-consumable reports — every run drops .selfsame/report.json + Markdown with file:line, before→after witnesses, and what was not covered.
  • 🪶 Pure standard library — no runtime dependencies. pip install and go.
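One way to picture the three verdicts is a comparison that refuses whenever it meets a value it cannot inspect structurally. The sketch below is an assumption about how such a rule could look, not Selfsame's actual logic: the verdict names come from the text above, while `is_transparent`, `same`, `verdict`, and the set of "plain" types are hypothetical.

```python
# Hypothetical sketch: the verdict names are from the README, the rules are not.
PLAIN = (type(None), bool, int, float, str, bytes)


def is_transparent(value):
    """True if value is built only from plain data, lists, tuples, and str-keyed dicts."""
    if type(value) is float and value != value:
        return False  # NaN: this sketch refuses rather than pick a comparison rule
    if type(value) in PLAIN:
        return True
    if type(value) in (list, tuple):
        return all(is_transparent(v) for v in value)
    if type(value) is dict:
        return all(type(k) is str and is_transparent(v) for k, v in value.items())
    return False  # anything else is opaque to this sketch


def same(a, b):
    """Structural equality that also requires matching types at every level."""
    if type(a) is not type(b):
        return False
    if type(a) in (list, tuple):
        return len(a) == len(b) and all(same(x, y) for x, y in zip(a, b))
    if type(a) is dict:
        return a.keys() == b.keys() and all(same(a[k], b[k]) for k in a)
    return a == b


def verdict(base_result, head_result):
    """Return 'equivalent', 'divergent', or 'unverifiable' for a pair of results."""
    if not (is_transparent(base_result) and is_transparent(head_result)):
        return "unverifiable"
    return "equivalent" if same(base_result, head_result) else "divergent"


print(verdict({"k": (1, 2)}, {"k": (1, 2)}))  # equivalent
print(verdict([1], [1.0]))                    # divergent
print(verdict(object(), object()))            # unverifiable
```

The design choice worth noting is the default: any type the checker does not explicitly understand falls through to `unverifiable`, so adding support for a new type can only turn refusals into verdicts, never the reverse.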

Install

pip install selfsame        # or: pipx install selfsame · uv tool install selfsame

Installs the selfsame command (probe is kept as an alias). Requires Python 3.8+.

60-second start

Did my refactor change behavior? (inputs come from your existing tests)

selfsame verify --base main --modules mypkg -- pytest -q
  parse_args                     n=11   equivalent
X slugify                        n=102  divergent     @ input #0
      input : ('Café', max_length=3)
      base  : 'caf'
      head  : 'caf-'
      minimized: ('ab', max_length=1)

Sound auto-verify : 3/4 = 75%
  ** 1 DIVERGENCE(S): behavior changed at a tested input **
selfsame: 3 equivalent · 1 divergent · 0 unverifiable  →  .selfsame/report.json

Exit code is non-zero on any divergence, so drop it straight into CI.
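To make the slugify report concrete, here is a hypothetical pair of implementations that would produce exactly those witnesses. The real functions behind the report are not shown in this README, so `slugify_base`, `slugify_head`, and the helper `_ascii_lower` are assumptions chosen only to reproduce the reported values.

```python
import unicodedata


def _ascii_lower(text):
    """Strip accents and lowercase, e.g. 'Café' -> 'cafe'."""
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii").lower()


def slugify_base(text, max_length):
    """Hypothetical base version: truncate to max_length characters."""
    return _ascii_lower(text)[:max_length]


def slugify_head(text, max_length):
    """Hypothetical head version: appends a marker when it truncates.

    The marker pushes the result past max_length, which is the regression.
    """
    slug = _ascii_lower(text)
    return slug[:max_length] + "-" if len(slug) > max_length else slug


# The reported input and the minimized input from the output above.
print(slugify_base("Café", max_length=3))  # caf
print(slugify_head("Café", max_length=3))  # caf-
print(slugify_base("ab", max_length=1))    # a
print(slugify_head("ab", max_length=1))    # a-
```

The minimized input shows why shrinking is useful: `('ab', max_length=1)` drops the accent handling entirely and isolates truncation as the cause.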

The AI use case: catch regressions against a confirmed build

When an AI agent generates code continuously, you rarely have a clean "before" branch — you have a build you accepted and whatever the next feature did to it. Freeze the accepted behavior once, then check drift after every change:

# 1. You confirm a build works. Freeze its behavior as the baseline.
selfsame snapshot --modules myapp -- pytest -q

# 2. The agent develops the next feature (adds code, edits existing code)...

# 3. How much of the accepted behavior changed?
selfsame drift          # exit 1 if anything deviated → blocks the bad build

A worked example — the agent adds a feature and accidentally breaks an existing function:

~ discount          n=2   interface-change   (added optional param 'currency' — back-compatible)
X greet             n=1   divergent          base 'Hello, Sam!'  →  head 'Hi, Sam'     ← regression caught
  total             n=1   equivalent         (rewritten as a loop — behavior preserved)
# new_helper: flagged separately as changed code with no test baseline
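The snapshot and drift steps behind that report can be approximated as "freeze results to a file, then compare without ever rewriting it". The sketch below is a simplification under loud assumptions: `snapshot`, `drift`, and the `*_v1`/`*_v2` functions are hypothetical, results are compared by `repr` for brevity, and the interface-change case (`discount`) is left out.

```python
import json


def snapshot(functions, inputs):
    """Freeze each function's results on its recorded inputs as JSON text."""
    baseline = {
        name: [repr(func(*args)) for args in inputs[name]]
        for name, func in functions.items()
    }
    return json.dumps(baseline, sort_keys=True)


def drift(baseline_json, functions, inputs):
    """Compare current behavior with the baseline; never rewrites the baseline."""
    baseline = json.loads(baseline_json)
    report = {}
    for name, func in functions.items():
        if name not in baseline:
            report[name] = "no baseline"  # new code: nothing to compare against
            continue
        now = [repr(func(*args)) for args in inputs[name]]
        report[name] = "equivalent" if now == baseline[name] else "divergent"
    return report


# Accepted build (hypothetical).
def greet_v1(name):
    return f"Hello, {name}!"


def total_v1(prices):
    return sum(prices)


# After the agent's change (hypothetical).
def greet_v2(name):
    return f"Hi, {name}"


def total_v2(prices):  # rewritten as a loop, same behavior
    result = 0
    for price in prices:
        result += price
    return result


def new_helper(x):
    return x * 2


inputs = {"greet": [("Sam",)], "total": [([1, 2, 3],)]}
baseline = snapshot({"greet": greet_v1, "total": total_v1}, inputs)
report = drift(
    baseline,
    {"greet": greet_v2, "total": total_v2, "new_helper": new_helper},
    inputs,
)
exit_code = 1 if "divergent" in report.values() else 0

print(report)     # {'greet': 'divergent', 'total': 'equivalent', 'new_helper': 'no baseline'}
print(exit_code)  # 1
```

Because `drift` only reads the baseline, a regression such as the changed greeting keeps failing until someone deliberately takes a new snapshot.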

The signal scales with behavior that actually changed, not lines of code: brand-new code has no baseline (no noise), behavior-preserving rewrites stay equivalent, and only real deviations at tested inputs are flagged. Make it automatic — pytest becomes your regression gate:

# pytest.ini  (in pyproject.toml, put the same key under [tool.pytest.ini_options] instead)
[pytest]
selfsame = true     # the plugin runs a compare-only drift check after the suite

The plugin is compare-only: it never re-baselines on its own, so a regression can't silently become the new "correct" behavior — you bless a new baseline explicitly with selfsame snapshot.

👉 Full walkthrough: docs/ai-workflows.md

Commands at a glance

command              what it does
selfsame verify      replay base vs head on your test inputs; per-function verdict + CI exit code
selfsame snapshot    freeze the current (accepted) build's behavior to a baseline file
selfsame drift       measure how much current code deviated from the baseline (no second branch)
selfsame capture     record real call arguments from any test or app command
selfsame replay      replay captured arguments across two git refs
selfsame attach      dump captures from a running, hook-enabled process without stopping it
selfsame check       generate inputs and check two files / git refs (for typed, pure functions)
selfsame fuzz        (experimental) mutate real inputs to find divergences your tests miss

Full reference with every flag: docs/commands.md.

Documentation

🚀 Getting started      install, your first verify, your first snapshot/drift
🤖 AI workflows         snapshot/drift, the pytest plugin, agent reports, working at LLM velocity
📖 Command reference    every command and flag
⚙️ Configuration        [tool.selfsame], environment variables, exit codes
🛠️ How it works         capture → replay → compare, and the soundness model
🧭 Limitations          the honest boundaries; read before you rely on it

File details

Details for the file selfsame-0.2.0.tar.gz.

File metadata

  • Download URL: selfsame-0.2.0.tar.gz
  • Upload date:
  • Size: 85.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for selfsame-0.2.0.tar.gz
Algorithm Hash digest
SHA256 53393168a6d56b838c78552725c3244f5c096ca32b3456e3250403069fdb8391
MD5 040068768999042ecc9c7bad186aec81
BLAKE2b-256 1a3bf9c226c2d24cdc1539009a37efbbd9f53ab348d1da2b5bea2af55ce90ffc


Provenance

The following attestation bundles were made for selfsame-0.2.0.tar.gz:

Publisher: release.yml on PraveenKPandu/Selfsame

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file selfsame-0.2.0-py3-none-any.whl.

File metadata

  • Download URL: selfsame-0.2.0-py3-none-any.whl
  • Upload date:
  • Size: 79.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for selfsame-0.2.0-py3-none-any.whl
Algorithm Hash digest
SHA256 8d5996ba8bd27d1938300aad9e8507b3d445d96e3a8869e306dff5baaf0c8d49
MD5 be5576bb95c247453c927586a8d756bf
BLAKE2b-256 d786c196438e1acdf231a55f5b9be8bd78916422bc41de9f494b5643c515665f


Provenance

The following attestation bundles were made for selfsame-0.2.0-py3-none-any.whl:

Publisher: release.yml on PraveenKPandu/Selfsame

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
