Reproduce Sentry crashes as failing pytest tests — sandbox execution, verified evidence

logomesh

Paste a Sentry URL. Get a failing pytest back.

pip install logomesh
logomesh repro https://sentry.io/organizations/your-org/issues/12345678/

What it does

Takes the innermost in-app frame from a Sentry crash, grabs whatever locals Sentry captured at crash time, builds a pytest that calls that function with those exact values, runs it in a Docker sandbox, and tells you if it still reproduces on your current branch.

The test is synthesized deterministically — no LLM touches the test bytes. LLM reasoning is used for context recovery and strategy (advisory only, never in the evidence path).

If the sandbox raises the same exception type that Sentry captured, the crash counts as reproduced and you get the test. If not, logomesh refuses explicitly and gives a structured reason.
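To make "synthesized deterministically" concrete, here is a minimal sketch of template-based test generation. It is not logomesh's actual synthesizer: the function name synthesize_test, the example function compute_rate, and the assumption that the captured locals map directly onto keyword arguments are all illustrative.

```python
def synthesize_test(module: str, func: str, captured_locals: dict, exc_type: str) -> str:
    """Render a pytest module from a fixed template (hypothetical helper)."""
    # Sorting the keys and using repr() means identical inputs always
    # produce identical test bytes: no randomness, no model in the loop.
    kwargs = ", ".join(
        f"{name}={value!r}" for name, value in sorted(captured_locals.items())
    )
    lines = [
        "import pytest",
        f"from {module} import {func}",
        "",
        "",
        f"def test_repro_{func}():",
        f"    with pytest.raises({exc_type}):",
        f"        {func}({kwargs})",
        "",
    ]
    return "\n".join(lines)


# compute_rate is a made-up function name for the billing/calc.py example.
source = synthesize_test(
    "billing.calc", "compute_rate", {"total": 100, "count": 0}, "ZeroDivisionError"
)
print(source)
```

Because the output depends only on the inputs, the same Sentry event always yields the same test file, which is what makes hashing the result meaningful.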


Requirements

  • Python 3.11+
  • Docker running locally
  • A Sentry auth token with event:read scope (Settings → API Keys)
  • An OpenAI API key (used for advisory context recovery, not test synthesis)
export SENTRY_AUTH_TOKEN=sntryu_...
export OPENAI_API_KEY=sk-...

Or drop them in a .env file in your project root.


Usage

# reproduce a crash
logomesh repro https://sentry.io/organizations/your-org/issues/12345678/

# skip LLM entirely — deterministic frame-locals replay only
logomesh repro <url> --no-llm

# emit a sealed audit artifact (SOC2 CC7.3 / PCI DSS 12.10.5)
logomesh repro <url> --artifact

# open a GitHub draft PR with the failing test attached
logomesh repro <url> --draft-pr

# machine-readable JSON output
logomesh repro <url> --json

# point at a local repo (default: cwd)
logomesh repro <url> --repo /path/to/repo

# set wall-clock timeout (default: 60s)
logomesh repro <url> --timeout 120

Supported Sentry URL formats:

https://sentry.io/organizations/{org}/issues/{id}/
https://sentry.io/issues/{id}/
https://{org}.sentry.io/issues/{id}/
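
The three URL shapes above can be recognized with a few regular expressions. This is a sketch under stated assumptions, not logomesh's parser: IssueRef and parse_issue_url are hypothetical names, and issue IDs are assumed to be numeric, as in the examples.

```python
import re
from typing import NamedTuple, Optional


class IssueRef(NamedTuple):
    org: Optional[str]  # None when the URL carries no organization slug
    issue_id: str


_PATTERNS = [
    re.compile(r"^https://sentry\.io/organizations/(?P<org>[^/]+)/issues/(?P<id>\d+)/?$"),
    re.compile(r"^https://sentry\.io/issues/(?P<id>\d+)/?$"),
    re.compile(r"^https://(?P<org>[^/.]+)\.sentry\.io/issues/(?P<id>\d+)/?$"),
]


def parse_issue_url(url: str) -> IssueRef:
    """Return the org slug (if any) and issue ID, or raise ValueError."""
    for pattern in _PATTERNS:
        match = pattern.match(url)
        if match:
            # groupdict() has no "org" key for the org-less pattern.
            return IssueRef(match.groupdict().get("org"), match.group("id"))
    raise ValueError(f"unsupported Sentry URL: {url}")


print(parse_issue_url("https://your-org.sentry.io/issues/12345678/"))
```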

Example output

Reproduced:

  ✓ Reproduced: ZeroDivisionError at billing/calc.py:18
     division by zero
     rate = total / count

Not reproduced:

  ✗ Cannot reproduce ValueError at checkout.py:42
     The synthesized test passed against the current branch.
     Either the bug is fixed, or the captured locals are insufficient.

How it works

The orchestrator is a LangGraph supervisor graph with 11 tools. Only deterministic_repro can produce the evidence that goes into the artifact and the PR; the evidence path is contract-enforced.

The 11 tools:

| Tool | What it does |
| --- | --- |
| fetch_sentry_event | Fetches event + frame locals from Sentry API. PII redaction runs here before anything else sees the data. |
| deterministic_repro | Builds pytest from frame locals, runs it in Docker sandbox. Only tool that produces sealed evidence. Zero LLM. |
| critic_validate | Scores fidelity (0.0–1.0). Checks same exception type, same function, locals match. Min 0.9 to ship. Up to 3 attempts. |
| context_reconstructor | Called when repro falls short. Handles: no_repro, async_state, db_state, globals, c_ext, missing_fixture. |
| hypothesis_invariant_suggester | Suggests Hypothesis property tests for the crashed function. Advisory only; never touches the artifact. |
| web_search | Searches PyPI / GitHub / StackOverflow / CVE / general. Used to recover source paths and dep advisories. |
| rag_search | Searches codebase / past runs / docs / memory. |
| prepare_environment | Builds a dependency snapshot. Pins exact prod versions from event.modules if available. CVE lookup per package. |
| introspect_repo | RAG window into the repo: imports, decorators, class shapes, manifests, entrypoints. Helps the agent decide how to bootstrap the sandbox. |
| build_artifact | Seals the artifact: SHA-256 stamp, sandbox image digest, llm_in_evidence_path: false attestation, SOC2/PCI control mapping. |
| create_draft_pr | Opens a GitHub draft PR with the failing test attached. |
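
As an illustration of the prepare_environment step, the sketch below turns an event's modules mapping into exact requirement pins. It assumes the event payload exposes modules as a package-name to version dictionary; pins_from_event is a hypothetical helper, not the tool's real interface.

```python
def pins_from_event(event: dict) -> list[str]:
    """Build exact '==' pins from event['modules'], if the event has any."""
    modules = event.get("modules") or {}
    # Sorted so the resulting requirements list is stable across runs.
    return [f"{name}=={version}" for name, version in sorted(modules.items())]


# Made-up package versions, purely for illustration.
event = {"modules": {"requests": "2.31.0", "django": "4.2.7"}}
print("\n".join(pins_from_event(event)))
```

An event without a modules entry yields an empty list, which matches the "if available" wording above: the sandbox would then fall back to whatever the repo itself declares.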

Scientific context engine: When repro falls short, a deterministic Observe → Hypothesize → Experiment → Verify loop probes breadcrumbs, RAG, and PyPI for why the crash can't reproduce (DB state, async runtime, missing globals, dep drift). Never produces test code — only structured notes that direct the supervisor toward a different deterministic env state.

Source resolution: Fuzzy resolver tries absolute path, repo-relative, leading-component-strip, and rglob-basename matches — so prod paths like /app/src/billing/x.py find dev src/billing/x.py. On first failure, the supervisor searches for the path and retries with hints. After two attempts, it refuses to ship and flags for human review.
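
The resolution order described above can be sketched as follows. This is an approximation under assumptions: resolve_source is a hypothetical name, and returning a basename match only when it is unambiguous is a design choice made for this example, not something the project documents.

```python
from pathlib import Path, PurePosixPath
from typing import Optional


def resolve_source(prod_path: str, repo: Path) -> Optional[Path]:
    """Map a production file path onto a file in the local repo, or None."""
    posix = PurePosixPath(prod_path)
    if not posix.name:
        return None
    # 1. The absolute path may exist as-is (same layout in dev and prod).
    if posix.is_absolute() and Path(prod_path).is_file():
        return Path(prod_path)
    parts = [part for part in posix.parts if part != "/"]
    # 2. Repo-relative, then 3. strip leading components one at a time,
    #    so /app/src/billing/x.py is tried as src/billing/x.py, and so on.
    for start in range(len(parts)):
        candidate = repo.joinpath(*parts[start:])
        if candidate.is_file():
            return candidate
    # 4. Fall back to a recursive basename search.
    matches = [m for m in sorted(repo.rglob(posix.name)) if m.is_file()]
    return matches[0] if len(matches) == 1 else None
```

Returning None on an ambiguous basename keeps the resolver from silently picking the wrong file, which fits the stated policy of refusing rather than guessing.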

Verified exception match: The sandbox exception type must match the Sentry-captured exception type exactly. Anything else refuses to ship as evidence.
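
A minimal sketch of what an exact type match means in practice. It assumes the captured type is available as a class-name string; check_repro and the reason strings are hypothetical and only illustrate the rule.

```python
from typing import Callable, Optional, Tuple


def check_repro(
    fn: Callable, kwargs: dict, captured_type: str
) -> Tuple[bool, Optional[str]]:
    """Call fn(**kwargs) and compare the raised type name to captured_type."""
    try:
        fn(**kwargs)
    except Exception as exc:
        raised = type(exc).__name__
        # Exact name comparison: a subclass or parent class does not count.
        if raised == captured_type:
            return True, None
        return False, f"exception type mismatch: got {raised}"
    return False, "no exception raised"


def lookup(mapping, key):
    return mapping[key]


print(check_repro(lookup, {"mapping": {}, "key": "a"}, "KeyError"))
print(check_repro(lookup, {"mapping": {}, "key": "a"}, "LookupError"))
```

The second call is refused even though KeyError is a subclass of LookupError, because only the exact type counts as evidence.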


Docker sandbox

  • 128 MB RAM cap, 50% CPU, 50 PIDs
  • Airgapped (no network)
  • nobody user, read-only rootfs
  • 15s default per-test timeout
  • PYTHONHASHSEED retry: if the synthesized test passes when a crash was expected, it is rerun under several PYTHONHASHSEED values to catch hash-order-dependent failures
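
The limits above map onto standard docker run flags. The sketch below builds such a command line; it is an assumption about how the sandbox could be launched, not the project's actual invocation. The image name, the /work mount point, and the seed values are hypothetical.

```python
def sandbox_argv(image: str, host_dir: str, hash_seed: int = 0) -> list[str]:
    """Build a locked-down `docker run` command for one pytest run."""
    return [
        "docker", "run", "--rm",
        "--memory=128m",       # 128 MB RAM cap
        "--cpus=0.5",          # 50% of one CPU
        "--pids-limit=50",     # at most 50 processes
        "--network=none",      # airgapped
        "--user=nobody",
        "--read-only",         # read-only root filesystem
        "--tmpfs", "/tmp",     # scratch space, since the rootfs is read-only
        "-v", f"{host_dir}:/work:ro",
        "-w", "/work",
        "-e", f"PYTHONHASHSEED={hash_seed}",
        "-e", "TZ=UTC",
        image,
        # The cache plugin is disabled because /work is mounted read-only.
        "python", "-m", "pytest", "-q", "-p", "no:cacheprovider",
    ]


# Retry under several hash seeds; these seed values are arbitrary.
for seed in (0, 1, 2):
    argv = sandbox_argv("repro-sandbox:latest", "/tmp/repro", hash_seed=seed)
    print(" ".join(argv))
```

A caller could enforce the 15s limit from outside the container, for example by passing timeout=15 to subprocess.run when executing the command.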

PII redaction

Runs before any LLM call and before any byte lands in the artifact:

  • PAN (Luhn-validated credit card numbers)
  • SSN, email, JWT, API keys
  • Field-name scrubbing (e.g. password, token, secret, card_number)
  • Request headers and query strings
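
A small sketch of two of the techniques listed above: Luhn-validated PAN detection and field-name scrubbing. The regular expressions, placeholder strings, and function names are assumptions for illustration, and cover far fewer cases than a production redactor would.

```python
import re

SENSITIVE_KEYS = {"password", "token", "secret", "card_number"}
_EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
_PAN = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")  # 13 to 19 digits


def luhn_ok(digits: str) -> bool:
    """Standard Luhn checksum: double every second digit from the right."""
    total = 0
    for index, char in enumerate(reversed(digits)):
        digit = int(char)
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def _redact_pan(match: re.Match) -> str:
    digits = re.sub(r"\D", "", match.group())
    # Only digit runs that pass the Luhn check are treated as card numbers.
    return "[PAN]" if luhn_ok(digits) else match.group()


def redact(value, key=None):
    """Recursively scrub dicts, lists, and strings."""
    if isinstance(key, str) and key.lower() in SENSITIVE_KEYS:
        return "[REDACTED]"
    if isinstance(value, dict):
        return {k: redact(v, k) for k, v in value.items()}
    if isinstance(value, list):
        return [redact(item) for item in value]
    if isinstance(value, str):
        return _EMAIL.sub("[EMAIL]", _PAN.sub(_redact_pan, value))
    return value


print(redact({"password": "hunter2", "note": "mail a@b.com card 4111 1111 1111 1111"}))
```

The Luhn check is what keeps ordinary long numbers, such as order IDs, from being scrubbed as card numbers.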

Audit trail

All LLM reasoning, tool calls, critic scores, and strategy outcomes are recorded in an AuditSession. The artifact preamble explicitly attests: llm_in_evidence_path: false. Control mapping on every artifact: SOC2-CC7.3, SOC2-CC7.4, PCI-DSS-4.0-12.10.5.


What reproduces well

  • Input validation bugs
  • NoneType mismatches
  • Decimal / type coercion errors
  • Off-by-one, ordering, idempotency issues
  • Anything where the inputs that crashed the call are captured in the Sentry frame

What doesn't

  • Race conditions (frame locals don't capture thread interleaving)
  • Bugs that depend on live DB rows or Redis state not in the frame
  • C extension crashes
  • Distributed failures spanning services
  • Timezone/DST edge cases (sandbox runs TZ=UTC)

When it can't reproduce cleanly, it says so with a structured reason. It never guesses.


Sentry setup

Frame locals need to be enabled:

Project Settings → SDK Setup → Enable "Send default PII" — or in your SDK config:

import sentry_sdk
sentry_sdk.init(dsn="...", send_default_pii=True)

License

MIT
