testwall

Enforce test immutability for agentic TDD workflows.

LLM coding agents routinely cheat test gates by weakening assertions, deleting failing tests, modifying config, or special-casing inputs. Research (ImpossibleBench, arXiv 2510.20270) shows frontier models exploit test cases 76% of the time when given write access, but cheating drops to near zero when tests are read-only. testwall enforces that boundary.

How it works

testwall init       # snapshot test files + compute SHA-256 checksums
testwall lock       # chmod 444 — agent can read but not modify
testwall run        # restore from snapshot, then execute tests
testwall verify     # check checksums — exit 1 on any mismatch
testwall accept     # verify + unlock + clean up snapshot

Even if an agent bypasses file permissions, testwall run restores the original tests from snapshot before executing them. testwall verify catches any tampering at the checksum level.
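
To make the snapshot-plus-checksum idea concrete, here is a minimal Python sketch of what an init-style step could look like. It is not testwall's implementation: the function names, the manifest shape, and the layout of the snapshot directory are assumptions made for illustration.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def snapshot(root: Path, patterns: list[str], store: Path) -> dict[str, str]:
    """Copy every matching file into `store` and return {relative path: digest}.

    `store` stands in for a snapshot directory; its layout is hypothetical.
    """
    manifest: dict[str, str] = {}
    for pattern in patterns:
        for src in sorted(root.glob(pattern)):
            # Skip directories and anything already inside the snapshot store.
            if not src.is_file() or store in src.parents:
                continue
            rel = src.relative_to(root)
            dest = store / rel
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dest)
            manifest[rel.as_posix()] = sha256_of(src)
    return manifest
```

Keeping both a byte-for-byte copy and a digest serves two purposes: the copy lets the original tests be restored before a run, and the digest lets tampering be detected without comparing whole files.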

Install

# Rust
cargo install testwall

# Python
pip install testwall

# Node
npm install -g testwall

Quick start

# 1. Initialize — snapshots all test files matching default patterns
testwall init

# 2. Lock test files before handing off to an implementing agent
testwall lock

# 3. Agent implements... then run tests against the immutable snapshot
testwall run

# 4. If tests pass and nothing was tampered with, accept the result
testwall accept

Commands

testwall init [-p PATTERN...] [-c CMD]

Scan for test files, compute checksums, and store snapshots in .testwall/.

Without -p, uses built-in patterns for Python, Rust, JavaScript/TypeScript, Go, Java, and Kotlin — plus common config files like pytest.ini, jest.config.*, and .cargo/config.toml.

testwall init                              # auto-detect
testwall init -p "tests/**/*.py" -p "conftest.py"  # explicit patterns
testwall init -c "pytest -x"              # record the test command

testwall lock

Set all snapshotted test files to read-only (chmod 444).

testwall unlock

Restore write permissions on test files.
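
The lock and unlock steps come down to permission bits. The sketch below shows one way to express them in Python on a POSIX system. The function names are hypothetical, and the exact mode that testwall restores on unlock is an assumption (here, the owner write bit is added back).

```python
import os
import stat

# Read-only for owner, group, and others: the same bits as chmod 444.
READ_ONLY = stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH


def lock(paths):
    """Set each file to mode 0o444 so it can be read but not modified."""
    for p in paths:
        os.chmod(p, READ_ONLY)


def unlock(paths):
    """Add the owner write bit back, leaving the other bits unchanged."""
    for p in paths:
        mode = stat.S_IMODE(os.stat(p).st_mode)
        os.chmod(p, mode | stat.S_IWUSR)
```

A process running as the file owner can change these bits back, which is why permissions alone are not treated as the full defense: restoring from snapshot and verifying checksums cover that case.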

testwall run [-c CMD] [-- extra args]

Restore test files from snapshot, then execute the test runner. This is the tamper-proof execution path — even if the agent modified the working copies, the originals run.

testwall run                    # use command from init or auto-detect
testwall run -c "pytest"        # override test command
testwall run -- -x --no-header  # forward args to test runner
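
The restore-then-execute path can be sketched as follows. This is an illustration under assumptions, not testwall's code: the snapshot directory is assumed to mirror the project tree, the function names are hypothetical, and the unlink step relies on POSIX semantics.

```python
import shutil
import subprocess
from pathlib import Path


def restore(store: Path, root: Path) -> list[str]:
    """Overwrite working copies with snapshot copies; return restored paths."""
    restored = []
    for src in sorted(store.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(store)
        dest = root / rel
        dest.parent.mkdir(parents=True, exist_ok=True)
        if dest.exists():
            # A read-only working copy cannot be overwritten in place, so it is
            # removed first; on POSIX this needs write access to the directory.
            dest.unlink()
        shutil.copyfile(src, dest)
        restored.append(rel.as_posix())
    return restored


def restore_and_run(store: Path, root: Path, cmd: list[str], extra=()) -> int:
    """Restore the originals, run the test command, and return its exit code."""
    restore(store, root)
    return subprocess.run(cmd + list(extra), cwd=root).returncode
```

Because the restore happens immediately before the runner starts, edits made to the working copies between locking and running have no effect on which test code executes.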

testwall verify [--report-only]

Compare current test file checksums against the manifest. Exits with code 1 if any file was modified or deleted.

testwall verify                 # fail on mismatch
testwall verify --report-only   # print report, always exit 0
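
A checksum comparison of this kind might look like the sketch below. The manifest format (a mapping from relative path to hex digest) and the function names are assumptions; only the exit-code behavior mirrors what is described above.

```python
import hashlib
from pathlib import Path


def verify(root: Path, manifest: dict[str, str]) -> dict[str, list[str]]:
    """Compare current files against recorded digests.

    Returns the relative paths that were modified or deleted.
    """
    report: dict[str, list[str]] = {"modified": [], "deleted": []}
    for rel, expected in sorted(manifest.items()):
        path = root / rel
        if not path.is_file():
            report["deleted"].append(rel)
        elif hashlib.sha256(path.read_bytes()).hexdigest() != expected:
            report["modified"].append(rel)
    return report


def exit_code(report: dict[str, list[str]], report_only: bool = False) -> int:
    """Return 1 on any mismatch, or always 0 in report-only mode."""
    if report_only:
        return 0
    return 1 if report["modified"] or report["deleted"] else 0
```

Separating the report from the exit code keeps the report-only mode trivial: the same comparison runs, and only the final status differs.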

testwall accept

The merge gate. Runs verification, then unlocks files and cleans up the snapshot directory. Rejects if any tampering is detected.

testwall status

Show the current manifest: file count, lock state, snapshot presence, patterns, and test command.

Default patterns

testwall ships with patterns for common test conventions:

Ecosystem     Patterns
Python        test_*.py, *_test.py, tests/**/*.py, conftest.py
Rust          tests/**/*.rs
JS/TS         **/*.test.{js,ts,tsx}, **/*.spec.{js,ts,tsx}
Go            **/*_test.go
Java/Kotlin   src/test/**/*.java, src/test/**/*.kt
Config        pytest.ini, setup.cfg, jest.config.*, vitest.config.*, .cargo/config.toml
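
To see how patterns like these select files, the sketch below expands brace groups such as {js,ts,tsx} and then applies Python's pathlib globbing. It is only an approximation: testwall's own matching rules are not documented here, the function names are hypothetical, and a bare pattern like test_*.py matches only at the top level under pathlib, which may differ from testwall's behavior.

```python
import re
from pathlib import Path

# Matches an innermost {a,b,c} group.
_BRACES = re.compile(r"\{([^{}]*)\}")


def expand_braces(pattern: str) -> list[str]:
    """Expand the first {a,b,c} group, then recurse on each result."""
    match = _BRACES.search(pattern)
    if match is None:
        return [pattern]
    head, tail = pattern[: match.start()], pattern[match.end():]
    expanded: list[str] = []
    for option in match.group(1).split(","):
        expanded.extend(expand_braces(head + option + tail))
    return expanded


def find_test_files(root: Path, patterns: list[str]) -> list[str]:
    """Return sorted relative paths of files matching any pattern."""
    found: set[str] = set()
    for pattern in patterns:
        for concrete in expand_braces(pattern):
            for path in root.glob(concrete):
                if path.is_file():
                    found.add(path.relative_to(root).as_posix())
    return sorted(found)
```

Brace expansion is handled separately because Python's glob syntax has no equivalent of {js,ts,tsx}; each alternative becomes its own pattern before matching.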

Typical workflow

  You (test author)          testwall            Agent (implementer)
  ─────────────────          ────────            ───────────────────
  Write tests
          ├──── testwall init ────►
          ├──── testwall lock ────►
          │                              Agent implements code
          │                              Agent tries to edit tests → DENIED
          │                              Agent runs testwall run
          │                        ◄──── tests execute from snapshot
          │                              Tests pass
          ├──── testwall accept ──►
          │     ✓ checksums match
          │     ✓ files unlocked
          │     ✓ snapshot cleaned

What it catches

  • Weakened assertions (assert x > 0 → assert True)
  • Deleted test cases
  • Modified test config (conftest.py, jest.config.*)
  • Special-cased test inputs
  • Swapped test runner flags
  • Any byte-level change to snapshotted files

License

MIT
