A black-box recorder for AI coding agents and shell commands.

TraceForge

See exactly what your AI coding agent changed, printed, and broke.

TraceForge is a local-first black-box recorder for AI coding agents and shell commands. For each run it captures the command, stdout, stderr, exit code, duration, Git diff, changed files, and timeline events. From that record it produces risk findings, run comparisons, HTML reports, and redacted JSON exports, and it lets you inspect everything in a localhost dashboard.

Use it when you want an audit trail for Claude Code, Codex, Aider, opencode, or any command that can change a repository.

pip install traceforge-ai
traceforge run --live -- python test_calculator.py
traceforge dashboard

Why TraceForge?

AI coding agents are powerful, but they are still hard to audit:

  • What command did the agent run?
  • What exactly changed in the repository?
  • Which output or error appeared before the patch was made?
  • Did the run touch dependency files, CI workflows, or sensitive-looking files?
  • Can two agent attempts be compared side by side?
  • Can a run be exported and attached to an issue or code review?

TraceForge gives every run a local, reviewable trace.

The workflow

Record an agent or command:

traceforge run --live -- python test_calculator.py

Open the dashboard:

traceforge dashboard

Compare attempts or export a safe trace:

traceforge compare <run_a> <run_b>
traceforge export <run_id> --redact --out trace.redacted.json

The dashboard shows run metrics, searchable run history, command details, timeline events, risk report, stdout, stderr, patch diff, changed files, compare view, and full JSON.

Typical use cases

  • Review an AI coding agent run: see the exact command, output, file changes, patch, and risk signals.
  • Compare two fixes: run Claude/Codex/Aider twice and compare duration, exit code, files touched, and patch size.
  • Share a reproducible trace: export JSON with --redact before attaching it to an issue, PR, or bug report.
  • Debug flaky repair attempts: keep each run as a local timeline instead of relying on terminal scrollback.
  • Audit risky edits: flag changes to dependency files, CI workflows, sensitive-looking paths, and possible secrets.

30-second demo

From any machine with Python 3.10+ and Git:

python -m pip install traceforge-ai
traceforge demo
cd traceforge-demo-project
traceforge run --live -- python test_calculator.py
traceforge dashboard

You should see one recorded run with stdout, exit code, timeline events, changed files, risk findings, and a replayable patch view.

When to use it

Use TraceForge when:

  • you are letting an AI coding agent modify a repository
  • you want a local audit trail for command runs
  • you need to compare two repair attempts
  • you want to export a reproducible run report for review

Do not use TraceForge as:

  • a sandbox
  • a secret scanner replacement
  • a full CI system
  • a child-process monitor for every process an agent may spawn

Git shows what changed. Terminal logs show what printed. TraceForge links the command, output, timeline, risk findings, and patch into one replayable local trace.

Install

Install TraceForge from PyPI:

python -m pip install traceforge-ai
traceforge version

Or clone the repository and install it in editable mode:

git clone https://github.com/zhangyeS12/traceforge.git
cd traceforge
python -m pip install -e .
traceforge version

Requirements:

  • Python 3.10+
  • Git
  • A Git repository for meaningful diff capture

Optional tools checked by traceforge doctor:

  • Node / npm
  • Rust / Cargo

Quick start

Initialize TraceForge in a Git project:

traceforge init
traceforge doctor

Create a simple script:

python -c "open('hello.py', 'w').write('print(\"hello from traceforge\")\n')"

Record a command:

traceforge run --live -- python hello.py

Open the dashboard:

traceforge dashboard

On Windows PowerShell:

Set-Content hello.py 'print("hello from traceforge")'
traceforge run --live -- python hello.py
traceforge dashboard

Browser workflow

Start the local dashboard:

traceforge dashboard

Then type a command directly in the browser, for example:

python modify_hello.py

TraceForge will run it locally, record stdout/stderr, capture Git diff, refresh the run list, and open the new run detail automatically.

The dashboard runs on localhost by default:

http://127.0.0.1:8787

Core features

  • CLI recorder: traceforge run -- <command> records one command run.
  • Dashboard runner: run commands directly from the browser dashboard.
  • Agent adapters: wrap shell, codex, claude, aider, opencode, or custom agent commands with traceforge agent run.
  • Replayable timeline: command start, stdout/stderr chunks, process exit, Git snapshots, diff capture, file changes, report generation, and run completion.
  • Run comparison: compare two runs by exit code, duration, changed files, event count, patch size, and file overlap.
  • Security risk report: scan risky commands, sensitive-looking files, dependency/CI changes, broad file changes, traceback output, and possible secret material.
  • Live output: --live streams stdout/stderr while still recording artifacts.
  • Run-attributed Git diff capture: records files changed by the current run, filters out pre-existing unchanged dirty files, and includes untracked new file contents in the patch.
  • Local SQLite store: traces live under .traceforge/ inside your project.
  • HTML reports: self-contained report pages for each run.
  • JSON export: traceforge export <run_id> creates machine-readable traces; --redact masks common secrets and local user paths before sharing.
  • Doctor checks: traceforge doctor checks Python, Git, optional toolchains, workspace, and database.
  • Selftest: traceforge selftest creates a temporary Git repo and verifies the full record -> diff -> report -> JSON -> compare -> risk flow.
  • Version guard: traceforge version-check validates version metadata; traceforge reindex rebuilds the local run index from existing run artifacts.
  • Release checks: traceforge release-check validates local source trees and release zip layout.
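
The run-attributed diff capture described above can be pictured as a comparison of two snapshots taken before and after the command. The sketch below is an illustration only, not TraceForge's implementation: the function name attribute_changes and the snapshot format (a mapping from path to content hash) are assumptions made for this example.

```python
def attribute_changes(before: dict[str, str], after: dict[str, str]) -> dict[str, str]:
    """Return {path: status} for files whose state differs between snapshots.

    Hypothetical helper: each snapshot maps a file path to a content hash.
    """
    changes = {}
    for path in sorted(before.keys() | after.keys()):
        if path not in before:
            changes[path] = "added"
        elif path not in after:
            changes[path] = "deleted"
        elif before[path] != after[path]:
            changes[path] = "modified"
        # Same hash before and after: the file may already have been dirty,
        # but this run did not touch it, so it is left out of the result.
    return changes


# notes.txt was dirty before the run and is unchanged, so it is filtered out.
before = {"app.py": "h1", "notes.txt": "dirty-h2", "old.py": "h3"}
after = {"app.py": "h1-new", "notes.txt": "dirty-h2", "new.py": "h4"}
print(attribute_changes(before, after))
# {'app.py': 'modified', 'new.py': 'added', 'old.py': 'deleted'}
```

Comparing snapshots rather than reading the working tree once is what lets a recorder separate "changed by this run" from "was already dirty".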

CLI reference

traceforge init
traceforge doctor [--json]
traceforge run [--live] [--shell] [--no-propagate-exit] -- <command>
traceforge list [--limit 20]
traceforge show <run_id>
traceforge timeline <run_id> [--json]
traceforge compare <run_a> <run_b> [--json]
traceforge risk <run_id> [--json]
traceforge agent list
traceforge agent doctor
traceforge agent run <adapter> [--live] [--preview] -- <task-or-command>
traceforge report <run_id>
traceforge open [run_id]
traceforge dashboard [--host 127.0.0.1] [--port 8787] [--no-open]
traceforge diff <run_a> <run_b>  # alias for compare
traceforge export <run_id> [--out trace.json] [--redact]
traceforge clean [--yes] [--all]
traceforge selftest [--json]
traceforge version-check
traceforge reindex [--json]
traceforge release-check [--zip path] [--json]
traceforge demo [path]
traceforge version

Shellless vs shell mode

By default, TraceForge runs commands without a shell:

traceforge run -- python hello.py

This is safer and avoids many Windows quoting issues.

Use shell mode only when you need shell syntax such as &&, pipes, or redirects:

traceforge run --shell -- "npm test && npm run lint"

In the dashboard, enable the shell checkbox for the same behavior.
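
The two modes mirror a distinction that Python's standard subprocess module also makes. The sketch below does not show how TraceForge runs commands internally; it only illustrates, using the standard library, why an argument list sidesteps quoting problems and why shell syntax such as && needs a shell.

```python
import subprocess
import sys

# Shellless: the command is a list of arguments, so no shell parses it.
# The text "a && b" reaches the child process as one literal argument.
shellless = subprocess.run(
    [sys.executable, "-c", "import sys; print(sys.argv[1])", "a && b"],
    capture_output=True,
    text=True,
)
print(shellless.stdout.strip())  # a && b

# Shell mode: the whole string is handed to the system shell, which reads
# "&&" as "run the second command only if the first one succeeds".
shell = subprocess.run(
    "echo first && echo second",
    shell=True,
    capture_output=True,
    text=True,
)
print(shell.stdout.split())  # ['first', 'second']
```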

Agent adapters

List available adapters:

traceforge agent list
traceforge agent doctor

Run a passthrough command through the adapter layer:

traceforge agent run shell -- python modify_hello.py

Preview an agent command without running it:

traceforge agent run codex --preview -- "fix the failing tests"

TraceForge keeps adapters thin: the adapter builds the local command, and the core recorder still captures stdout, stderr, Git diff, timeline, compare, and risk report.

See docs/agent-recipes.md for Codex, Claude Code, Aider, opencode, shell, and custom adapter examples.

Run comparison

Compare two attempts:

traceforge compare <run_a> <run_b>

TraceForge compares:

  • exit code
  • duration
  • event count
  • changed file count
  • patch size
  • common files
  • files only changed in one run
  • status changes by file
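
To make the compared fields concrete, here is a minimal sketch of such a comparison. It is written under stated assumptions: RunSummary and compare_runs are hypothetical names invented for this example, and the fields do not reflect TraceForge's actual trace schema.

```python
from dataclasses import dataclass, field


@dataclass
class RunSummary:
    """Hypothetical summary of one run; not TraceForge's real schema."""
    exit_code: int
    duration_s: float
    changed_files: dict[str, str] = field(default_factory=dict)  # path -> status


def compare_runs(a: RunSummary, b: RunSummary) -> dict:
    files_a, files_b = set(a.changed_files), set(b.changed_files)
    common = files_a & files_b
    return {
        "exit_code": (a.exit_code, b.exit_code),
        "duration_delta_s": round(b.duration_s - a.duration_s, 3),
        "common_files": sorted(common),
        "only_in_a": sorted(files_a - files_b),
        "only_in_b": sorted(files_b - files_a),
        # Files touched by both runs but left in a different state.
        "status_changes": {
            path: (a.changed_files[path], b.changed_files[path])
            for path in sorted(common)
            if a.changed_files[path] != b.changed_files[path]
        },
    }


run_a = RunSummary(1, 2.5, {"calc.py": "modified", "test_calc.py": "modified"})
run_b = RunSummary(0, 1.75, {"calc.py": "modified", "test_calc.py": "deleted",
                             "README.md": "modified"})
result = compare_runs(run_a, run_b)
print(result["only_in_b"])       # ['README.md']
print(result["status_changes"])  # {'test_calc.py': ('modified', 'deleted')}
```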

Risk report

Generate a security-oriented run report:

traceforge risk <run_id>

The risk report looks for:

  • risky command substrings
  • sensitive-looking file paths
  • dependency or package-management file changes
  • CI workflow changes
  • broad file changes
  • possible secret material in patches
  • traceback output in stderr
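
Checks of this kind can be approximated with simple substring and glob matching. The sketch below is illustrative only: the rule tables, the threshold, and the assess function are assumptions made for this example, and TraceForge's real rules may differ.

```python
from fnmatch import fnmatch

# Hypothetical rules for illustration; not TraceForge's actual rule set.
RISKY_SUBSTRINGS = ["rm -rf", "--force"]
PATH_RULES = {
    "dependency": ["requirements*.txt", "pyproject.toml", "package.json", "Cargo.toml"],
    "ci": [".github/workflows/*"],
    "sensitive": [".env", "*.pem", "*id_rsa*"],
}
BROAD_CHANGE_THRESHOLD = 20  # assumed cutoff for "broad file changes"


def assess(command: str, changed_files: list[str], stderr: str) -> list[str]:
    findings = []
    for needle in RISKY_SUBSTRINGS:
        if needle in command:
            findings.append(f"risky command substring: {needle}")
    for path in changed_files:
        for kind, patterns in PATH_RULES.items():
            if any(fnmatch(path, pattern) for pattern in patterns):
                findings.append(f"{kind} file changed: {path}")
    if len(changed_files) > BROAD_CHANGE_THRESHOLD:
        findings.append(f"broad change: {len(changed_files)} files")
    if "Traceback (most recent call last)" in stderr:
        findings.append("traceback in stderr")
    return findings


findings = assess(
    "git push --force",
    ["pyproject.toml", ".github/workflows/ci.yml", "src/app.py"],
    "Traceback (most recent call last):\n  ...",
)
for finding in findings:
    print(finding)
```

Pattern matching like this yields signals for a reviewer to look at, not verdicts, which is consistent with the advice above not to treat TraceForge as a secret scanner replacement.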

Sharing traces safely

TraceForge is local-first, but exported traces can still contain private source, terminal output, local usernames, tokens, or API keys. Use redacted export before attaching a trace to an issue or review:

traceforge export <run_id> --redact --out trace.redacted.json

Redaction is a best-effort sharing aid, not a guarantee. Review exported files before posting them publicly.
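
The following sketch shows why pattern-based redaction can only be best-effort. The patterns and the redact function are assumptions made for this example; they are not the rules that --redact applies.

```python
import re

# Hypothetical patterns for illustration; not the rules used by --redact.
PATTERNS = [
    # key=value or key: value pairs with a secret-looking key name
    (re.compile(r"(?i)\b(api[_-]?key|token|password)\s*[=:]\s*\S+"), r"\1=<REDACTED>"),
    # the user name segment of a POSIX or macOS home directory path
    (re.compile(r"/(home|Users)/[^/\s]+"), r"/\1/<user>"),
]


def redact(text: str) -> str:
    for pattern, replacement in PATTERNS:
        text = pattern.sub(replacement, text)
    return text


sample = "API_KEY=sk-12345 loaded from /home/alice/project/.env"
print(redact(sample))
# API_KEY=<REDACTED> loaded from /home/<user>/project/.env

# A secret that matches no pattern passes through unchanged.
print(redact("connecting with sk-12345"))
# connecting with sk-12345
```

The second call is the reason to review an exported file by hand before posting it.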

Local data layout

TraceForge writes local data to .traceforge/:

.traceforge/
  config.json
  traceforge.db
  runs/
    <run_id>/
      stdout.txt
      stderr.txt
      patch.diff
      trace.json
  reports/
    index.html
    latest.html
    <run_id>.html

.traceforge/ should stay out of Git. The default .gitignore already lists it, so Git ignores the directory.
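
Because the artifacts are plain files, they can be inspected without the CLI. The helper below is a sketch that assumes only the directory layout shown above; list_run_artifacts is a hypothetical name, not part of TraceForge.

```python
from pathlib import Path


def list_run_artifacts(project_root) -> dict[str, list[str]]:
    """Map each run directory name to the artifact files it contains."""
    runs_dir = Path(project_root) / ".traceforge" / "runs"
    if not runs_dir.is_dir():
        return {}  # project not initialized, or no runs recorded yet
    return {
        run_dir.name: sorted(p.name for p in run_dir.iterdir() if p.is_file())
        for run_dir in sorted(runs_dir.iterdir())
        if run_dir.is_dir()
    }


# Prints one line per recorded run in the current project, if any.
for run_id, files in list_run_artifacts(".").items():
    print(run_id, files)
```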

Architecture

CLI / dashboard command
        -> security pre-check
        -> Git snapshot before
        -> subprocess execution
        -> stdout/stderr capture
        -> Git snapshot after + patch diff
        -> SQLite trace storage
        -> timeline events + risk assessment
        -> HTML report / JSON export / dashboard API
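
The subprocess execution and capture stages of this pipeline can be sketched with the standard library. This is a simplified illustration, not TraceForge's code: the record function and the keys of the returned dictionary are assumptions, and the Git snapshot, storage, and risk stages are omitted.

```python
import subprocess
import sys
import time


def record(argv: list[str]) -> dict:
    """Run a command without a shell and capture its basic trace data."""
    started = time.perf_counter()
    proc = subprocess.run(argv, capture_output=True, text=True)
    return {
        "command": argv,
        "stdout": proc.stdout,
        "stderr": proc.stderr,
        "exit_code": proc.returncode,
        "duration_s": time.perf_counter() - started,
    }


trace = record([sys.executable, "-c", "import sys; print('ok'); sys.exit(3)"])
print(trace["exit_code"], trace["stdout"].strip())  # 3 ok
```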

Public-readiness checks

Before publishing a release or debugging a user's environment, run:

traceforge doctor
traceforge version-check
traceforge reindex
traceforge selftest
traceforge release-check

Before sharing a zip release:

traceforge release-check --zip traceforge_v1_3_2.zip

Roadmap

Near term

  • Side-by-side diff viewer.
  • Richer compare views for timeline differences and artifact differences.
  • Configurable security rules and risk thresholds.
  • Test-result parsers for pytest, npm, Jest, and cargo test.
  • Run tags, notes, and search improvements.
  • Timeline filtering and event detail drawers.
  • Better dashboard empty states and error recovery.

Agent-focused roadmap

  • More robust adapters for Codex, Claude Code, Cline, OpenHands, and custom agent wrappers.
  • MCP tool-call recording.
  • Prompt-injection and sensitive-file audit layer.
  • Docker sandbox execution.
  • Network allow/deny policy.
  • Run replay, agent adapters, and benchmark-style comparison reports.

Long-term vision

TraceForge should become the local observability layer for agentic coding: every command, file change, tool call, test failure, and generated patch should be reproducible, inspectable, and safe to share.

Resume description

Built TraceForge, a local black-box recorder for AI coding agents and shell commands. Implemented a CLI and browser dashboard that capture subprocess output, Git patches, changed files, fine-grained timeline events, run comparison, runtime metadata, security findings, SQLite traces, HTML replay reports, JSON exports, selftests, release checks, and GitHub CI.

Contributing

Good first contributions:

  • Add a test-result parser.
  • Improve HTML diff rendering.
  • Add support for tags and run notes.
  • Add Docker sandbox execution.
  • Add adapters for common coding agents.

See CONTRIBUTING.md for development setup.

Security

See SECURITY.md for supported versions, reporting guidance, and current security limitations.

License

MIT.
