
Autonomous reverse-engineering agent with a source-aware reverser/checker loop, objective verification, parity checks, and a Ghidra backend


re-agent

Autonomous reverse-engineering agent — source-aware reverser/checker loop, objective verifier, parity engine, and Ghidra backend.

Overview

Demo: YouTube

re-agent automates a reverse-engineering workflow by combining a reverser/checker loop with Ghidra decompilation through ghidra-ai-bridge. The current pipeline also retrieves nearby project source context during generation and runs a conservative structural verifier before accepting checker passes.

re-agent reverse --class CTrain
    │
    ├── Config (re-agent.yaml + env + CLI)
    │   └── project_profile (stub_markers, hook_patterns, source_layout)
    │
    ├── Orchestrator (single / class runner)
    │   ├── Function Picker (ranks by caller count, filters completed)
    │   ├── Context Gatherer (decompile + xrefs + structs + source retrieval)
    │   │
    │   ├── Agent Loop (reverser → checker → fix, max N rounds)
    │   │   ├── LLM Providers: Claude | OpenAI-compatible APIs | Codex CLI
    │   │   └── Prompt Templates (customizable .md files)
    │   │
    │   ├── Objective Verifier (call-count + control-flow sanity checks)
    │   │
    │   ├── Parity Engine (GREEN/YELLOW/RED verification gate)
    │   │   ├── Source Indexer (C++ body parser)
    │   │   ├── 11 Heuristic Signals (all configurable/toggleable)
    │   │   └── Semantic Rules + Manual Approvals
    │   │
    │   └── Session State (JSON progress file)
    │
    └── RE Backend: ghidra-ai-bridge
        └── Capability flags → graceful degradation
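The Agent Loop step in the diagram (reverser → checker → fix, max N rounds) can be pictured roughly as below. This is only a sketch of the control flow, not re-agent's actual code: `Review`, `run_agent_loop`, and the two stub callables are hypothetical names, and the real loop calls an LLM provider where the stubs sit.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Review:
    passed: bool
    feedback: str = ""


def run_agent_loop(
    reverse: Callable[[str, str], str],   # (context, feedback) -> candidate code
    check: Callable[[str, str], Review],  # (context, candidate) -> verdict
    context: str,
    max_rounds: int = 4,
) -> Optional[str]:
    """Return the first candidate the checker accepts, or None once the cap is hit."""
    feedback = ""
    for _ in range(max_rounds):
        candidate = reverse(context, feedback)
        review = check(context, candidate)
        if review.passed:
            return candidate
        feedback = review.feedback  # fed into the next fix attempt
    return None


# Stub reverser/checker standing in for LLM calls (assumptions for illustration).
def fake_reverse(context: str, feedback: str) -> str:
    return "fixed" if feedback else "draft"


def fake_check(context: str, candidate: str) -> Review:
    return Review(passed=(candidate == "fixed"), feedback="missing branch")


result = run_agent_loop(fake_reverse, fake_check, "decompile text")
```

Returning `None` at the cap, rather than the last candidate, keeps an unverified draft from being treated as a successful reversal.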

Requirements

  • Python 3.10+
  • ghidra-ai-bridge — re-agent uses this as its backend to decompile functions, fetch xrefs, read structs/enums, and query Ghidra. Install it and point it at your Ghidra project before running re-agent reverse.
  • One supported LLM setup:
    • ANTHROPIC_API_KEY for Claude
    • OPENAI_API_KEY for OpenAI-compatible APIs
    • a local codex CLI login for the Codex provider

Installation

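pip install auto-re-agent

The distribution is published as auto-re-agent (auto_re_agent-0.1.0); the installed command is re-agent.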

Quick Start

# 1. Initialize project config
re-agent init

# 2. Edit re-agent.yaml with your project settings

# 3. Reverse a single function
re-agent reverse --address 0x6F86A0

# 4. Reverse all functions in a class
re-agent reverse --class CTrain --max-functions 10

# 5. Run parity checks
re-agent parity --address 0x6F86A0

# 6. Check progress
re-agent status

Configuration

re-agent uses a layered configuration system (highest priority first): CLI flags > environment variables (RE_AGENT_*) > re-agent.yaml > defaults.
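The precedence order above can be illustrated with a small lookup function. This is a sketch under assumptions, not the project's loader: the dotted key names and the `resolve` and `env_name` helpers are hypothetical, and the key-to-variable mapping is inferred only from the `RE_AGENT_LLM_API_KEY` example in the sample config.

```python
from typing import Any, Mapping


def env_name(key: str) -> str:
    # Assumed mapping: "llm.api_key" -> "RE_AGENT_LLM_API_KEY"
    return "RE_AGENT_" + key.replace(".", "_").upper()


def resolve(
    key: str,
    cli: Mapping[str, Any],
    env: Mapping[str, str],
    file_cfg: Mapping[str, Any],
    defaults: Mapping[str, Any],
) -> Any:
    """Look a setting up in priority order: CLI > env > YAML file > defaults."""
    if cli.get(key) is not None:
        return cli[key]
    name = env_name(key)
    if name in env:
        return env[name]  # note: environment values arrive as strings
    if key in file_cfg:
        return file_cfg[key]
    return defaults.get(key)


defaults = {"llm.provider": "claude", "orchestrator.max_review_rounds": 4}
file_cfg = {"llm.provider": "openai"}
env = {"RE_AGENT_LLM_PROVIDER": "codex"}

from_env = resolve("llm.provider", {}, env, file_cfg, defaults)
from_cli = resolve("llm.provider", {"llm.provider": "claude"}, env, file_cfg, defaults)
from_default = resolve("orchestrator.max_review_rounds", {}, env, file_cfg, defaults)
```

Here the environment variable overrides the YAML value, and an explicit CLI value overrides both.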

llm:
  provider: claude           # claude | openai | openai-compat | codex
  model: claude-sonnet-4-5-20250929
  # api_key: set via RE_AGENT_LLM_API_KEY env var
  timeout_s: 1800

backend:
  type: ghidra-bridge
  cli_path: ~/ghidra-tools/ghidra

orchestrator:
  max_review_rounds: 4
  max_functions_per_class: 10
  objective_verifier_enabled: true

project_profile:
  source_root: ./source/game_sa
  hook_patterns:
    - 'RH_ScopedInstall\s*\(\s*(\w+)\s*,\s*(0x[0-9A-Fa-f]+)'
  stub_markers: ["NOTSA_UNREACHABLE"]
  stub_call_prefix: "plugin::Call"

See docs/configuration.md for all options.
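To check what a hook pattern captures before putting it in re-agent.yaml, it can be tried directly with Python's `re` module. The sketch below uses the pattern from the sample config; the `find_hooks` helper and the C++ snippet (including the function names and the second address) are made up for illustration.

```python
import re

# Same pattern as the hook_patterns entry in the sample config:
# group 1 is the function name, group 2 is the hex address.
HOOK_PATTERN = re.compile(r"RH_ScopedInstall\s*\(\s*(\w+)\s*,\s*(0x[0-9A-Fa-f]+)")


def find_hooks(source: str) -> dict[int, str]:
    """Map each hooked address to the function name installed at it."""
    return {int(m.group(2), 16): m.group(1) for m in HOOK_PATTERN.finditer(source)}


# Hypothetical source snippet; only 0x6F86A0 appears elsewhere in this README.
sample = """
void CTrain::InjectHooks() {
    RH_ScopedInstall(ProcessControl, 0x6F86A0);
    RH_ScopedInstall( FindCoors , 0x6F6CC0 );
}
"""

hooks = find_hooks(sample)
```

The `\s*` runs in the pattern are what let the second, loosely spaced call match as well.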

CLI Reference

| Command | Description |
| --- | --- |
| re-agent init | Generate re-agent.yaml config file |
| re-agent reverse --address ADDR | Reverse a single function |
| re-agent reverse --class CLASS | Reverse all functions in a class |
| re-agent reverse --dry-run | Show what would be reversed |
| re-agent parity --address ADDR | Run parity checks on a function |
| re-agent parity --filter REGEX | Run parity checks on functions matching a pattern |
| re-agent status | Show reversal progress |
| re-agent status --class CLASS | Show progress for a specific class |

LLM Providers

  • Claude (Anthropic SDK) — set ANTHROPIC_API_KEY
  • OpenAI / OpenAI-compatible — set OPENAI_API_KEY, optionally set base_url
  • Codex CLI — uses local codex exec with ChatGPT login credentials; no API key required

Parity Engine

The parity engine runs 11 configurable heuristic signals to verify reversed code matches the original binary:

| Signal | Level | Description |
| --- | --- | --- |
| Missing source | RED | No source body found for hooked function |
| Stub markers | RED | Source contains stub markers (e.g., NOTSA_UNREACHABLE) |
| Trivial stub | RED | Plugin-call heavy with tiny body and no control flow |
| Large ASM tiny source | RED | ASM >= 80 instructions but source <= 12 lines |
| Plugin-call heavy | YELLOW | Plugin calls dominate the function body |
| Short body | YELLOW | Body has fewer than 6 lines |
| Low call count | YELLOW | Decompile shows many callees but source has few |
| FP sensitivity | YELLOW | ASM has floating-point ops but source doesn't |
| Call count mismatch | YELLOW | Source call count differs significantly from ASM |
| NaN logic | YELLOW | Decompile has NaN handling but source doesn't |
| Inline wrapper | INFO | Function is a thin inline wrapper |
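A few of these signals are simple threshold checks, which the sketch below illustrates. It covers four of the eleven signals using the thresholds listed above. `FunctionFacts`, `evaluate`, and `overall` are hypothetical names, and the "worst level wins" aggregation is an assumption rather than documented behavior of the parity engine.

```python
from dataclasses import dataclass

LEVEL_ORDER = {"GREEN": 0, "YELLOW": 1, "RED": 2}


@dataclass
class FunctionFacts:
    asm_instructions: int
    source_lines: int          # 0 means no source body was found
    has_stub_marker: bool = False


def evaluate(facts: FunctionFacts) -> list[tuple[str, str]]:
    """Return (signal, level) pairs for the signals that fire."""
    if facts.source_lines == 0:
        return [("Missing source", "RED")]
    findings = []
    if facts.has_stub_marker:
        findings.append(("Stub markers", "RED"))
    if facts.asm_instructions >= 80 and facts.source_lines <= 12:
        findings.append(("Large ASM tiny source", "RED"))
    if facts.source_lines < 6:
        findings.append(("Short body", "YELLOW"))
    return findings


def overall(findings: list[tuple[str, str]]) -> str:
    """Assumed aggregation: the most severe level that fired, GREEN if none."""
    worst = "GREEN"
    for _, level in findings:
        if LEVEL_ORDER[level] > LEVEL_ORDER[worst]:
            worst = level
    return worst


big_asm = evaluate(FunctionFacts(asm_instructions=120, source_lines=10))
short = evaluate(FunctionFacts(asm_instructions=30, source_lines=4))
clean = evaluate(FunctionFacts(asm_instructions=30, source_lines=40))
```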

Objective Verifier

The reversal loop also runs a conservative structural verifier after the LLM checker passes. It only blocks acceptance on strong mismatches such as:

  • call-count gaps between candidate code and decompile/ASM
  • control-flow gaps where the candidate is clearly missing branches or loops

This is intentionally narrower than full equivalence checking, but it catches obvious false positives before they are recorded as successful reversals.

This matters in practice because an LLM checker can still false-positive on code that looks plausible while missing real branch or call structure from the binary.
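As a rough illustration of what a strong-mismatch check might look like, the sketch below counts call sites and branch keywords with regular expressions and flags only large gaps. It is not the verifier's implementation: the regex counting is a crude stand-in (it also counts a function definition's own name as a call), and `strong_mismatch`, `min_ratio`, and `min_reference` are hypothetical names and thresholds.

```python
import re

CALL_RE = re.compile(r"\b([A-Za-z_]\w*(?:::\w+)*)\s*\(")
BRANCH_RE = re.compile(r"\b(if|for|while|switch)\b")
KEYWORDS = {"if", "for", "while", "switch", "return", "sizeof", "catch"}


def count_calls(code: str) -> int:
    """Count identifier-followed-by-paren sites, skipping control keywords."""
    return sum(1 for m in CALL_RE.finditer(code) if m.group(1) not in KEYWORDS)


def count_branches(code: str) -> int:
    return len(BRANCH_RE.findall(code))


def strong_mismatch(candidate: str, decompile: str,
                    min_ratio: float = 0.5, min_reference: int = 4) -> list[str]:
    """Return reasons to block acceptance; an empty list means no strong mismatch."""
    reasons = []
    for label, counter in (("call-count", count_calls), ("control-flow", count_branches)):
        ref = counter(decompile)
        got = counter(candidate)
        # Only flag when the reference is non-trivial and the candidate falls far short.
        if ref >= min_reference and got < ref * min_ratio:
            reasons.append(f"{label} gap: candidate {got} vs decompile {ref}")
    return reasons


decompile = "if (a) { f1(); } if (b) { f2(); } while (c) { f3(); } for (;;) { f4(); }"
candidate = "f1();"
reasons = strong_mismatch(candidate, decompile)
```

Requiring a minimum reference count keeps small functions, where one missing call is a large ratio, from being blocked on noise.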

Safety

  • No auto-commit: re-agent writes code but never commits or pushes
  • Bounded retries: Hard cap on fix loop iterations (default: 4)
  • Deterministic logs: Every LLM call logged with timestamps
  • No destructive ops: Never deletes files, modifies git, or runs builds
  • Session isolation: Progress appended, never overwritten
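One way to get append-only progress tracking of the kind described under session isolation is a JSON Lines file that is opened only in append mode. The sketch below assumes that layout purely for illustration. re-agent's actual progress file format is not described here, and `record_progress` and `load_progress` are hypothetical helpers.

```python
import json
import tempfile
from pathlib import Path


def record_progress(path: Path, address: int, status: str) -> None:
    """Append one entry; earlier entries are never rewritten."""
    entry = {"address": hex(address), "status": status}
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry) + "\n")


def load_progress(path: Path) -> dict[str, str]:
    """Replay the log; the latest entry per address wins."""
    latest: dict[str, str] = {}
    if not path.exists():
        return latest
    for line in path.read_text(encoding="utf-8").splitlines():
        if line.strip():
            entry = json.loads(line)
            latest[entry["address"]] = entry["status"]
    return latest


with tempfile.TemporaryDirectory() as tmp:
    progress_file = Path(tmp) / "progress.jsonl"
    record_progress(progress_file, 0x6F86A0, "failed")
    record_progress(progress_file, 0x6F86A0, "passed")
    history_len = len(progress_file.read_text(encoding="utf-8").splitlines())
    state = load_progress(progress_file)
```

Both attempts stay in the file as history, while the replayed state reflects only the most recent outcome.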

Development

git clone https://github.com/dryxio/auto-re-agent.git
cd auto-re-agent
python -m venv .venv && source .venv/bin/activate
pip install -e ".[dev]"

pytest tests/
ruff check src/
mypy src/re_agent/

License

MIT
