A Graph-Based ROP Gadget Finder for every architecture

Environment
- Console
License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
- Python :: 3
Topic
- Security

Reason this release was yanked:

hella worst model

Project description

LCSAJdump

Universal Graph-Based Framework for Automated Gadget Discovery

LCSAJdump is a static analysis framework for discovering Return-Oriented Programming (ROP) and Jump-Oriented Programming (JOP) gadgets. Unlike traditional scanners, LCSAJdump is architecture-agnostic and uses a graph-based approach to find gadgets that common linear tools miss.

Why LCSAJdump?

Common ROP scanners use a linear "sliding-window" approach over the binary's executable bytes. This method systematically fails to identify Shadow Gadgets: execution chains that traverse non-contiguous memory blocks connected by unconditional jumps or conditional branches.

LCSAJdump overcomes this limitation by reconstructing the Control-Flow Graph (CFG) through LCSAJ (Linear Code Sequence and Jump) analysis. By modeling the binary as a directed graph of basic blocks, the tool identifies:

- Contiguous Gadgets: Standard linear sequences terminating in a control-flow transfer.
- Shadow Gadgets (Non-Contiguous): Complex chains that bypass "bad bytes" (e.g., null bytes) by utilizing instructions that would otherwise be unreachable via linear scanning.
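
The difference between the two gadget classes can be illustrated with a toy model. The sketch below is not LCSAJdump's code: the block addresses, mnemonics, and the helper names `linear_gadgets` and `graph_gadgets` are made up for illustration, and it follows jump edges backward by only one hop.

```python
# Toy "binary": each block is a run of instructions ending in a control transfer.
# Addresses and mnemonics are invented for this example.
BLOCKS = {
    0x1000: ["pop rdi", "jmp 0x2000"],   # ends in an unconditional jump
    0x1800: ["xor eax, eax", "ret"],     # contiguous gadget
    0x2000: ["pop rsi", "ret"],          # contiguous gadget and jump target
}
# Directed edges: block -> blocks reached through its final jump.
EDGES = {0x1000: [0x2000]}


def linear_gadgets(blocks):
    """Chains a contiguous scan would report: single blocks ending in `ret`."""
    return [[addr] for addr, insns in blocks.items() if insns[-1] == "ret"]


def graph_gadgets(blocks, edges):
    """Contiguous chains plus one-hop chains that reach a `ret` block via a jump."""
    preds = {}
    for src, dsts in edges.items():
        for dst in dsts:
            preds.setdefault(dst, []).append(src)
    tails = linear_gadgets(blocks)
    shadows = [[p, tail[0]] for tail in tails for p in preds.get(tail[0], [])]
    return tails + shadows


for chain in graph_gadgets(BLOCKS, EDGES):
    print(" -> ".join(hex(a) for a in chain))
```

Here the chain `0x1000 -> 0x2000` (pop rdi, then pop rsi, then ret) only appears once the jump edge is modeled, which is the kind of non-contiguous chain the text calls a Shadow Gadget.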

Key Features

- Multi-Architecture Support: Native support for RISC-V (64GC), x86-64, and ARM64, easily extendable to other architectures via modular profiles.
- Graph-Based Analysis: Segments the .text section into LCSAJ basic blocks and reconstructs flow relationships using NetworkX.
- Rainbow BFS Algorithm: Custom backward Breadth-First Search starting from control-flow sinks. It includes a qualitative feedback loop (penalty_threshold, darkness) and hard-cap limits that prevent state explosion and keep analysis fast even on dense CISC binaries.
- Lazy Graph Build: Graph construction retains only nodes reachable from gadget tails within --depth hops, drastically reducing memory and build time on large binaries (e.g., libc) while producing identical results.
- Two-Stage Ranking Engine: Combines a fast heuristic baseline (Bayesian-optimized via Optuna) with a LightGBM gradient-boosted model that refines gadget ranking using structural and semantic features.
- Zero-Overhead Inference: The ML model is integrated natively and runs by default, processing tens of thousands of nodes in seconds. It acts as a filter that rejects noisy jumps and returns clean, highly controllable gadget chains. The model is hosted on Hugging Face.
- Pruning Parameters: Configurable "Darkness" factor to balance analysis depth and performance, preventing infinite loops in cyclic graphs.
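
To make the roles of the depth limit and the "darkness" cap concrete, here is a minimal sketch of a backward breadth-first search over a predecessor map. It is an assumption-laden illustration, not the Rainbow BFS implementation: the function name `backward_bfs`, the plain-dict graph (instead of NetworkX), and the exact way the visit cap is applied are all hypothetical.

```python
from collections import deque


def backward_bfs(preds, sinks, max_depth, max_visits):
    """Enumerate backward chains that end in a sink block.

    preds      -- dict: node -> list of predecessor nodes
    sinks      -- nodes ending in a control-flow transfer (gadget tails)
    max_depth  -- maximum number of blocks in one chain (like --depth)
    max_visits -- how many times a node may be queued (like --darkness)
    """
    visits = {}
    chains = []
    queue = deque([s] for s in sinks)
    while queue:
        path = queue.popleft()      # path[0] is the entry block, path[-1] the sink
        chains.append(path)
        if len(path) >= max_depth:
            continue                # depth limit reached, do not extend further
        for p in preds.get(path[0], []):
            if visits.get(p, 0) >= max_visits:
                continue            # visit cap: this is what stops cycles
            visits[p] = visits.get(p, 0) + 1
            queue.append([p] + path)
    return chains


# A cyclic graph: a -> ret, b -> a, a -> b
preds = {"ret": ["a"], "a": ["b"], "b": ["a"]}
for chain in backward_bfs(preds, ["ret"], max_depth=10, max_visits=2):
    print(chain)
```

Even though `a` and `b` form a cycle, the search terminates because each node can only be queued `max_visits` times. Raising the cap yields longer chains at the cost of more work, which matches the trade-off described for the darkness parameter.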

Supported Architectures

LCSAJdump is designed to be universal (see Accuracy & Benchmarks). Currently supported:

- RISC-V 64-bit (RV64GC): Full support for compressed 16-bit instructions.
- x86-64: Handles variable-length overlapping instructions. Safely navigates dense graphs without memory explosion.
- ARM64: Handles 32-bit instructions and deeply filters out bloated gadgets via strict heuristic penalties.
- Other Architectures: Can be easily implemented by defining new profiles in config.py.

Installation

Via Pip (Recommended)

pip install lcsajdump

From Source (Development)

git clone https://github.com/Chris1sFlaggin/LCSAJdump.git
cd LCSAJdump
pip install -r requirements.txt

Usage

LCSAJdump offers a powerful CLI for precise binary analysis:

Standard Analysis (architecture auto-detected from the ELF header):

python LCSAJdump.py <path_to_binary>

Advanced Analysis (Specifying Architecture and Output File):

lcsajdump -a riscv64 -d 15 -k 10 -l 20 -o gadgets.txt <path_to_binary>

Export as JSON with bad-char filter:

lcsajdump -a x86_64 -d 20 -k 5 -b "000a0d" --json -o gadgets.json <path_to_binary>

Note: Use -o after --json to save JSON to file. Without --json, -o saves plain text.

Analyze all executable sections:

lcsajdump --all-exec -d 25 -k 10 -l 30 <path_to_binary>

Force strictly algorithmic ranking (bypass ML):

lcsajdump --algo <path_to_binary>

CLI Options

Flag	Type	Default	Description
`-a, --arch`	TEXT	`auto`	Target architecture (`auto`, `riscv64`, `x86_64`, `arm64`). Auto-detected from ELF header.
`-d, --depth`	INTEGER	`20`	Max search depth in LCSAJ blocks. Controls chain length.
`-k, --darkness`	INTEGER	`5`	Pruning threshold — max visits per node. Higher = more gadgets, slower scan.
`-l, --limit`	INTEGER	`10`	Max number of gadgets to display in the output.
`-s, --min-score`	INTEGER	`0`	Minimum heuristic score for a gadget to appear in results.
`-i, --instructions`	INTEGER	`15`	Max number of instructions contained in a single LCSAJ node.
`-v, --verbose`	FLAG	—	Enable verbose output for detailed per-gadget results.
`-o, --output`	PATH	—	Write output to file. Plain text by default; use with `--json` for JSON output.
`-b, --bad-chars`	TEXT	—	Hex bytes to filter from gadget addresses (e.g. `"000a0d"`).
`--json`	FLAG	—	Output gadgets as structured JSON. Combine with `-o` to save to file.
`--all-exec`	FLAG	—	Analyze all executable sections, not just `.text`.
`-al, --algo`	FLAG	—	Use strictly the algorithmic ranking (bypass ML).
`--version`	FLAG	—	Show the installed version and exit.
`--help`	FLAG	—	Show help message and exit.
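
As an illustration of what a bad-character filter such as `-b "000a0d"` has to do, the sketch below parses the hex string and rejects addresses whose bytes contain any listed value. This is not taken from LCSAJdump's source: the function names are hypothetical, and the choice to check only the significant (non-padding) bytes of an address is an assumption made so that a `00` filter does not reject every 64-bit address.

```python
def parse_bad_chars(hex_string):
    """Turn a string like "000a0d" into the set of byte values {0x00, 0x0a, 0x0d}."""
    return set(bytes.fromhex(hex_string))


def address_is_clean(address, bad_bytes):
    """Return True if no significant byte of the address is a bad byte.

    Assumption: only the minimal little-endian encoding is checked, so the
    zero padding of a 64-bit pointer is ignored.
    """
    width = max(1, (address.bit_length() + 7) // 8)
    packed = address.to_bytes(width, "little")
    return not any(b in bad_bytes for b in packed)


bad = parse_bad_chars("000a0d")
for addr in (0x401A0D, 0x401A2B, 0x400012):
    print(hex(addr), "clean" if address_is_clean(addr, bad) else "filtered")
```

With these inputs, `0x401a0d` is filtered because it contains `0x0d`, `0x400012` is filtered because of its embedded null byte, and `0x401a2b` passes.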

Accuracy & Benchmarks

LCSAJdump is backed by a rigorous, incrementally validated test suite located in the benchmarkTests/ directory.

Through 15 major iterations of semantic feature engineering, the hybrid model has learned to discriminate gadgets based on actual memory side-effects (extracted via angr symbolic execution) rather than purely syntactic heuristics.

When evaluated on monolithic, real-world executables like libc.so.6, the engine achieves NDCG@1 = 0.8549 and NDCG@5 = 0.8374 (1.0 would be a perfect ranking). The two-stage engine prioritizes clean stack-popping sequences and ret2csu-like calls, while heavily penalizing crash-prone fixed-offset jumps that mislead traditional static scanners.
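
For readers unfamiliar with the metric, NDCG@k compares the discounted gain of the produced ranking with that of the ideal ordering of the same items. The sketch below uses the linear-gain form; the project's evaluation scripts may use a different gain function or relevance labels, so treat the helper names and the example labels as assumptions.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain of the first k items (linear gain)."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances, k):
    """DCG of the given order divided by the DCG of the ideal order."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


# Relevance labels of gadgets in the order the ranker returned them (made up).
ranked = [2, 3, 0, 1]
print(round(ndcg_at_k(ranked, 1), 4))
print(round(ndcg_at_k(ranked, 4), 4))
```

A score of 1.0 means the ranker placed the gadgets in exactly the ideal order for the first k positions. NDCG@1 in particular only asks how good the top-ranked gadget is relative to the best available one.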

Developer & ML Guide

The repository is structured to support both end-users and ML researchers.

Production Engine: The core CLI seamlessly integrates the inference engine using models hosted on Hugging Face, requiring no manual model loading.
ML Pipeline: The lcsajdump/ml_study/ directory contains the complete pipeline used to train the models:
- build_dataset.py: Extracts structural and semantic features from a corpus of CTF binaries.
- train_model.py: Trains the LightGBM LambdaRank model and outputs the .pkl models.
- kfold_cv.py: Validates the dataset using K-Fold Cross Validation.
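
As a reminder of what K-Fold Cross Validation does, the following sketch splits sample indices into k folds and yields each fold once as the held-out set. It is a generic illustration and not the contents of kfold_cv.py; the function name `kfold_indices` is hypothetical, and a ranking dataset would likely need to keep all gadgets from one binary in the same fold, which this sketch does not do.

```python
def kfold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of k contiguous folds."""
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)   # spread the remainder
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


for train, test in kfold_indices(10, 3):
    print(len(train), test)
```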

Contributing (Open for Forks!)

The framework is open to new implementations. To add a new architecture:

Fork the repository.
Open lcsajdump/core/config.py.
Add a new profile to the ARCH_PROFILES dictionary, defining jump mnemonics, return mnemonics, and registers for the desired architecture (e.g., x86_64).
Submit a Pull Request.
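
A new profile might look roughly like the sketch below. The key names (`jump_mnemonics`, `ret_mnemonics`, `registers`), the `mips32` entry, and the `classify` helper are all hypothetical; check the existing entries in lcsajdump/core/config.py for the real schema before copying anything from here.

```python
# Stand-in for the ARCH_PROFILES dictionary in lcsajdump/core/config.py.
ARCH_PROFILES = {}

# Hypothetical profile for a MIPS-like target (key names are assumptions).
ARCH_PROFILES["mips32"] = {
    "jump_mnemonics": {"j", "jal", "b", "beq", "bne"},
    "ret_mnemonics": {"jr"},               # `jr $ra` is the usual return idiom
    "registers": {"a0", "a1", "a2", "a3", "v0", "v1", "ra", "sp"},
}


def classify(arch, mnemonic):
    """Classify a mnemonic as 'return', 'jump', or 'other' for the given profile."""
    profile = ARCH_PROFILES[arch]
    if mnemonic in profile["ret_mnemonics"]:
        return "return"
    if mnemonic in profile["jump_mnemonics"]:
        return "jump"
    return "other"


print(classify("mips32", "jr"), classify("mips32", "beq"), classify("mips32", "addu"))
```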

License

This project is released under the MIT license. See the LICENSE file for details.

Project Link

Visit the project web page: LCSAJdump web page

Made by Chris1sflaggin as a research project for a Bachelor's thesis.

Release history

This version

2.1.0 yanked

Apr 28, 2026

Reason this release was yanked:

hella worst model

2.0.4

May 5, 2026

2.0.3

May 5, 2026

2.0.2

May 1, 2026

2.0.1

Apr 27, 2026

2.0.0

Apr 21, 2026

1.2.3.1

Mar 28, 2026

1.2.3

Mar 26, 2026

1.2.2

Mar 23, 2026

1.2.0

Mar 20, 2026

1.1.3.1

Mar 18, 2026

1.1.3

Mar 17, 2026

1.1.2.3

Mar 11, 2026

1.1.2.2 yanked

Mar 10, 2026

Reason this release was yanked:

bug in uncontrolled argument

1.1.2.1

Mar 9, 2026

1.1.2

Mar 8, 2026

1.1.1

Feb 26, 2026

1.1.1b0 pre-release

Feb 22, 2026

1.1.0

Feb 16, 2026

1.0.3

Feb 16, 2026

1.0.1

Feb 15, 2026

1.0.0

Feb 12, 2026

Download files

Source Distribution

lcsajdump-2.1.0.tar.gz (69.1 kB)

Uploaded Apr 28, 2026 Source

Built Distribution

lcsajdump-2.1.0-py3-none-any.whl (73.8 kB)

Uploaded Apr 28, 2026 Python 3

File details

Details for the file lcsajdump-2.1.0.tar.gz.

File metadata

Download URL: lcsajdump-2.1.0.tar.gz
Upload date: Apr 28, 2026
Size: 69.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for lcsajdump-2.1.0.tar.gz
Algorithm	Hash digest
SHA256	`5112aac72de1f46d6263f57991866be2f431422328376930612acd785d2b5260`
MD5	`3b988a4eaa3e39bab6bbec11a0f1d2b2`
BLAKE2b-256	`ce7ba11ba1f268ca05651fbb79fe6558294d09ccdcad9d1af15a06c13d3ba73f`
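
A downloaded archive can be checked against the SHA256 digest listed above with Python's standard `hashlib` module. The sketch below is a generic verification helper, not part of LCSAJdump; the function names and the local file path in the usage line are assumptions.

```python
import hashlib

# SHA256 digest of lcsajdump-2.1.0.tar.gz as listed in the table above.
EXPECTED_SHA256 = "5112aac72de1f46d6263f57991866be2f431422328376930612acd785d2b5260"


def sha256_of(path, chunk_size=1 << 16):
    """Hash a file in chunks so large archives are not loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA256 digest equals the expected hex string."""
    return sha256_of(path) == expected.lower()
```

Usage, assuming the archive was saved in the current directory: `matches("lcsajdump-2.1.0.tar.gz")` returns `True` only if the file is byte-for-byte identical to the one the digest was computed from.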

File details

Details for the file lcsajdump-2.1.0-py3-none-any.whl.

File metadata

Download URL: lcsajdump-2.1.0-py3-none-any.whl
Upload date: Apr 28, 2026
Size: 73.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for lcsajdump-2.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`e3392408ae4dd2e52776e7f198c073b60e89b6d00469a648786e9e247995a8fe`
MD5	`f2350869d517eeaaa9e98875b1147d99`
BLAKE2b-256	`be4389f9fbeb87efd27f460186972e63596dd7d1b104ad8aa74f2b1106a57c5c`
