A fast and accurate disassembler

Datalog Disassembly

Ddisasm is a fast disassembler that is accurate enough for the resulting assembly code to be reassembled. It is implemented in Datalog, a declarative logic programming language, and uses Souffle to compile its disassembly rules and heuristics. The disassembler first parses ELF/PE file information and decodes a superset of possible instructions to create an initial set of Datalog facts. These facts are then analyzed to identify code locations, perform symbolization, and determine function boundaries. The result of this analysis, a refined set of Datalog facts, is translated to the GTIRB intermediate representation for binary analysis and reverse engineering. The GTIRB pretty printer can then print the GTIRB as reassemblable assembly code.

Binary Support

Binary formats:

  • ELF (Linux)
  • PE (Windows)

Instruction Set Architectures (ISAs):

  • x86_32
  • x86_64
  • ARM32
  • ARM64
  • MIPS32

Getting Started

You can run a prebuilt version of Ddisasm using Docker:

docker pull grammatech/ddisasm:latest

Ddisasm can be used to disassemble a binary into the GTIRB representation. We can try it with one of the examples included in the repository.

First, start the Ddisasm docker container:

docker run -v $PWD/examples:/examples -it grammatech/ddisasm:latest

Within the Docker container, let us build one of the examples:

apt update && apt install gcc -y
cd /examples/ex1
gcc ex.c -o ex

Now we can proceed to disassemble the binary:

ddisasm ex --ir ex.gtirb

Once you have the GTIRB representation, you can make programmatic changes to the binary using GTIRB or gtirb-rewriting.

Then, you can use gtirb-pprinter (included in the Docker image) to produce a new version of the binary:

gtirb-pprinter ex.gtirb -b ex_rewritten

Internally, gtirb-pprinter generates an assembly file and invokes the compiler/assembler (e.g. gcc) to produce the new binary. It takes care of generating all the necessary command-line options, including compilation options, library dependencies, and version linker scripts.
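
The disassemble and rebuild steps above can be chained in a small wrapper. The following is a minimal sketch, assuming `ddisasm` and `gtirb-pprinter` are on the `PATH` (as they are inside the Docker image). The names `run` and `rewrite_binary` and the `DRY_RUN` variable are hypothetical helpers introduced here for illustration; only the `--ir` and `-b` options shown in this README are used.

```shell
#!/bin/sh
# Hypothetical helper, not part of Ddisasm: disassemble a binary to GTIRB,
# then ask gtirb-pprinter to rebuild it.
# Set DRY_RUN=1 to print the commands instead of running them.

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

rewrite_binary() {
    input="$1"            # path to the original binary, e.g. ex
    output="$2"           # path for the rebuilt binary, e.g. ex_rewritten
    ir="${input}.gtirb"   # GTIRB file is written next to the input

    # Stop after the first step if disassembly fails.
    run ddisasm "$input" --ir "$ir" &&
        run gtirb-pprinter "$ir" -b "$output"
}
```

With `DRY_RUN=1`, calling `rewrite_binary ex ex_rewritten` prints the two commands from this section (`ddisasm ex --ir ex.gtirb`, then `gtirb-pprinter ex.gtirb -b ex_rewritten`) without executing them, which is a cheap way to inspect what the wrapper would do.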

You can also use gtirb-pprinter to generate an assembly listing for manual modification:

gtirb-pprinter ex.gtirb --asm ex.s

This assembly listing can then be manually recompiled:

gcc -nostartfiles ex.s -o ex_rewritten
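
After rebuilding, a quick sanity check is to run the original and the rewritten binary with the same arguments and compare what they do. The sketch below is one simple way to do that; the function name `same_behavior` is hypothetical, and it only compares standard output and exit status, so it is a smoke test rather than a proof of equivalence.

```shell
#!/bin/sh
# Hypothetical check, not part of Ddisasm: run two programs with the same
# arguments and compare their standard output and exit status.

same_behavior() {
    original="$1"
    rewritten="$2"
    shift 2

    status_a=0
    status_b=0
    # Capture stdout; record a non-zero exit status without aborting.
    out_a="$("$original" "$@" 2>/dev/null)" || status_a=$?
    out_b="$("$rewritten" "$@" 2>/dev/null)" || status_b=$?

    [ "$out_a" = "$out_b" ] && [ "$status_a" -eq "$status_b" ]
}
```

For the example built earlier, this could be invoked as `same_behavior ./ex ./ex_rewritten && echo "outputs match"`.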

Please take a look at our documentation for additional information.

Documentation

Contributing

See CONTRIBUTING.md

External Contributors

  • Programming Language Group, The University of Sydney: Initial support for ARM64.
  • GitHub user gogo2464: Documentation refactoring.

Cite

  1. Datalog Disassembly
@inproceedings{flores-montoya2020,
    author = {Antonio Flores-Montoya and Eric Schulte},
    title = {Datalog Disassembly},
    booktitle = {29th USENIX Security Symposium (USENIX Security 20)},
    year = {2020},
    isbn = {978-1-939133-17-5},
    pages = {1075--1092},
    url = {https://www.usenix.org/conference/usenixsecurity20/presentation/flores-montoya},
    publisher = {USENIX Association},
    month = aug,
}
  2. GTIRB
@misc{schulte2020gtirb,
    title={GTIRB: Intermediate Representation for Binaries},
    author={Eric Schulte and Jonathan Dorn and Antonio Flores-Montoya and Aaron Ballman and Tom Johnson},
    year={2020},
    eprint={1907.02859},
    archivePrefix={arXiv},
    primaryClass={cs.PL}
}
