A Python interface to libVEX and VEX IR

PyVEX

PyVEX provides Python bindings for the VEX IR.

Project Links

Project repository: https://github.com/angr/pyvex

Documentation: https://api.angr.io/projects/pyvex/en/latest/

Installing PyVEX

PyVEX can be pip-installed:

pip install pyvex

Using PyVEX

import pyvex
import archinfo

# translate an AMD64 basic block (of nops) at 0x400400 into VEX
irsb = pyvex.lift(b"\x90\x90\x90\x90\x90", 0x400400, archinfo.ArchAMD64())

# pretty-print the basic block
irsb.pp()

# this is the IR Expression of the jump target of the unconditional exit at the end of the basic block
print(irsb.next)

# you can also pretty-print it
irsb.next.pp()

# this is the type of the unconditional exit (e.g., a call, ret, syscall, etc.)
print(irsb.jumpkind)

# iterate through each statement and print all the statements
for stmt in irsb.statements:
    stmt.pp()

# pretty-print the IR expression representing the data written by every store statement, and the *type* of that expression
for stmt in irsb.statements:
    if isinstance(stmt, pyvex.IRStmt.Store):
        print("Data:", end="")
        stmt.data.pp()
        print("")

        print("Type:", end="")
        # result_type is a method; it takes the block's type environment
        print(stmt.data.result_type(irsb.tyenv))
        print("")

# pretty-print the condition and jump target of every conditional exit from the basic block
for stmt in irsb.statements:
    if isinstance(stmt, pyvex.IRStmt.Exit):
        print("Condition:", end="")
        stmt.guard.pp()
        print("")

        print("Target:", end="")
        stmt.dst.pp()
        print("")

# these are the types of every temp in the IRSB
print(irsb.tyenv.types)

# here is one way to get the type of temp 0
print(irsb.tyenv.types[0])

Keep in mind that this is a syntactic representation of a basic block. That is, it'll tell you what the block means, but you don't have any context to say, for example, what actual data is written by a store instruction.

VEX Intermediate Representation

To deal with widely diverse architectures, it is useful to carry out analyses on an intermediate representation. An IR abstracts away several differences between architectures, allowing a single analysis to be run on all of them:

  • Register names. The quantity and names of registers differ between architectures, but modern CPU designs hold to a common theme: each CPU contains several general purpose registers, a register to hold the stack pointer, a set of registers to store condition flags, and so forth. The IR provides a consistent, abstracted interface to registers on different platforms. Specifically, VEX models the registers as a separate memory space, with integer offsets (e.g., AMD64's rax is stored starting at offset 16 in this memory space).
  • Memory access. Different architectures access memory in different ways. For example, ARM can access memory in both little-endian and big-endian modes. The IR must abstract away these differences.
  • Memory segmentation. Some architectures, such as x86, support memory segmentation through the use of special segment registers. The IR understands such memory access mechanisms.
  • Instruction side-effects. Most instructions have side-effects. For example, most operations in Thumb mode on ARM update the condition flags, and stack push/pop instructions update the stack pointer. Tracking these side-effects in an ad hoc manner in the analysis would be crazy, so the IR makes these effects explicit.
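
The "registers as a separate memory space" idea from the first bullet can be sketched in a few lines of plain Python. This is a toy model, not PyVEX's API: the `RegisterFile` class, its `put`/`get` methods, and the little-endian byte order are assumptions made for illustration. Only the offset 16 for AMD64's rax comes from the text above.

```python
class RegisterFile:
    """Toy register file: a flat byte array addressed by integer offsets.

    Hypothetical helper for illustration; not part of PyVEX.
    """

    def __init__(self, size=1024):
        self._bytes = bytearray(size)

    def put(self, offset, value, width):
        # Write `width` bytes at `offset` (little-endian is assumed here).
        self._bytes[offset:offset + width] = value.to_bytes(width, "little")

    def get(self, offset, width):
        # Read `width` bytes starting at `offset`.
        return int.from_bytes(self._bytes[offset:offset + width], "little")


RAX_OFFSET = 16  # offset of AMD64's rax, as stated in the text

regs = RegisterFile()
regs.put(RAX_OFFSET, 0x1122334455667788, 8)

rax = regs.get(RAX_OFFSET, 8)  # the full 64-bit register
eax = regs.get(RAX_OFFSET, 4)  # the low 32 bits, read from the same offset
print(hex(rax), hex(eax))
```

Because a register is just a range of bytes, a narrower read at the same offset behaves like a sub-register access in this sketch, and an analysis never needs to know the register's name.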

There are lots of choices for an IR. We use VEX, since the lifting of binary code into VEX is quite well supported. VEX is an architecture-agnostic, side-effect-free representation of a number of target machine languages. It abstracts machine code into a representation designed to make program analysis easier. This representation has five main classes of objects:

  • Expressions. IR Expressions represent a calculated or constant value. This includes memory loads, register reads, and results of arithmetic operations.
  • Operations. IR Operations describe a modification of IR Expressions. This includes integer arithmetic, floating-point arithmetic, bit operations, and so forth. An IR Operation applied to IR Expressions yields an IR Expression as a result.
  • Temporary variables. VEX uses temporary variables as internal registers: IR Expressions are stored in temporary variables between uses. The content of a temporary variable can be retrieved using an IR Expression. These temporaries are numbered, starting at t0, and are strongly typed (e.g., "64-bit integer" or "32-bit float").
  • Statements. IR Statements model changes in the state of the target machine, such as the effect of memory stores and register writes. IR Statements use IR Expressions for values they may need. For example, a memory store IR Statement uses an IR Expression for the target address of the write, and another IR Expression for the content.
  • Blocks. An IR Block is a collection of IR Statements, representing an extended basic block (termed "IR Super Block" or "IRSB") in the target architecture. A block can have several exits. For conditional exits from the middle of a basic block, a special Exit IR Statement is used. An IR Expression is used to represent the target of the unconditional exit at the end of the block.

VEX IR is actually quite well documented in the libvex_ir.h file (https://github.com/angr/vex/blob/dev/pub/libvex_ir.h) in the VEX repository. For the lazy, we'll detail some parts of VEX that you'll likely interact with fairly frequently. To begin with, here are some IR Expressions:

| IR Expression | Evaluated Value | VEX Output Example |
| --- | --- | --- |
| Constant | A constant value. | 0x4:I32 |
| Read Temp | The value stored in a VEX temporary variable. | RdTmp(t10) |
| Get Register | The value stored in a register. | GET:I32(16) |
| Load Memory | The value stored at a memory address, with the address specified by another IR Expression. | LDle:I32 / LDbe:I64 |
| Operation | A result of a specified IR Operation, applied to specified IR Expression arguments. | Add32 |
| If-Then-Else | If a given IR Expression evaluates to 0, return one IR Expression. Otherwise, return another. | ITE |
| Helper Function | VEX uses C helper functions for certain operations, such as computing the conditional flags registers of certain architectures. These functions return IR Expressions. | function_name() |
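
To make the "evaluated value" column concrete, here is a minimal sketch of an evaluator for a few of these expression kinds. It is a toy model in plain Python, not PyVEX's classes: the dataclasses, the `evaluate` function, and the dict-based temporaries and registers are all hypothetical. Expressions are also nested here for brevity, which is a simplification made for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Const:
    value: int


@dataclass(frozen=True)
class RdTmp:
    tmp: int


@dataclass(frozen=True)
class Get:
    offset: int


@dataclass(frozen=True)
class Binop:
    op: str
    left: object
    right: object


@dataclass(frozen=True)
class ITE:
    cond: object
    iftrue: object
    iffalse: object


MASK32 = 0xFFFFFFFF
OPS = {
    "Add32": lambda a, b: (a + b) & MASK32,
    "Sub32": lambda a, b: (a - b) & MASK32,
}


def evaluate(expr, temps, regs):
    """Return the integer value of a toy expression."""
    if isinstance(expr, Const):
        return expr.value
    if isinstance(expr, RdTmp):
        return temps[expr.tmp]
    if isinstance(expr, Get):
        return regs[expr.offset]
    if isinstance(expr, Binop):
        left = evaluate(expr.left, temps, regs)
        right = evaluate(expr.right, temps, regs)
        return OPS[expr.op](left, right)
    if isinstance(expr, ITE):
        # A condition of 0 selects one branch; anything else selects the other.
        cond = evaluate(expr.cond, temps, regs)
        return evaluate(expr.iffalse if cond == 0 else expr.iftrue, temps, regs)
    raise TypeError(f"unsupported expression: {expr!r}")


temps = {10: 5}    # t10 holds 5
regs = {16: 0x20}  # the register at offset 16 holds 0x20

total = evaluate(Binop("Add32", Get(16), RdTmp(10)), temps, regs)
picked = evaluate(ITE(Const(0), Const(1), Const(2)), temps, regs)
wrapped = evaluate(Binop("Sub32", Const(0), Const(1)), temps, regs)
print(hex(total), picked, hex(wrapped))
```

Note that expressions only compute values. Nothing in `evaluate` changes `temps` or `regs`; changing state is the job of IR Statements.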

These expressions are then, in turn, used in IR Statements. Here are some common ones:

| IR Statement | Meaning | VEX Output Example |
| --- | --- | --- |
| Write Temp | Set a VEX temporary variable to the value of the given IR Expression. | WrTmp(t1) = (IR Expression) |
| Put Register | Update a register with the value of the given IR Expression. | PUT(16) = (IR Expression) |
| Store Memory | Update a location in memory, given as an IR Expression, with a value, also given as an IR Expression. | STle(0x1000) = (IR Expression) |
| Exit | A conditional exit from a basic block, with the jump target specified by an IR Expression. The condition is specified by an IR Expression. | if (condition) goto (Boring) 0x4000A00:I32 |
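
The interplay between conditional Exit statements and the unconditional exit at the end of a block can be sketched as follows. This is a toy model in plain Python, assuming each exit is encoded as a hypothetical `(guard_tmp, target)` pair; the `successor` function and the addresses other than 0x4000A00 are made up for illustration and are not part of PyVEX.

```python
def successor(exits, default_target, temps):
    """Pick the address control flow continues at after a block.

    exits: (guard_tmp, target) pairs in statement order.
    default_target: target of the unconditional exit at the end of the block.
    temps: values of the temporaries, keyed by temp number.
    """
    for guard_tmp, target in exits:
        if temps[guard_tmp]:
            # The first exit whose condition holds leaves the block early.
            return target
    # No conditional exit fired, so the block ends at its unconditional exit.
    return default_target


exits = [(5, 0x4000A00), (7, 0x4000B00)]

taken = successor(exits, 0x4000C00, {5: 0, 7: 1})
fallthrough = successor(exits, 0x4000C00, {5: 0, 7: 0})
print(hex(taken), hex(fallthrough))
```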

An example of an IR translation, on ARM, is shown below. In the example, the subtraction operation is translated into a single IR block comprising 5 IR Statements, each of which contains at least one IR Expression (in real life, an IR block would typically cover more than one instruction). Register names are translated into numerical indices given to the GET Expression and PUT Statement. The astute reader will observe that the actual subtraction is modeled by the first 4 IR Statements of the block, and the incrementing of the program counter to point to the next instruction (which, in this case, is located at 0x59FC8) is modeled by the last statement.

The following ARM instruction:

subs R2, R2, #8

Becomes this VEX IR:

t0 = GET:I32(16)
t1 = 0x8:I32
t3 = Sub32(t0,t1)
PUT(16) = t3
PUT(68) = 0x59FC8:I32
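
To see what these five statements do to machine state, the listing can be fed to a tiny interpreter. This is a rough sketch in plain Python, not how PyVEX or VEX executes IR: `run_block` and `operand` are hypothetical helpers that only understand the statement forms appearing in this listing, and the initial register values (R2 = 0x100, PC = 0x59FC4) are assumed for the example.

```python
import re

BLOCK = """\
t0 = GET:I32(16)
t1 = 0x8:I32
t3 = Sub32(t0,t1)
PUT(16) = t3
PUT(68) = 0x59FC8:I32
"""

MASK32 = 0xFFFFFFFF


def operand(text, temps):
    """Evaluate a temp name like 't0' or a constant like '0x8:I32'."""
    text = text.strip()
    if text.startswith("t"):
        return temps[text]
    return int(text.split(":")[0], 16)


def run_block(block, regs):
    """Execute the listing against `regs`, a dict keyed by register offset."""
    temps = {}
    for line in block.splitlines():
        dest, rhs = (part.strip() for part in line.split(" = ", 1))
        get = re.fullmatch(r"GET:I32\((\d+)\)", rhs)
        sub = re.fullmatch(r"Sub32\((\w+),(\w+)\)", rhs)
        if get:
            value = regs[int(get.group(1))]  # register read
        elif sub:
            lhs_val = operand(sub.group(1), temps)
            rhs_val = operand(sub.group(2), temps)
            value = (lhs_val - rhs_val) & MASK32  # 32-bit subtraction
        else:
            value = operand(rhs, temps)  # plain temp or constant
        put = re.fullmatch(r"PUT\((\d+)\)", dest)
        if put:
            regs[int(put.group(1))] = value  # register write
        else:
            temps[dest] = value  # temp write
    return regs


# Offset 16 is the register read and written by the listing (R2);
# offset 68 receives the address of the next instruction.
regs = run_block(BLOCK, {16: 0x100, 68: 0x59FC4})
print({offset: hex(value) for offset, value in regs.items()})
```

With R2 starting at 0x100, the block leaves 0xF8 at offset 16 and 0x59FC8 at offset 68, matching the description of the first four statements and the last one.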

Cool stuff!

Citing PyVEX

If you use PyVEX in an academic work, please cite the paper for which it was developed:

@inproceedings{shoshitaishvili2015firmalice,
  title={Firmalice - Automatic Detection of Authentication Bypass Vulnerabilities in Binary Firmware},
  author={Shoshitaishvili, Yan and Wang, Ruoyu and Hauser, Christophe and Kruegel, Christopher and Vigna, Giovanni},
  booktitle={NDSS},
  year={2015}
}
