The VarBERT API for renaming variables in decompiled code.

VarBERT API

The VarBERT API is a Python library for accessing and using the latest models from the S&P 2024 paper "'Len or index or count, anything but v1': Predicting Variable Names in Decompilation Output with Transfer Learning", which introduces VarBERT. VarBERT is a BERT-based model that predicts variable names for decompiled code. To train new models and understand the pipeline, see the VarBERT paper repo. Specialized models exist for IDA Pro and Ghidra, but they can be used with any decompiler.

The main focus of this project is to provide library and CLI access to VarBERT models, but it is also designed to be used directly inside a decompiler through the DAILA project. DAILA comes with the VarBERT API bundled, so you do not need to install VarBERT separately if you are using DAILA.

Install

pip3 install varbert && varbert --download-models

This installs the VarBERT API library and downloads the models, which are stored inside the VarBERT package. You can optionally provide a decompiler name to --download-models to download only the models for that decompiler.
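If you script the installation, for example in a provisioning step, the download command can be assembled programmatically. The following is a minimal sketch and not part of the VarBERT API: build_download_command and download_models are hypothetical helpers, and the sketch assumes the decompiler name is passed as the value that follows --download-models, with names such as "ida" or "ghidra".

```python
import shutil
import subprocess
from typing import List, Optional


def build_download_command(decompiler: Optional[str] = None) -> List[str]:
    """Return the argv for downloading VarBERT models.

    Hypothetical helper. Assumes the optional decompiler name is given as
    the value that follows --download-models (for example "ida").
    """
    cmd = ["varbert", "--download-models"]
    if decompiler:
        cmd.append(decompiler)
    return cmd


def download_models(decompiler: Optional[str] = None) -> bool:
    """Run the download if the varbert CLI is on PATH and report success."""
    if shutil.which("varbert") is None:
        return False  # varbert is not installed in this environment
    completed = subprocess.run(build_download_command(decompiler))
    return completed.returncode == 0
```

Calling download_models("ghidra") would then fetch only the Ghidra models, provided the name is one the CLI accepts.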

Usage

The VarBERT API can be used in three ways:

  • From the CLI, directly on decompiled text (without an attached decompiler)
  • As a scripting library
  • As a decompiler plugin (using DAILA)

Command Line (without running a decompiler)

VarBERT performs better when it is hooked up directly to a decompiler, because it can use additional semantic information that the decompiler has about the code. However, VarBERT can also run without a decompiler, operating only on decompiled text supplied from the command line.

Running the following will cause VarBERT to read a function from standard input and output the function with predicted variable names to standard out:

varbert --predict --decompiler ida

You can select a different decompiler to use the models trained on that decompiler's output. If you do not specify a decompiler, the default is IDA Pro. For example, you can omit the decompiler option entirely:

printf '__int64 sub_400664(char *a1,char *a2)\n {}\n' | varbert -p
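The same pipe can be driven from Python when the CLI is installed but you do not want to import the library. This is a sketch under stated assumptions: build_predict_command and predict_via_cli are hypothetical helpers, and only the --predict and --decompiler flags shown above are used. When no decompiler is given, the flag is omitted so the CLI falls back to its default (IDA Pro).

```python
import shutil
import subprocess
from typing import List, Optional


def build_predict_command(decompiler: Optional[str] = None,
                          executable: str = "varbert") -> List[str]:
    """Build the argv for a prediction run; omit --decompiler to use the default."""
    cmd = [executable, "--predict"]
    if decompiler is not None:
        cmd += ["--decompiler", decompiler]
    return cmd


def predict_via_cli(code: str,
                    decompiler: Optional[str] = None,
                    executable: str = "varbert") -> Optional[str]:
    """Send decompiled text on stdin and return what the CLI writes to stdout.

    Returns None when the executable cannot be found on PATH.
    """
    if shutil.which(executable) is None:
        return None
    result = subprocess.run(
        build_predict_command(decompiler, executable),
        input=code,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout
```

For example, predict_via_cli("__int64 sub_400664(char *a1,char *a2)\n {}", decompiler="ida") mirrors the shell command above.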

Scripting

Without Decompiler

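from varbert import VariableRenamingAPI

# Select the IDA-trained models without attaching to a decompiler.
api = VariableRenamingAPI(decompiler_name="ida", use_decompiler=False)
# Predict names from raw decompiled text; the call returns the names and the rewritten code.
new_names, new_code = api.predict_variable_names(decompilation_text="__int64 sub_400664(char *a1,char *a2)\n {}", use_decompiler=False)
print(new_code)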

You can also find more examples in the tests.py file.
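When several functions have been exported as text, the single call above extends naturally to a loop. The sketch below assumes only what the example above shows: predict_variable_names accepts decompilation_text and use_decompiler and returns a (new_names, new_code) pair. rename_snippets is a hypothetical helper, and the api object is passed in so the helper itself does not depend on how the API was constructed.

```python
from typing import Any, Dict, Tuple


def rename_snippets(api: Any, snippets: Dict[str, str]) -> Dict[str, Tuple[Any, str]]:
    """Run name prediction on each labeled snippet of decompiled text.

    Returns a mapping from label to the (new_names, new_code) pair
    produced for that snippet.
    """
    results = {}
    for label, text in snippets.items():
        new_names, new_code = api.predict_variable_names(
            decompilation_text=text, use_decompiler=False
        )
        results[label] = (new_names, new_code)
    return results


# Intended use with the real library (requires varbert and its models):
#   from varbert import VariableRenamingAPI
#   api = VariableRenamingAPI(decompiler_name="ida", use_decompiler=False)
#   out = rename_snippets(api, {"sub_400664": "__int64 sub_400664(char *a1,char *a2)\n {}"})
#   print(out["sub_400664"][1])
```

Constructing the API once and reusing it across snippets avoids repeating whatever setup the constructor performs.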

Inside Decompiler

You can use VarBERT as a scripting library inside your decompiler, utilizing LibBS.

from varbert import VariableRenamingAPI
from libbs.api import DecompilerInterface
dec = DecompilerInterface()
api = VariableRenamingAPI(decompiler_interface=dec)
for func_addr in dec.functions:
    new_names, new_code = api.predict_variable_names(function=dec.functions[func_addr])
    print(new_names)
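On a large binary you may want to collect the predictions rather than print them, and keep going if an individual function cannot be processed. The following is a sketch, not the library's own batching logic: predict_all is a hypothetical helper, and it assumes that a failed prediction surfaces as a Python exception, which the source does not confirm. It relies only on dec.functions being iterable by address and indexable, as in the loop above.

```python
from typing import Any, Dict, List, Tuple


def predict_all(dec: Any, api: Any) -> Tuple[Dict[Any, Any], List[Any]]:
    """Predict names for every function and record which addresses failed.

    Returns (renamed, failed): renamed maps a function address to its
    predicted names, and failed lists addresses where prediction raised.
    """
    renamed: Dict[Any, Any] = {}
    failed: List[Any] = []
    for func_addr in dec.functions:
        try:
            new_names, _new_code = api.predict_variable_names(
                function=dec.functions[func_addr]
            )
        except Exception:
            # Assumption: errors are reported by raising; skip this function.
            failed.append(func_addr)
            continue
        renamed[func_addr] = new_names
    return renamed, failed
```

Inside a decompiler script, this would be called as renamed, failed = predict_all(dec, api) with the dec and api objects created above.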

As a Decompiler Plugin

If you would like to use VarBERT as a decompiler plugin, you can use DAILA. You should follow the instructions on the DAILA repo to install DAILA, but it's generally as simple as:

pip3 install dailalib && daila --install

You can find a demo of VarBERT running inside DAILA below:

VarBERT Demo

Citing

If you use VarBERT in your research, please cite our paper:

TODO
