
A coverage-guided fuzzer for Python and Python extensions.


Atheris: A Coverage-Guided, Native Python Fuzzer

Atheris is a coverage-guided Python fuzzing engine. It supports fuzzing of Python code as well as native extensions written for CPython. Atheris is based on libFuzzer. When fuzzing native code, Atheris can be used in combination with Address Sanitizer or Undefined Behavior Sanitizer to catch extra bugs.

Installation Instructions

Atheris supports Linux (32- and 64-bit) and Mac OS X, with Python versions 3.6-3.9.

You can install prebuilt versions of Atheris with pip:

pip3 install atheris

These wheels come with a built-in libFuzzer, which is fine for fuzzing Python code. If you plan to fuzz native extensions, you may need to build from source to ensure the libFuzzer version in Atheris matches your Clang version.

Building from Source

Atheris relies on libFuzzer, which is distributed with Clang. If you have a sufficiently new version of clang on your path, installation from source is as simple as:

# Build latest release from source
pip3 install --no-binary atheris atheris
# Build development code from source
git clone https://github.com/google/atheris.git
cd atheris
pip3 install .

If you don't have clang installed or it's too old, you'll need to download and build the latest version of LLVM. Follow the instructions in Installing Against New LLVM below.

Mac

Apple Clang doesn't come with libFuzzer, so you'll need to install a new version of LLVM from head. Follow the instructions in Installing Against New LLVM below.

Installing Against New LLVM

# Building LLVM
git clone https://github.com/llvm/llvm-project.git
cd llvm-project
mkdir build
cd build
cmake -DLLVM_ENABLE_PROJECTS='clang;compiler-rt' -G "Unix Makefiles" ../llvm
make -j 10  # This step is very slow

# Installing Atheris
CLANG_BIN="$(pwd)/bin/clang" pip3 install <whatever>

Using Atheris

Example

import atheris

with atheris.instrument_imports():
  import some_library
  import sys

def TestOneInput(data):
  some_library.parse(data)

atheris.Setup(sys.argv, TestOneInput)
atheris.Fuzz()

When fuzzing Python, Atheris will report a failure if the Python code under test throws an uncaught exception.
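Because any uncaught exception counts as a failure, a harness usually catches the exceptions that the target is expected to raise on malformed input, so that only unexpected ones are reported. The following is a minimal sketch of that pattern. The standard-library json module stands in for some_library, and the list of "expected" exceptions is an assumption chosen for this illustration.

```python
import json


def TestOneInput(data: bytes) -> None:
    """Fuzz entry point; json stands in for the library under test."""
    try:
        json.loads(data)
    except (json.JSONDecodeError, UnicodeDecodeError):
        # Malformed input is an expected outcome here, not a finding.
        return
    # Any other exception propagates and would be reported as a failure.
```

Keeping the except clause narrow matters: catching Exception broadly would hide exactly the bugs the fuzzer is meant to find.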

Python coverage

Atheris collects Python coverage information by instrumenting bytecode. There are 3 options for adding this instrumentation to the bytecode:

  • You can instrument the libraries you import:

    with atheris.instrument_imports():
      import foo
      from bar import baz
    

    This will cause instrumentation to be added to foo and bar, as well as any libraries they import.

  • Or, you can instrument individual functions:

    @atheris.instrument_func
    def my_function(foo, bar):
      print("instrumented")
    
  • Or finally, you can instrument everything:

    atheris.instrument_all()
    

    Put this right before atheris.Setup(). This will find every Python function currently loaded in the interpreter, and instrument it. This might take a while.

Atheris can additionally instrument regular expression checks, e.g. re.search. To enable this feature, add the following to your script before your code calls re.compile:

atheris.enabled_hooks.add("RegEx")

Internally this will import the re module and instrument the necessary functions. This is currently an experimental feature.

Why am I getting "No interesting inputs were found"?

You might see this error:

ERROR: no interesting inputs were found. Is the code instrumented for coverage? Exiting.

You'll get this error if the first 2 calls to TestOneInput didn't produce any coverage events. Even if you have instrumented some Python code, this can happen if the instrumentation isn't reached in those first 2 calls (for example, because you have a nontrivial TestOneInput). You can resolve this by adding an atheris.instrument_func decorator to TestOneInput, using atheris.instrument_all(), or moving your TestOneInput function into an instrumented module.

Visualizing Python code coverage

Examining which lines are executed is helpful for understanding the effectiveness of your fuzzer. Atheris is compatible with coverage.py: you can run your fuzzer using the coverage.py module as you would for any other Python program. Here's an example:

python3 -m coverage run your_fuzzer.py -atheris_runs=10000  # Times to run
python3 -m coverage html
(cd htmlcov && python3 -m http.server 8000)

Coverage reports are only generated when your fuzzer exits gracefully. This happens if:

  • you specify -atheris_runs=<number>, and that many runs have elapsed.
  • your fuzzer exits by Python exception.
  • your fuzzer exits by sys.exit().

No coverage report will be generated if your fuzzer exits due to a crash in native code, or due to libFuzzer's -runs flag (use -atheris_runs). If your fuzzer exits via other methods, such as SIGINT (Ctrl+C), Atheris will attempt to generate a report but may be unable to (depending on your code). For consistent reports, we recommend always using -atheris_runs=<number>.

If you'd like to examine coverage when running with your corpus, you can do that with the following command:

python3 -m coverage run your_fuzzer.py corpus_dir/* -atheris_runs=$(ls corpus_dir | wc -l)

This will cause Atheris to run on each file in corpus_dir, then exit. Importantly, if you leave off the -atheris_runs=$(ls corpus_dir | wc -l) argument, no coverage report will be generated.
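If you prefer to build that command from Python rather than the shell, the sketch below counts the corpus files and sets -atheris_runs to match. The helper name coverage_command is hypothetical, and it assumes every regular file in the directory is a corpus entry (subdirectories are skipped).

```python
import os
import sys


def coverage_command(fuzzer_script: str, corpus_dir: str) -> list:
    """Build a coverage.py invocation that replays each file in corpus_dir."""
    files = sorted(
        os.path.join(corpus_dir, name)
        for name in os.listdir(corpus_dir)
        if os.path.isfile(os.path.join(corpus_dir, name))
    )
    # One run per corpus file, so the fuzzer exits gracefully afterwards.
    return [sys.executable, "-m", "coverage", "run", fuzzer_script,
            *files, f"-atheris_runs={len(files)}"]
```

The resulting list can be passed to subprocess.run(..., check=True) from the standard library.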

Using coverage.py will significantly slow down your fuzzer, so only use it for visualizing coverage; don't use it all the time.

Fuzzing Native Extensions

In order for fuzzing native extensions to be effective, your native extensions must be instrumented. See Native Extension Fuzzing for instructions.

Integration with OSS-Fuzz

Atheris is fully supported by OSS-Fuzz, Google's continuous fuzzing service for open source projects. For integrating with OSS-Fuzz, please see https://google.github.io/oss-fuzz/getting-started/new-project-guide/python-lang.

API

The atheris module provides three key functions: instrument_imports(), Setup() and Fuzz().

In your source file, import all libraries you wish to fuzz inside a with atheris.instrument_imports(): block, like this:

# library_a will not get instrumented
import library_a

with atheris.instrument_imports():
    # library_b will get instrumented
    import library_b

Generally, it's best to import atheris first and then import all other libraries inside of a with atheris.instrument_imports() block.

Next, define a fuzzer entry point function and pass it to atheris.Setup() along with the fuzzer's arguments (typically sys.argv). Finally, call atheris.Fuzz() to start fuzzing. You must call atheris.Setup() before atheris.Fuzz().

instrument_imports(include=[], exclude=[])

  • include: A list of fully-qualified module names that shall be instrumented.
  • exclude: A list of fully-qualified module names that shall NOT be instrumented.

This should be used together with a with-statement. All modules imported in said statement will be instrumented. However, because Python imports all modules only once, this cannot be used to instrument any previously imported module, including modules required by Atheris. To add coverage to those modules, use instrument_all() instead.

A full list of modules that are already imported, and therefore cannot be instrumented this way, can be retrieved as follows:

import sys
import atheris
print(sys.modules.keys())
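To check a handful of specific modules instead of reading the whole list, a small helper along these lines may be convenient. The name already_imported is hypothetical; the sketch relies only on sys.modules, which is where Python caches imported modules.

```python
import sys


def already_imported(module_names):
    """Return the names that are already present in sys.modules.

    Importing these inside instrument_imports() reuses the cached module,
    so they would not be instrumented there.
    """
    return [name for name in module_names if name in sys.modules]
```

Calling it right after import atheris, and before the with block, shows which of your target modules would need instrument_all() instead.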

instrument_func(func)

  • func: The function to instrument.

This will instrument the specified Python function and then return func. This is typically used as a decorator, but can also be called directly on an existing function. Note that func is instrumented in-place, so this will affect all call points of the function.

This cannot be called on a bound method - call it on the unbound version.
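The difference between the bound and unbound forms can be seen in plain Python. In this sketch the Parser class is hypothetical, and the Atheris call is left as a comment so the snippet runs without Atheris installed.

```python
class Parser:
    """Hypothetical class whose method we would like to instrument."""

    def parse(self, data: bytes) -> int:
        return len(data)


parser = Parser()
bound = parser.parse      # bound method: carries `parser` as self
unbound = Parser.parse    # plain function stored on the class

# A bound method wraps the very same function object.
assert bound.__func__ is unbound

# With Atheris available, pass the unbound version:
#   atheris.instrument_func(Parser.parse)
```

If only a bound method is at hand, its __func__ attribute gives the underlying function.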

instrument_all()

This will scan over all objects in the interpreter and call instrument_func on every Python function. This works even on core Python interpreter functions, something which instrument_imports cannot do.

This function is experimental.

Setup(args, test_one_input, internal_libfuzzer=None)

  • args: A list of strings: the process arguments to pass to the fuzzer, typically sys.argv. This argument list may be modified in-place, to remove arguments consumed by the fuzzer. See the libFuzzer docs for a list of such options.
  • test_one_input: Your fuzzer's entry point. Must take a single bytes argument. This will be repeatedly invoked with a single bytes container.
  • internal_libfuzzer: Indicates whether libFuzzer will be provided by Atheris or by an external library (see native_extension_fuzzing.md). If unspecified, Atheris will determine this automatically. If fuzzing pure Python, leave this unspecified or set it to True.

Fuzz()

This starts the fuzzer. You must have called Setup() before calling this function. This function does not return.

In many cases Setup() and Fuzz() could be combined into a single function, but they are separated because you may want the fuzzer to consume the command-line arguments it handles before passing any remaining arguments to another setup function.

FuzzedDataProvider

Often, a bytes object is not convenient input to your code being fuzzed. Similar to libFuzzer, we provide a FuzzedDataProvider to translate these bytes into other input forms.

You can construct the FuzzedDataProvider with:

fdp = atheris.FuzzedDataProvider(input_bytes)

The FuzzedDataProvider then supports the following functions:

def ConsumeBytes(count: int)

Consume count bytes.

def ConsumeUnicode(count: int)

Consume unicode characters. Might contain surrogate pair characters, which according to the specification are invalid in this situation. However, many core software tools (e.g. Windows file paths) support them, so other software often needs to too.

def ConsumeUnicodeNoSurrogates(count: int)

Consume unicode characters, but never generate surrogate pair characters.
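The practical difference shows up as soon as the string is encoded. A lone surrogate is a valid Python str element but cannot be encoded as strict UTF-8, as this standard-library sketch illustrates. The helper name to_utf8_or_none is hypothetical.

```python
def to_utf8_or_none(text: str):
    """Encode text as UTF-8, returning None if it contains lone surrogates."""
    try:
        return text.encode("utf-8")
    except UnicodeEncodeError:
        return None


lone_surrogate = "\ud800"  # the kind of character ConsumeUnicode may produce

print(to_utf8_or_none("abc"))           # b'abc'
print(to_utf8_or_none(lone_surrogate))  # None
```

If the code under test is only ever given well-formed text, ConsumeUnicodeNoSurrogates avoids reporting such encoding errors as findings.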

def ConsumeString(count: int)

Alias for ConsumeBytes in Python 2, or ConsumeUnicode in Python 3.

def ConsumeInt(bytes: int)

Consume a signed integer of the specified size in bytes (when written in two's complement notation).

def ConsumeUInt(bytes: int)

Consume an unsigned integer of the specified size in bytes.
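To make the size parameter concrete, the sketch below computes the value range of an integer that fits in a given number of bytes, signed (two's complement) versus unsigned. It uses only built-in Python and illustrates the ranges, not how Atheris maps input bytes to a value. The helper name int_range is hypothetical.

```python
def int_range(size_bytes: int, signed: bool):
    """Return the (min, max) values representable in size_bytes bytes."""
    bits = 8 * size_bytes
    if signed:
        return -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return 0, (1 << bits) - 1


raw = b"\xff\xff"
print(int.from_bytes(raw, "big", signed=True))   # -1
print(int.from_bytes(raw, "big", signed=False))  # 65535
print(int_range(2, signed=True))                 # (-32768, 32767)
print(int_range(2, signed=False))                # (0, 65535)
```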

def ConsumeIntInRange(min: int, max: int)

Consume an integer in the range [min, max].

def ConsumeIntList(count: int, bytes: int)

Consume a list of count integers of size bytes.

def ConsumeIntListInRange(count: int, min: int, max: int)

Consume a list of count integers in the range [min, max].

def ConsumeFloat()

Consume an arbitrary floating-point value. Might produce weird values like NaN and Inf.

def ConsumeRegularFloat()

Consume an arbitrary numeric floating-point value; never produces a special type like NaN or Inf.
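The choice between the two matters for property-style checks, because NaN never compares equal to itself. The sketch below shows a round-trip check that handles this case explicitly; struct is used only as a stand-in for the code under test, and the function names are hypothetical.

```python
import math
import struct


def roundtrip(value: float) -> float:
    """Pack a float into 8 bytes and unpack it again."""
    return struct.unpack("<d", struct.pack("<d", value))[0]


def check_roundtrip(value: float) -> bool:
    """Round-trip property that tolerates NaN inputs."""
    result = roundtrip(value)
    if math.isnan(value):
        # NaN != NaN, so compare by "is also NaN" instead of ==.
        return math.isnan(result)
    return result == value
```

A naive assert roundtrip(x) == x would fail on NaN input even though nothing is wrong, which is one reason to use ConsumeRegularFloat when special values are out of scope.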

def ConsumeProbability()

Consume a floating-point value in the range [0, 1].

def ConsumeFloatInRange(min: float, max: float)

Consume a floating-point value in the range [min, max].

def ConsumeFloatList(count: int)

Consume a list of count arbitrary floating-point values. Might produce weird values like NaN and Inf.

def ConsumeRegularFloatList(count: int)

Consume a list of count arbitrary numeric floating-point values; never produces special types like NaN or Inf.

def ConsumeProbabilityList(count: int)

Consume a list of count floats in the range [0, 1].

def ConsumeFloatListInRange(count: int, min: float, max: float)

Consume a list of count floats in the range [min, max].

def PickValueInList(l: list)

Given a list, pick a random value.

def ConsumeBool()

Consume either True or False.
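Putting several of these calls together, a harness can turn the raw bytes into typed arguments. The following is a minimal sketch: handle_request is a hypothetical function under test, and the import is guarded only so the snippet loads where Atheris is not installed (a real fuzzer would import atheris unconditionally).

```python
try:
    import atheris
except ImportError:  # assumption: guard for environments without Atheris
    atheris = None


def handle_request(port: int, verbose: bool, path: str) -> str:
    """Hypothetical function under test."""
    return f"{path}:{port}:{verbose}"


def TestOneInput(data: bytes) -> None:
    fdp = atheris.FuzzedDataProvider(data)
    port = fdp.ConsumeIntInRange(1, 65535)       # integer in [1, 65535]
    verbose = fdp.ConsumeBool()                  # True or False
    path = fdp.ConsumeUnicodeNoSurrogates(32)    # up to 32 characters
    handle_request(port, verbose, path)
```

Consuming the fixed-size values first and the variable-length string last is a common arrangement, since it keeps the earlier fields stable as the input grows or shrinks.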
