A coverage-guided fuzzer for Python and Python extensions.

Atheris: A Coverage-Guided, Native Python Fuzzer

Atheris is a coverage-guided Python fuzzing engine. It supports fuzzing Python code as well as native extensions written for CPython. Atheris is based on libFuzzer. When fuzzing native code, Atheris can be combined with Address Sanitizer or Undefined Behavior Sanitizer to catch additional bugs.

Installation Instructions

Atheris supports Linux (32- and 64-bit) and Mac OS X, Python versions 3.6-3.9.

You can install prebuilt versions of Atheris with pip:

pip3 install atheris

These wheels come with a built-in libFuzzer, which is fine for fuzzing Python code. If you plan to fuzz native extensions, you may need to build from source to ensure the libFuzzer version in Atheris matches your Clang version.

Building from Source

Atheris relies on libFuzzer, which is distributed with Clang. If you have a sufficiently new version of clang on your path, installation from source is as simple as:

# Build latest release from source
pip3 install --no-binary atheris atheris
# Build development code from source
git clone https://github.com/google/atheris.git
cd atheris
pip3 install .

If you don't have clang installed or it's too old, you'll need to download and build the latest version of LLVM. Follow the instructions in Installing Against New LLVM below.

Mac

Apple Clang doesn't come with libFuzzer, so you'll need to install a new version of LLVM from head. Follow the instructions in Installing Against New LLVM below.

Installing Against New LLVM

# Building LLVM
git clone https://github.com/llvm/llvm-project.git
cd llvm-project
mkdir build
cd build
cmake -DLLVM_ENABLE_PROJECTS='clang;compiler-rt' -G "Unix Makefiles" ../llvm
make -j 10  # This step is very slow

# Installing Atheris
CLANG_BIN="$(pwd)/bin/clang" pip3 install <whatever>

Using Atheris

Example:

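import atheris

# Modules imported inside this block are instrumented for coverage.
with atheris.instrument_imports():
  import some_library
  import sys

def TestOneInput(data):
  # Entry point: called repeatedly by the fuzzer with a bytes object.
  some_library.parse(data)

# Setup() must be called before Fuzz().
atheris.Setup(sys.argv, TestOneInput)
atheris.Fuzz()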

When fuzzing Python, Atheris will report a failure if the Python code under test raises an uncaught exception.
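Because any uncaught exception counts as a failure, a harness will often catch the exceptions that the library documents for malformed input and let everything else propagate. The following is a minimal sketch of that pattern, using the standard-library json module as a stand-in for some_library. The main() wrapper is only an illustrative convention and is not part of Atheris.

```python
import json


def TestOneInput(data: bytes) -> None:
    """Fuzz entry point: parse the raw bytes and ignore expected rejections."""
    try:
        json.loads(data)
    except ValueError:
        # json reports malformed input with JSONDecodeError or
        # UnicodeDecodeError, both subclasses of ValueError. Rejecting bad
        # input is correct behaviour, so it is not treated as a finding.
        return
    # Any other exception type propagates, and Atheris would report it.


def main() -> None:
    # Imported here so the harness function can be unit-tested on a machine
    # without Atheris installed.
    import sys
    import atheris

    atheris.Setup(sys.argv, TestOneInput)
    atheris.Fuzz()
```

Calling main() at the bottom of the script would start fuzzing. Keeping the except clause narrow matters: a bare except would hide exactly the failures the fuzzer is looking for.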

Python coverage

Atheris collects Python coverage information by instrumenting bytecode. There are 3 options for adding this instrumentation to the bytecode:

  • You can instrument the libraries you import:
    with atheris.instrument_imports():
      import foo
      from bar import baz
    
    This will cause instrumentation to be added to foo and bar, as well as any libraries they import.
  • Or, you can instrument individual functions:
    @atheris.instrument_func
    def my_function(foo, bar):
      print("instrumented")
    
  • Or finally, you can instrument everything:
    atheris.instrument_all()
    
    Put this right before atheris.Setup(). This will find every Python function currently loaded in the interpreter, and instrument it. This might take a while.

Why am I getting "No interesting inputs were found"?

You might see this error:

ERROR: no interesting inputs were found. Is the code instrumented for coverage? Exiting.

You'll get this error if the first 2 calls to TestOneInput didn't produce any coverage events. Even if you have instrumented some Python code, this can happen if the instrumentation isn't reached in those first 2 calls (for example, because you have a nontrivial TestOneInput). You can resolve this by adding an atheris.instrument_func decorator to TestOneInput, using atheris.instrument_all(), or moving your TestOneInput function into an instrumented module.
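As a rough sketch of the first remedy, the entry point below has branches of its own, so instrumenting it gives the fuzzer coverage feedback from the very first call. The header check and the RuntimeError are invented for illustration. The example calls instrument_func as a plain function inside a hypothetical main() rather than as a decorator, which relies on instrument_func returning the function it was given.

```python
def TestOneInput(data: bytes) -> None:
    """A nontrivial entry point with branches of its own."""
    if len(data) < 4:
        return
    if data[:2] == b"PK" and data[2] < 8:
        # Hypothetical bug, reachable only through the branches above.
        raise RuntimeError("unexpected header variant")


def main() -> None:
    # Imported lazily; requires Atheris to be installed.
    import sys
    import atheris

    # instrument_func instruments the function in place and returns it.
    atheris.Setup(sys.argv, atheris.instrument_func(TestOneInput))
    atheris.Fuzz()
```

Writing @atheris.instrument_func above the definition of TestOneInput should be equivalent; the function-call form is used here only so the module can be imported without Atheris.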

Fuzzing Native Extensions

In order for fuzzing native extensions to be effective, your native extensions must be instrumented. See Native Extension Fuzzing for instructions.

Integration with OSS-Fuzz

Atheris is fully supported by OSS-Fuzz, Google's continuous fuzzing service for open source projects. For integrating with OSS-Fuzz, please see https://google.github.io/oss-fuzz/getting-started/new-project-guide/python-lang.

API

The atheris module provides three key functions: instrument_imports(), Setup() and Fuzz().

In your source file, import all libraries you wish to fuzz inside a with atheris.instrument_imports(): block, like this:

# library_a will not get instrumented
import library_a

with atheris.instrument_imports():
    # library_b will get instrumented
    import library_b

Generally, it's best to import atheris first and then import all other libraries inside of a with atheris.instrument_imports() block.

Next, define a fuzzer entry point function and pass it to atheris.Setup() along with the fuzzer's arguments (typically sys.argv). Finally, call atheris.Fuzz() to start fuzzing. You must call atheris.Setup() before atheris.Fuzz().

instrument_imports(include=[], exclude=[])

  • include: A list of fully-qualified module names that shall be instrumented.
  • exclude: A list of fully-qualified module names that shall NOT be instrumented.

This should be used together with a with-statement. All modules imported in said statement will be instrumented. However, because Python imports all modules only once, this cannot be used to instrument any previously imported module, including modules required by Atheris. To add coverage to those modules, use instrument_all() instead.

A full list of unsupported modules can be retrieved as follows:

import sys
import atheris
print(sys.modules.keys())
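Building on the same idea, one way to check whether a specific module you intend to fuzz has already been loaded, and so would be skipped by instrument_imports(), is to look it up in sys.modules before entering the with block. The helper name already_imported is hypothetical and not part of Atheris.

```python
import sys


def already_imported(module_names):
    """Return the names from module_names that are already in sys.modules."""
    return [name for name in module_names if name in sys.modules]


import json  # loaded before any instrument_imports() block would run

# json is reported because it is already loaded; the second name is not.
print(already_imported(["json", "module_that_is_not_loaded"]))  # prints ['json']
```

Any module this reports would need instrument_all() or instrument_func to gain coverage.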

instrument_func(func)

  • func: The function to instrument.

This will instrument the specified Python function and then return func. This is typically used as a decorator, but it can also be called directly on an existing function. Note that func is instrumented in-place, so this will affect all call points of the function.

This cannot be called on a bound method - call it on the unbound version.
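To illustrate the difference, the sketch below uses a made-up Parser class. The unbound function is the attribute looked up on the class, and it is also reachable from a bound method through its __func__ attribute.

```python
class Parser:
    """Hypothetical class whose method should be instrumented."""

    def parse(self, data: bytes):
        return data.split(b",")


parser = Parser()

# parser.parse is a bound method; the plain function lives on the class and
# is also available as the bound method's __func__ attribute.
assert parser.parse.__func__ is Parser.parse


def instrument_parser() -> None:
    import atheris  # imported lazily; requires Atheris to be installed

    # Pass the function from the class, not parser.parse.
    atheris.instrument_func(Parser.parse)
```

Because the function is instrumented in place, every instance of Parser would then run the instrumented code.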

instrument_all()

This will scan over all objects in the interpreter and call instrument_func on every Python function. This works even on core Python interpreter functions, something which instrument_imports cannot do.

This function is experimental.

Setup(args, test_one_input, internal_libfuzzer=None)

  • args: A list of strings: the process arguments to pass to the fuzzer, typically sys.argv. This argument list may be modified in-place, to remove arguments consumed by the fuzzer. See the libFuzzer docs for a list of such options.
  • test_one_input: your fuzzer's entry point. Must take a single bytes argument. This will be repeatedly invoked with a single bytes container.
  • internal_libfuzzer: Indicates whether libFuzzer will be provided by Atheris or by an external library (see using_sanitizers.md). If unspecified, Atheris will determine this automatically. If fuzzing pure Python, leave this unspecified or set it to True.

Fuzz()

This starts the fuzzer. You must have called Setup() before calling this function. This function does not return.

In many cases Setup() and Fuzz() could be combined into a single function, but they are separated because you may want the fuzzer to consume the command-line arguments it handles before passing any remaining arguments to another setup function.

FuzzedDataProvider

Often, a bytes object is not convenient input to your code being fuzzed. Similar to libFuzzer, we provide a FuzzedDataProvider to translate these bytes into other input forms. Alternatively, you can use Hypothesis as described below.

You can construct the FuzzedDataProvider with:

fdp = atheris.FuzzedDataProvider(input_bytes)

The FuzzedDataProvider then supports the following functions:

def ConsumeBytes(count: int)

Consume count bytes.

def ConsumeUnicode(count: int)

Consume unicode characters. Might contain surrogate pair characters, which according to the specification are invalid in this situation. However, many core software tools (e.g. Windows file paths) support them, so other software often needs to too.

def ConsumeUnicodeNoSurrogates(count: int)

Consume unicode characters, but never generate surrogate pair characters.
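The practical difference shows up as soon as the code under test encodes its input. The snippet below uses only the Python standard library, not Atheris, to show how a lone surrogate behaves with a strict UTF-8 encoder.

```python
lone_surrogate = "\ud800"  # half of a surrogate pair, on its own

try:
    lone_surrogate.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    # The strict UTF-8 codec rejects surrogate code points.
    strict_ok = False

# The "surrogatepass" error handler lets such code points through.
passed_through = lone_surrogate.encode("utf-8", "surrogatepass")
```

If a UnicodeEncodeError on such input is not a finding you care about, ConsumeUnicodeNoSurrogates avoids generating it in the first place.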

def ConsumeString(count: int)

Alias for ConsumeBytes in Python 2, or ConsumeUnicode in Python 3.

def ConsumeInt(bytes: int)

Consume a signed integer of the specified size (when written in two's complement notation).

def ConsumeUInt(bytes: int)

Consume an unsigned integer of the specified size.
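To make the value ranges concrete, the sketch below computes the smallest and largest values that fit in a given number of bytes, and interprets the same two bytes both ways. It uses int.from_bytes from the standard library rather than Atheris itself; the helper names and the little-endian byte order are arbitrary choices for the example and say nothing about how Atheris reads its input internally.

```python
def signed_range(num_bytes: int):
    """Smallest and largest two's complement values in num_bytes bytes."""
    bits = 8 * num_bytes
    return -(1 << (bits - 1)), (1 << (bits - 1)) - 1


def unsigned_range(num_bytes: int):
    """Smallest and largest unsigned values in num_bytes bytes."""
    return 0, (1 << (8 * num_bytes)) - 1


raw = b"\xff\xff"
as_signed = int.from_bytes(raw, "little", signed=True)     # -1
as_unsigned = int.from_bytes(raw, "little", signed=False)  # 65535
```

For a size of 2 bytes, this means a signed result between -32768 and 32767, and an unsigned result between 0 and 65535.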

def ConsumeIntInRange(min: int, max: int)

Consume an integer in the range [min, max].

def ConsumeIntList(count: int, bytes: int)

Consume a list of count integers of size bytes.

def ConsumeIntListInRange(count: int, min: int, max: int)

Consume a list of count integers in the range [min, max].

def ConsumeFloat()

Consume an arbitrary floating-point value. Might produce weird values like NaN and Inf.

def ConsumeRegularFloat()

Consume an arbitrary numeric floating-point value; never produces a special type like NaN or Inf.
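The choice between the two matters for assertions in the harness. NaN never compares equal to itself, so a naive equality check fails on values from ConsumeFloat even when the code under test is correct. The round-trip function below is a hypothetical stand-in for code under test, and only the standard library is used.

```python
import math
import struct


def roundtrip(value: float) -> float:
    """Hypothetical code under test: pack a double and unpack it again."""
    return struct.unpack("<d", struct.pack("<d", value))[0]


def check_roundtrip(value: float) -> None:
    result = roundtrip(value)
    if math.isnan(value):
        # NaN never compares equal to anything, including itself.
        assert math.isnan(result)
    else:
        assert result == value
```

A harness that only wants ordinary numbers can skip the NaN branch by drawing its input from ConsumeRegularFloat instead.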

def ConsumeProbability()

Consume a floating-point value in the range [0, 1].

def ConsumeFloatInRange(min: float, max: float)

Consume a floating-point value in the range [min, max].

def ConsumeFloatList(count: int)

Consume a list of count arbitrary floating-point values. Might produce weird values like NaN and Inf.

def ConsumeRegularFloatList(count: int)

Consume a list of count arbitrary numeric floating-point values; never produces special types like NaN or Inf.

def ConsumeProbabilityList(count: int)

Consume a list of count floats in the range [0, 1].

def ConsumeFloatListInRange(count: int, min: float, max: float)

Consume a list of count floats in the range [min, max].

def PickValueInList(l: list)

Given a list, pick a random value from it.

def ConsumeBool()

Consume either True or False.
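Several of these methods are typically combined to build structured arguments from one input. The sketch below derives a width, a fill character and a text from the fuzz data and checks a property of the built-in str.center. The function check_center and the chosen limits are illustrative. The early return assumes that a short input can yield fewer characters than requested, which is worth verifying against your Atheris version.

```python
def check_center(fdp) -> None:
    """Derive three arguments from the fuzz data and check str.center."""
    width = fdp.ConsumeIntInRange(0, 80)
    fill = fdp.ConsumeUnicodeNoSurrogates(1)
    text = fdp.ConsumeUnicodeNoSurrogates(40)
    if len(fill) != 1:
        # Assumption: exhausted input may give an empty string here, and
        # str.center requires a fill string of exactly one character.
        return
    padded = text.center(width, fill)
    assert len(padded) == max(width, len(text))
    assert text in padded


def TestOneInput(data: bytes) -> None:
    import atheris  # imported lazily; requires Atheris to be installed

    check_center(atheris.FuzzedDataProvider(data))
```

Keeping the property check in a function that accepts any provider-like object makes it possible to exercise the check in an ordinary unit test with a stub before running the fuzzer.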

Use with Hypothesis

The Hypothesis library for property-based testing is also useful for writing fuzz harnesses. As well as a great library of "strategies" which describe the inputs to generate, using Hypothesis makes it trivial to reproduce failures found by the fuzzer - including automatically finding a minimal reproducing input. For example:

import sys

import atheris
from hypothesis import given, strategies as st

@given(st.from_regex(r"\w+!?", fullmatch=True))
@atheris.instrument_func
def test(string):
  assert string != "bad"

atheris.Setup(sys.argv, test.hypothesis.fuzz_one_input)
atheris.Fuzz()

See the Hypothesis documentation for more details, and for the strategies that describe what you can generate.
