Dynamic taint analysis of Python web applications using monkey patching.

TaintMonkey: Dynamic Taint Analysis of Python Web Applications Using Monkey Patching

TaintMonkey is a dynamic taint analysis library for Python Flask web applications. It uses monkey patching to instrument Flask applications without modifying their source code. TaintMonkey includes a built-in fuzzer that helps developers test endpoints for specific vulnerabilities with randomized inputs. This repository also comes with JungleGym, a dataset of 100+ example Flask applications susceptible to web vulnerabilities from the Common Weakness Enumeration (CWE).

Installation

To install the latest version of TaintMonkey, run the following command.

pip install taintmonkey

Usage

In order to test a Flask endpoint for a particular vulnerability with TaintMonkey, you must first create a plugin.

Step 1: Monkey Patch the Source

Monkey patch your endpoint's source to return a tainted string.

Example for OS Command Injection:

@patch_function("dataset.cwe_78_os_command_injection.insecure_novalidation.app.open_file_command")
def new_open_file_command(file: TaintedStr):
    return TaintedStr(original_function(file))
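
To make the idea behind Step 1 concrete, here is a minimal, self-contained sketch of patching a source so that its return value is marked as tainted. It does not use TaintMonkey: `ToyTaintedStr`, `toy_patch`, and the `toy_app` module are hypothetical stand-ins invented for illustration, and TaintMonkey's own `patch_function` and `TaintedStr` may work differently.

```python
import functools
import types


class ToyTaintedStr(str):
    """Hypothetical stand-in for a tainted string: a str subclass used as a marker."""

    tainted = True


# A throwaway module object standing in for the application under test.
app_module = types.ModuleType("toy_app")


def open_file_command(file):
    # The "source": builds a shell command from user-controlled input.
    return f"cat {file}"


app_module.open_file_command = open_file_command


def toy_patch(module, name):
    """Replace module.<name> with a wrapper that receives the original function."""

    def decorator(wrapper):
        original = getattr(module, name)

        @functools.wraps(original)
        def patched(*args, **kwargs):
            return wrapper(original, *args, **kwargs)

        setattr(module, name, patched)
        return patched

    return decorator


@toy_patch(app_module, "open_file_command")
def tainted_open_file_command(original, file):
    # Call the real function, then wrap its result so later checks can spot it.
    return ToyTaintedStr(original(file))


result = app_module.open_file_command("notes.txt")
print(type(result).__name__, result)
```

The wrapped value still behaves like an ordinary string, so the application keeps working, while any code that inspects its type can tell that it came from an untrusted source.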

Step 2: Create taintmonkey() Fixture

Write a taintmonkey() fixture that passes your app's verifier, sanitizer, and sink functions to the TaintMonkey class. TaintMonkey automatically monkey patches these functions to add taint analysis instrumentation. Next, initialize and set a fuzzer (dictionary, mutation, or grammar-based) for TaintMonkey to use.

Example:

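import pytest

# TaintMonkey and MutationBasedFuzzer come from the taintmonkey package;
# their import statements are omitted in this excerpt.

# Dotted paths of the functions to instrument. This example app registers
# no verifiers or sanitizers, only the os.popen sink.
VERIFIERS = []
SANITIZERS = []
SINKS = ["os.popen"]

@pytest.fixture()
def taintmonkey():
    from dataset.cwe_78_os_command_injection.insecure_novalidation.app import app

    tm = TaintMonkey(app, verifiers=VERIFIERS, sanitizers=SANITIZERS, sinks=SINKS)

    # Seed the mutation-based fuzzer with the plugin's corpus file.
    fuzzer = MutationBasedFuzzer(app, "plugins/cwe_78_os_command_injection/corpus.txt")
    tm.set_fuzzer(fuzzer)

    return tm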
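
Sinks are registered by dotted path (for example "os.popen"). The sketch below shows one way a function named by such a path could be resolved, temporarily replaced with a guarding wrapper, and restored afterwards. It is an illustration only, not TaintMonkey's implementation: `ToyTaintedStr`, `ToyTaintException`, `resolve`, and `guarded_sink` are hypothetical names, and the approach shown handles module-level functions only.

```python
import importlib
import os
from contextlib import contextmanager


class ToyTaintedStr(str):
    """Hypothetical stand-in for a tainted string (not TaintMonkey's class)."""


class ToyTaintException(Exception):
    """Hypothetical stand-in for the exception raised at a sink."""


def resolve(dotted_name):
    """Split 'pkg.mod.func' into (module object, attribute name)."""
    module_name, _, attr = dotted_name.rpartition(".")
    return importlib.import_module(module_name), attr


@contextmanager
def guarded_sink(dotted_name):
    """Temporarily replace a sink with a wrapper that rejects tainted arguments."""
    module, attr = resolve(dotted_name)
    original = getattr(module, attr)

    def wrapper(*args, **kwargs):
        if any(isinstance(a, ToyTaintedStr) for a in (*args, *kwargs.values())):
            raise ToyTaintException(f"tainted data reached {dotted_name}")
        return original(*args, **kwargs)

    setattr(module, attr, wrapper)
    try:
        yield
    finally:
        setattr(module, attr, original)  # always restore the real function


with guarded_sink("os.popen"):
    try:
        # The guard raises before the real os.popen is ever called.
        os.popen(ToyTaintedStr("cat notes.txt; id"))
        blocked = False
    except ToyTaintException:
        blocked = True

print("blocked:", blocked)
```

Restoring the original in a `finally` block keeps one test's instrumentation from leaking into the next.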

Step 3: Write the Fuzzing Harness

The fuzzing harness is how a TaintMonkey plugin uses inputs generated by the fuzzer to test an endpoint for vulnerabilities. Use the fuzzer's context manager to get a TaintClient object and input generator. Then iterate through the generated inputs and make requests to the endpoint using those inputs.

Example:

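from urllib.parse import urlencode

def test_fuzz(taintmonkey):
    fuzzer = taintmonkey.get_fuzzer()

    print()  # start the log on a fresh line when running with pytest -s
    with fuzzer.get_context() as (client, input_generator):
        # zip with range(10) caps the run at 10 generated inputs.
        for counter, data in zip(range(10), input_generator):
            print(f"[Fuzz Attempt {counter}] {data}")

            # URL-encode the payload so special characters survive the query string.
            client.get(f"/insecure?{urlencode({'file': data})}")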
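
The harness pattern (draw a bounded number of inputs from a generator and URL-encode each one into a request) can be tried in isolation. The sketch below is a toy under stated assumptions: `toy_mutations`, its payload list, and the two-entry corpus are made up for illustration and are unrelated to TaintMonkey's `MutationBasedFuzzer`.

```python
import random
from urllib.parse import urlencode


def toy_mutations(corpus, seed=0):
    """Yield an endless stream of corpus entries with an injection suffix appended."""
    rng = random.Random(seed)  # seeded so runs are reproducible
    suffixes = ["; id", "| whoami", "$(id)", "`id`"]
    while True:
        yield rng.choice(corpus) + rng.choice(suffixes)


generator = toy_mutations(["notes.txt", "report.txt"])

# zip with range(10) stops after 10 inputs even though the generator never ends.
urls = [
    f"/insecure?{urlencode({'file': data})}"
    for _, data in zip(range(10), generator)
]

for url in urls:
    print(url)
```

`urlencode` percent-encodes characters such as `;` and `|`, so the payload arrives at the endpoint as a single `file` parameter rather than being mangled by URL parsing.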

Step 4: Run Plugin

Run the plugin to test if your Flask endpoint is vulnerable.

Example:

PYTHONPATH=. pytest -s plugins/cwe_78_os_command_injection/__init__.py

During execution, a TaintException is raised if tainted input reaches a sink without proper verification or sanitization (assuming that verifiers, sanitizers, and sinks have been correctly registered with the TaintMonkey object).
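
The rule above can be modeled in a few lines. This is a toy model under stated assumptions, not TaintMonkey's internals: `ToyTaintedStr`, `ToyTaintException`, and the `as_verifier`, `as_sanitizer`, and `as_sink` wrappers are hypothetical, and the sink only records commands instead of executing them.

```python
import re
import shlex


class ToyTaintedStr(str):
    """Hypothetical tainted string carrying a mutable taint flag."""

    def __new__(cls, value):
        obj = super().__new__(cls, value)
        obj.tainted = True
        return obj


class ToyTaintException(Exception):
    """Hypothetical stand-in for the exception raised at a sink."""


def is_tainted(value):
    return isinstance(value, ToyTaintedStr) and value.tainted


def as_verifier(check):
    """A verifier returns a bool; a passing check clears the taint flag."""
    def wrapper(value):
        ok = check(value)
        if ok and isinstance(value, ToyTaintedStr):
            value.tainted = False
        return ok
    return wrapper


def as_sanitizer(clean):
    """A sanitizer returns a cleaned value; str() converts it to a plain str."""
    def wrapper(value):
        return str(clean(value))
    return wrapper


def as_sink(run):
    """A sink refuses any value that is still tainted."""
    def wrapper(value):
        if is_tainted(value):
            raise ToyTaintException(f"tainted value reached sink: {value!r}")
        return run(value)
    return wrapper


executed = []  # the toy sink records commands instead of running them
is_safe_filename = as_verifier(lambda v: re.fullmatch(r"[\w.-]+", v) is not None)
quote = as_sanitizer(shlex.quote)
run_command = as_sink(executed.append)

payload = ToyTaintedStr("notes.txt; id")
try:
    run_command(payload)  # neither verified nor sanitized
except ToyTaintException as exc:
    print("blocked:", exc)

run_command(quote(payload))  # sanitized first, so the sink accepts it
print(executed)
```

In this model a sink that was never wrapped would accept everything silently, which is why correct registration of verifiers, sanitizers, and sinks matters.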

Development

To install the packages needed to develop TaintMonkey, run the following command.

pip install -r requirements.txt

We use ruff to format our code. Before submitting a pull request, run the formatter with the following command.

python -m ruff format --no-cache

To run the unit test suite, use the following command.

PYTHONPATH=. pytest

To generate a coverage report of TaintMonkey, run the following commands.

PYTHONPATH=. pytest --cov=taintmonkey --cov-report html tests/
cd htmlcov/
python3 -m http.server

The HTML report generated by coverage.py should then be available at http://localhost:8000.

Experiments

To run experiments using the JungleGym dataset, first set up the environment with the following commands.

python3 -m venv venv
source venv/bin/activate
bash experiments/setup.sh

Authors

TaintMonkey was developed by Shayan Chatiwala, Aiden Chen, Carter Chew, Sebastian Mercado, and Aarav Parikh for GSET 2025. Benson Liu advised the project as its mentor, and Anusha Iyer served as its Residential Teaching Assistant (RTA). For questions or additional information, please contact the authors.
