
Dynamic taint analysis of Python web applications using monkey patching.


TaintMonkey: Dynamic Taint Analysis of Python Web Applications Using Monkey Patching



TaintMonkey is a dynamic taint analysis library for Python Flask web applications. It uses monkey patching to instrument Flask applications without modifying their source code. TaintMonkey includes a built-in fuzzer that helps developers test endpoints for specific vulnerabilities with randomized inputs. This repository also comes with JungleGym, a dataset of 100+ example Flask applications susceptible to web vulnerabilities from the Common Weakness Enumeration (CWE).

[Figure: TaintMonkey components]

Installation

To install the latest version of TaintMonkey, you can run the following command.

pip install taintmonkey

Usage

In order to test a Flask endpoint for a particular vulnerability with TaintMonkey, you must first create a plugin.

[Figure: TaintMonkey dataflow]

Step 1: Monkey Patch the Source

Monkey patch your endpoint's source to return a tainted string.

Example for OS Command Injection:

@patch_function("dataset.cwe_78_os_command_injection.insecure_novalidation.app.open_file_command")
def new_open_file_command(file: TaintedStr):
    return TaintedStr(original_function(file))
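
To make the idea behind this step concrete, here is a minimal, self-contained sketch of the general technique using only the standard library. It does not use TaintMonkey's API and is not its implementation: the `TaintedStr` class below is a hypothetical stand-in, and `app_module`, `read_user_input`, and `patch_source` are made-up names for illustration.

```python
import types


class TaintedStr(str):
    """Hypothetical stand-in: a str subclass marking untrusted data."""


# Stand-in for an application module that exposes a source function.
app_module = types.SimpleNamespace()


def read_user_input(raw):
    """Pretend source: returns data that originates from the user."""
    return raw.strip()


app_module.read_user_input = read_user_input


def patch_source(module, name):
    """Replace module.<name> with a wrapper that returns a TaintedStr."""
    original = getattr(module, name)

    def wrapper(*args, **kwargs):
        # Call the unpatched function, then mark its result as tainted.
        return TaintedStr(original(*args, **kwargs))

    setattr(module, name, wrapper)
    return original  # keep a handle so the patch can be undone


original = patch_source(app_module, "read_user_input")
value = app_module.read_user_input("  notes.txt  ")
print(type(value).__name__, repr(value))
```

In this simplified sketch, built-in string methods such as `upper()` return a plain `str`, so the marker is lost as soon as the value is transformed. A practical taint type has to propagate the marker through string operations.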

Step 2: Create taintmonkey() Fixture

Write a taintmonkey() fixture that passes your app's verifier, sanitizer, and sink functions to the TaintMonkey class. TaintMonkey automatically monkey patches these functions to add taint analysis instrumentation. Next, initialize and set a fuzzer (dictionary, mutation, or grammar-based) for TaintMonkey to use.

Example:

VERIFIERS = []
SANITIZERS = []
SINKS = ["os.popen"]

@pytest.fixture()
def taintmonkey():
    from dataset.cwe_78_os_command_injection.insecure_novalidation.app import app

    tm = TaintMonkey(app, verifiers=VERIFIERS, sanitizers=SANITIZERS, sinks=SINKS)

    fuzzer = MutationBasedFuzzer(app, "plugins/cwe_78_os_command_injection/corpus.txt")
    tm.set_fuzzer(fuzzer)

    return tm
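
The roles of sinks and sanitizers can be illustrated with a small standalone sketch. This is only an assumption about how such instrumentation can work in general, not TaintMonkey's actual code: `TaintedStr`, `TaintError`, `instrument_sink`, `instrument_sanitizer`, and `run_command` are hypothetical names, and `run_command` only returns a string instead of executing anything.

```python
import functools
import shlex


class TaintedStr(str):
    """Hypothetical marker type for untrusted data."""


class TaintError(Exception):
    """Hypothetical error raised when tainted data reaches a sink."""


def instrument_sink(func):
    """Wrap a sink so that it refuses tainted arguments."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        for arg in (*args, *kwargs.values()):
            if isinstance(arg, TaintedStr):
                raise TaintError(f"tainted value reached {func.__name__}: {arg!r}")
        return func(*args, **kwargs)

    return wrapper


def instrument_sanitizer(func):
    """Wrap a sanitizer so that its result is a plain, untainted str."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # str() on a str subclass instance yields an exact str object.
        return str(func(*args, **kwargs))

    return wrapper


@instrument_sink
def run_command(cmd):
    # Stand-in sink: a real one would be something like os.popen.
    return f"would run: {cmd}"


quote = instrument_sanitizer(shlex.quote)

user_input = TaintedStr("notes.txt; rm -rf /")
try:
    run_command(user_input)
except TaintError as exc:
    print("blocked:", exc)

print(run_command(quote(user_input)))
```

The sanitizer wrapper matters here because `shlex.quote` returns its argument unchanged when it contains only safe characters, which would otherwise leave the taint marker in place.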

Step 3: Write the Fuzzing Harness

The fuzzing harness is how a TaintMonkey plugin uses inputs generated by the fuzzer to test an endpoint for vulnerabilities. Use the fuzzer's context manager to get a TaintClient object and input generator. Then iterate through the generated inputs and make requests to the endpoint using those inputs.

Example:

def test_fuzz(taintmonkey):
    fuzzer = taintmonkey.get_fuzzer()
    with fuzzer.get_context() as (client, get_input):
        for inp in get_input():
            client.get(f"/insecure?file={inp}")
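
For intuition about what a mutation-based input generator produces, the following standalone sketch derives new inputs from a seed corpus by applying random character-level edits. It is an illustrative assumption, not TaintMonkey's fuzzer: `mutate`, `generate_inputs`, and the corpus contents are hypothetical. The sketch also percent-encodes each input with `urllib.parse.quote`, since generated strings may contain characters such as `&` or `#` that have special meaning in a URL; whether a given harness needs this depends on the client it uses.

```python
import random
from urllib.parse import quote


def mutate(seed, rng):
    """Apply one random character-level mutation to seed."""
    # An empty seed can only grow, so force an insertion in that case.
    op = rng.choice(("insert", "delete", "replace")) if seed else "insert"
    char = chr(rng.randrange(32, 127))  # a printable ASCII character
    if op == "insert":
        pos = rng.randrange(len(seed) + 1)
        return seed[:pos] + char + seed[pos:]
    pos = rng.randrange(len(seed))
    if op == "delete":
        return seed[:pos] + seed[pos + 1:]
    return seed[:pos] + char + seed[pos + 1:]


def generate_inputs(corpus, count, seed=0):
    """Yield `count` mutated inputs; a fixed seed makes runs repeatable."""
    rng = random.Random(seed)
    for _ in range(count):
        yield mutate(rng.choice(corpus), rng)


corpus = ["notes.txt", "a; ls", "$(whoami)"]
inputs = list(generate_inputs(corpus, 5))

for inp in inputs:
    # Build the request path with the input safely percent-encoded.
    path = f"/insecure?file={quote(inp, safe='')}"
    print(path)
```

Seeding the random number generator keeps a failing input reproducible, which makes it easier to rerun the harness against the same request sequence.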

Step 4: Run Plugin

Run the plugin to test if your Flask endpoint is vulnerable.

Example:

PYTHONPATH=. pytest -s plugins/cwe_78_os_command_injection/__init__.py

During execution, a TaintException is raised if tainted input reaches a sink without proper verification or sanitization (assuming that verifiers, sanitizers, and sinks have been correctly registered with the TaintMonkey object).

Development

To download the necessary packages for TaintMonkey, run

pip install -r requirements.txt

We use ruff to check code formatting. Before submitting a pull request, run the formatter with the following command.

python -m ruff format --no-cache

To run the unit test suite, use the following command.

PYTHONPATH=. pytest tests/

To generate a coverage report of TaintMonkey, run the following commands.

PYTHONPATH=. pytest --cov=taintmonkey --cov-report html tests/
cd htmlcov/
python3 -m http.server

The HTML report generated by coverage.py should then be available at http://localhost:8000.

Experiments

To run experiments with the JungleGym dataset, first set up the environment with the following commands.

python3 -m venv venv
source venv/bin/activate
bash experiments/setup.sh

Authors

TaintMonkey was developed by Shayan Chatiwala, Aiden Chen, Carter Chew, Sebastian Mercado, and Aarav Parikh for GSET 2025. Benson Liu served as the project mentor and Anusha Iyer as the project's Residential Teaching Assistant (RTA). For any questions or requests for additional information, please contact the authors.
