
PORIFERA: PHP code rewriting instrumentation for expression in AST


PORIFERA: Php cOde Rewriting Instrumentation For ExpRession in Ast

Porifera (sponges) are among the oldest filter-feeding animals in the ocean. A sponge draws seawater in through the tiny pores that cover its body, filters it, and captures nutrients. PORIFERA borrows this as a metaphor: like the pores of a sponge, it permeates each expression in the code, silently absorbing runtime values without altering the behavior of the program itself.

PHP AST instrumentation library. Injects runtime probe wrappers around target PHP expressions to capture their values, then restores the original code.

Requires Python >=3.11,<3.13.

Installation

pip install porifera

Quick Start

from php_parser_py import Parser
from porifera import instrument, deinstrument

# Parse PHP project
ast = Parser().parse_file("path/to/project/index.php")

# Define targets: AST node_id -> expr_key label
targets = {
    "node_42": "user_query",
    "node_87": "config_value",
}

# Instrument — wraps targets with probe calls
modified_files = instrument(targets, ast)

# Optionally specify an output directory for probe logs
# (requires `from pathlib import Path`)
# modified_files = instrument(targets, ast, output_dir=Path("/tmp/logs"))

# Run PHP project... probes log to .porifera_data_<timestamp>.jsonl

# Deinstrument — restores original code
restored_files = deinstrument(ast)
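
Because instrumentation rewrites the project's files, it can be useful to guarantee that the restore step runs even when the PHP run fails. The following is a minimal sketch, assuming only the instrument(targets, ast) and deinstrument(ast) call shapes shown above. The `instrumented` helper is hypothetical and not part of porifera; it takes the two functions as parameters so that it assumes nothing else about the library.

```python
from contextlib import contextmanager
from typing import Any, Callable, Iterator


@contextmanager
def instrumented(
    targets: dict[str, str],
    ast: Any,
    instrument_fn: Callable[..., Any],
    deinstrument_fn: Callable[..., Any],
) -> Iterator[Any]:
    """Hypothetical helper: instrument on entry, always deinstrument on exit."""
    modified_files = instrument_fn(targets, ast)
    try:
        # The caller runs the PHP project inside the `with` body.
        yield modified_files
    finally:
        # Runs even when the body raises, so probes are not left in the code.
        deinstrument_fn(ast)
```

With porifera installed, this would presumably be used as `with instrumented(targets, ast, instrument, deinstrument): ...`, with the PHP run (for example via `subprocess.run`) placed inside the block.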

After instrumentation, a PHP expression like:

$row = $db->query($sql);

becomes:

$row = __porifera_probe_a1b2c3d4("user_query", $db->query($sql));

The probe returns the original value transparently and logs it to .porifera_data_<timestamp>.jsonl with key, value, value_type, and ts fields. Each instrumentation run produces a unique timestamped file to avoid overwriting previous data.
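
Since the log is JSON Lines, it can be read back with the standard library alone. The sketch below assumes one JSON object per line carrying the key, value, value_type, and ts fields described above; the function names `load_probe_records` and `values_by_key` are illustrative and not part of porifera, and the exact encoding of value and ts is not specified here, so they are passed through untouched.

```python
import json
from collections import defaultdict
from pathlib import Path


def load_probe_records(log_path):
    """Parse one probe log file: one JSON object per non-empty line."""
    records = []
    with Path(log_path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records


def values_by_key(records):
    """Group observed values under their expr_key label, in file order."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record["key"]].append(record["value"])
    return dict(grouped)
```

Grouping by key makes it easy to see every value a single probed expression produced during a run.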

Examples

Log all array access values at runtime

Probe every $config['db_host'], $row['name'], etc.:

from php_parser_py import Parser
from porifera import instrument

ast = Parser().parse_file("app/config.php")

# All Expr_ArrayDimFetch nodes -> targets
fetches = ast.nodes(lambda n: n.node_type == "Expr_ArrayDimFetch")
targets = {n.id: f"array_{n.id[:8]}" for n in fetches}

instrument(targets, ast)

Log all function call return values

Probe every function call:

from php_parser_py import Parser
from porifera import instrument

ast = Parser().parse_file("app/service.php")

# All Expr_FuncCall nodes -> targets
calls = ast.nodes(lambda n: n.node_type == "Expr_FuncCall")
targets = {n.id: f"call_{n.id[:8]}" for n in calls}

instrument(targets, ast)
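
The examples above derive each label from the first eight characters of the node id. If two ids happened to share that prefix, their log records would carry the same key and could not be told apart. The sketch below builds the same kind of targets mapping but fails loudly on such a collision. The `build_targets` helper is hypothetical, and the id strings in the usage note are made up, since the id format is not specified here.

```python
def build_targets(node_ids, prefix, id_chars=8):
    """Map node ids to '<prefix>_<id[:id_chars]>' labels, rejecting duplicates."""
    targets = {}
    owner_of_label = {}
    for node_id in node_ids:
        label = f"{prefix}_{node_id[:id_chars]}"
        if label in owner_of_label:
            raise ValueError(
                f"label {label!r} is shared by {owner_of_label[label]!r} "
                f"and {node_id!r}; increase id_chars"
            )
        owner_of_label[label] = node_id
        targets[node_id] = label
    return targets
```

For instance, `build_targets((n.id for n in calls), "call")` would produce the same mapping as the dictionary comprehension above whenever the truncated ids are unique.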

Log specific function call return values

Probe only mysqli_query(...) calls:

from php_parser_py import Parser
from porifera import instrument

ast = Parser().parse_file("app/db.php")

# All Expr_FuncCall nodes
calls = ast.nodes(lambda n: n.node_type == "Expr_FuncCall")

# Filter by function name
targets = {}
for call in calls:
    name_node = next(ast.succ(call, lambda e: e.get("field") == "name"), None)
    if not name_node or name_node.get_property("parts") != ["mysqli_query"]:
        continue
    targets[call.id] = f"query_{call.id[:8]}"

instrument(targets, ast)

Broader coverage with ElevatingProbeStrategy

StandardProbeStrategy skips lvalue targets (e.g. $i in $i++). ElevatingProbeStrategy wraps the nearest safe ancestor instead:

from porifera import instrument, ElevatingProbeStrategy

instrument(targets, ast, strategy=ElevatingProbeStrategy())
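
The idea of wrapping the nearest safe ancestor can be pictured as a walk up the tree. The following is a toy model of that idea only, not porifera's implementation: the parent map, the node names, and the `is_unsafe` predicate are all assumptions made for illustration, and the real rules for unsafe wrap contexts are those given in the Implementation Notes.

```python
def nearest_safe_ancestor(node_id, parent_of, is_unsafe):
    """Walk from node_id toward the root until a node is safe to wrap.

    Returns the first safe node id (possibly node_id itself),
    or None if every node on the path is unsafe.
    """
    current = node_id
    while current is not None and is_unsafe(current):
        # Step up one level; the root has no entry, so .get() yields None.
        current = parent_of.get(current)
    return current


# Toy tree for `$i++;`: the variable is an lvalue, so the probe would move
# up to the enclosing increment expression instead.
parent_of = {"var_i": "post_inc", "post_inc": "stmt"}
wrap_at = nearest_safe_ancestor("var_i", parent_of, lambda n: n == "var_i")
```

In this model a target that is already safe is returned unchanged, which matches the notion that elevation only applies where a direct wrap would be skipped.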

Documentation

  • Design Specification: API signatures, class responsibilities, strategies, exceptions, and validation rules
  • Implementation Notes: php-parser-py API reference, AST node types, unsafe wrap contexts, and developer instructions
  • AST Structure: the layout of the php-parser-py AST, its node types and properties, and how to navigate it

Development

# Install dependencies
uv sync

# Run tests
uv run pytest

# Type checking
uv run mypy src/

# Linting
uv run pylint src/

# Formatting
uv run black src/ tests/
uv run isort src/ tests/
