Deterministic scanner for code-executing chat templates inside model files (GGUF / Ollama / Hugging Face).

GlyphHound

A deterministic scanner that detects code-executing chat templates inside model files (GGUF / Ollama / Hugging Face), before you load the model.

The problem

When you download an open-weights LLM, the file ships with a chat template: a small Jinja2 program that formats your messages before the model sees them. Several runtimes render that template with an unsandboxed Jinja engine, so a malicious template can run code on your machine the moment the model is loaded, before you type anything. This is a real, patched bug class:

CVE-2024-34359 (llama-cpp-python)
CVE-2026-5760 (SGLang, CVSS 9.8 - RCE via an unsandboxed jinja2.Environment() rendering a malicious tokenizer.chat_template)
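The difference between an unsandboxed and a sandboxed Jinja2 environment can be seen with a harmless probe. The sketch below is illustrative only and is not taken from either CVE: it reads the name of a list's class through a dunder attribute, which is the same first step real escape chains take. The names PROBE and render_probe are made up for this example.

```python
from jinja2 import Environment
from jinja2.exceptions import SecurityError
from jinja2.sandbox import SandboxedEnvironment

# Harmless probe: it only reads a class name, but it starts the same way a
# real escape chain does, with a dunder attribute on an ordinary object.
PROBE = "{{ messages.__class__.__name__ }}"


def render_probe(env) -> str:
    """Render the probe and report whether the environment refused it."""
    try:
        return env.from_string(PROBE).render(messages=[])
    except SecurityError:
        return "blocked"


print(render_probe(Environment()))           # list
print(render_probe(SandboxedEnvironment()))  # blocked
```

The plain Environment resolves the dunder attribute like any other, which is what lets a longer chain walk from a template variable into the Python object model.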

What GlyphHound does

It reads the chat template out of the model file without downloading the multi-gigabyte weights, parses it into a syntax tree, traces whether it can reach a code-execution operation, optionally confirms it in a locked-down sandbox, and reports the result with a CI exit code.
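To make "reads the chat template without the weights" concrete, here is a minimal sketch of pulling tokenizer.chat_template out of the leading bytes of a GGUF file. It assumes the little-endian GGUF v2+ header layout (magic, version, tensor count, metadata key/value count, then the key/value pairs). The function names are hypothetical and this is not GlyphHound's own acquirer.

```python
import struct
from io import BytesIO

# Fixed-size GGUF value types: type id -> struct format (little-endian).
SCALARS = {0: "<B", 1: "<b", 2: "<H", 3: "<h", 4: "<I", 5: "<i",
           6: "<f", 7: "<?", 10: "<Q", 11: "<q", 12: "<d"}
T_STRING, T_ARRAY = 8, 9


def _read(buf, fmt):
    """Unpack one fixed-size value, failing if the fetched bytes run out."""
    size = struct.calcsize(fmt)
    data = buf.read(size)
    if len(data) != size:
        raise ValueError("metadata extends past the bytes fetched")
    return struct.unpack(fmt, data)[0]


def _read_string(buf):
    """GGUF strings are a uint64 length followed by UTF-8 bytes."""
    length = _read(buf, "<Q")
    data = buf.read(length)
    if len(data) != length:
        raise ValueError("metadata extends past the bytes fetched")
    return data.decode("utf-8", errors="replace")


def _read_value(buf, vtype):
    if vtype in SCALARS:
        return _read(buf, SCALARS[vtype])
    if vtype == T_STRING:
        return _read_string(buf)
    if vtype == T_ARRAY:
        item_type = _read(buf, "<I")
        count = _read(buf, "<Q")
        return [_read_value(buf, item_type) for _ in range(count)]
    raise ValueError(f"unknown GGUF value type {vtype}")


def extract_chat_template(header: bytes):
    """Return the chat template string, or None if the key is absent."""
    buf = BytesIO(header)
    if buf.read(4) != b"GGUF":
        raise ValueError("not a GGUF file")
    _version = _read(buf, "<I")
    _tensor_count = _read(buf, "<Q")
    kv_count = _read(buf, "<Q")
    for _ in range(kv_count):
        key = _read_string(buf)
        value = _read_value(buf, _read(buf, "<I"))
        if key == "tokenizer.chat_template":
            return value
    return None
```

Because the metadata precedes the tensor data, a parser like this only needs the start of the file, which is why a capped range request can be enough.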

It is a program-analysis tool, not a model: the same input always produces the same finding, there are no API calls or LLM calls at scan time, and it runs offline.

Pipeline (five stages):

Acquire: extract the template via an HTTP range request over the GGUF metadata header, the local Ollama blob, or a Hugging Face repo's tokenizer_config.json / chat_template.jinja / safetensors metadata. A hard cap on the bytes read, plus refusing any server that ignores the range request, keeps the fetch far smaller than the file, so the weights are never downloaded.
Parse: turn the template into a Jinja2 AST without rendering it.
De-obfuscate: fold obfuscation back to the identifier it hides, including string concatenation, str.format / % / |format printf, slices, |join / |replace, case-changing filters, string repetition, {% set %} constant propagation, and getattr / |attr reflection.
Analyze: walk the AST for code-execution sinks (dunder chains into the Python object model, code-exec names, reflection) and flag one only when a dangerous expression actually reaches it (taint / reachability), not when a benign template merely names a variable.
Report: human, JSON, and SARIF 2.1.0, with a configurable severity threshold that drives the exit code. An optional, off-by-default sandbox stage renders the template in a contained subprocess to confirm a finding.
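The Parse, De-obfuscate, and Analyze stages can be illustrated with a toy version built on Jinja2's own parser. This is a sketch under stated assumptions, not GlyphHound's implementation: SINKS is a made-up name list, fold handles only string concatenation with + and ~, and there is no taint or reachability tracking.

```python
from jinja2 import Environment, nodes

# Illustrative sink names only; the real rule set is not given in this README.
SINKS = {"__import__", "__builtins__", "__globals__", "__init__",
         "__class__", "__subclasses__", "system", "popen", "eval", "exec"}


def fold(node):
    """Fold a constant string expression to its value, or return None."""
    if isinstance(node, nodes.Const):
        return node.value if isinstance(node.value, str) else None
    if isinstance(node, nodes.Add):          # "a" + "b"
        parts = [fold(node.left), fold(node.right)]
    elif isinstance(node, nodes.Concat):     # "a" ~ "b"
        parts = [fold(child) for child in node.nodes]
    else:
        return None
    return None if None in parts else "".join(parts)


def accessed_name(node):
    """Name reached by x.name, x["name"], or x|attr("name"), if foldable."""
    if isinstance(node, nodes.Getattr):
        return node.attr
    if isinstance(node, nodes.Getitem):
        return fold(node.arg)
    if isinstance(node, nodes.Filter) and node.name == "attr" and node.args:
        return fold(node.args[0])
    return None


def find_sinks(source):
    """Parse (never render) a template and list (line, name) sink accesses."""
    tree = Environment().parse(source)
    kinds = (nodes.Getattr, nodes.Getitem, nodes.Filter)
    return [(node.lineno, name)
            for node in tree.find_all(kinds)
            if (name := accessed_name(node)) in SINKS]


obfuscated = '{{ x["__glo" + "bals__"] }}{{ x|attr("__im" ~ "port__") }}'
benign = '{% for m in messages %}{% if m.role == "system" %}{{ m.content }}{% endif %}{% endfor %}'
print(sorted(name for _, name in find_sinks(obfuscated)))  # ['__globals__', '__import__']
print(find_sinks(benign))                                  # []
```

The benign template contains the string "system" but never accesses it as an attribute, so it is not flagged, while the split-up dunder names are recovered by folding before the comparison.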

Example

These examples scan template files from this repository, so clone it first. Alternatively, substitute your own template file, or scan a model by id with glyphhound scan owner/name.

Scanning a malicious template (an obfuscated __import__ to os.system chain). It gates CI with a non-zero exit code:

$ glyphhound scan fixtures/malicious/cve_2024_34359_marker.jinja
GlyphHound scan report
======================
threshold: fail CI on reachable findings of severity >= high
exit code: 1

findings (5):
  [GH-S002] CRITICAL reachable     tokenizer.chat_template:17  [GATES CI]
      code-exec-name: .system
      reason: reference to a code-execution or dangerous-capability name (eval/exec/compile/os/subprocess/importlib/pickle/open/...)
  [GH-S001] CRITICAL reachable     tokenizer.chat_template:17  [GATES CI]
      dunder-attribute: .__import__
      reason: attribute/subscript/|attr access to a Python dunder used for sandbox escape
  ... (3 more reachable dunder findings: .__builtins__, .__globals__, .__init__)

summary: 5 finding(s), 5 reachable; critical=5 high=0; 5 gating -> exit 1

Scanning a real, benign template. It stays quiet and passes:

$ glyphhound scan fixtures/benign/Qwen__Qwen2.5-0.5B-Instruct-GGUF.jinja
GlyphHound scan report
======================
threshold: fail CI on reachable findings of severity >= high
exit code: 0

findings: 0 (nothing flagged at or above the detection threshold)

summary: 0 finding(s), 0 reachable; critical=0 high=0; 0 gating -> exit 0

How it compares

Promptfoo's ModelAudit already ships a pip-installable, SARIF-emitting chat-template scanner based on string/regex matching. GlyphHound's contribution is narrow and specific: AST + taint + de-obfuscation catches obfuscated payloads that string matching misses. That difference is measured, not asserted. See benchmark/:

On the obfuscated payload set, GlyphHound flags 9 / 9; ModelAudit flags 3 / 9 (in its strongest configuration). Both produce 0 false positives on the benign controls.

This is an engineering artifact, not a research claim. It does not catch everything, and it is not the only tool in this space.

False-positive rate

A scanner is only useful if it stays quiet on safe templates, so this is measured on real, benign chat templates rather than asserted:

0 / 120 on a vendored corpus of distinct real Hugging Face chat templates (corpus/).
0 / 241 on a separate wider audit of additional real templates (study/wider_fp_audit.json).

These are measured rates on those specific sets, not a guarantee that no template will ever be misflagged. Both are reproducible offline with scripts/verify_phase7.py.
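As a rough guide to what a zero-out-of-n result does and does not show, the one-sided upper confidence bound for a rate with no observed events is 1 - (1 - confidence)^(1/n). The sketch below applies it to the two corpus sizes above. It assumes the templates behave like independent samples from the population of interest, which the source does not claim, and the function name is made up for this example.

```python
def zero_event_upper_bound(n: int, confidence: float = 0.95) -> float:
    """Exact one-sided upper bound on a rate when 0 of n trials were events."""
    if n <= 0:
        raise ValueError("n must be positive")
    return 1.0 - (1.0 - confidence) ** (1.0 / n)


for n in (120, 241):
    print(f"0 / {n}: true rate below {zero_event_upper_bound(n):.2%} at 95% confidence")
# 0 / 120: true rate below 2.47% at 95% confidence
# 0 / 241: true rate below 1.24% at 95% confidence
```

Under that assumption, a clean run on a few hundred templates is consistent with a true false-positive rate of up to one or two percent.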

Independent validation

Beyond its own fixtures, GlyphHound has been checked against real, third-party attack payloads and real production chat templates:

23 / 23 real Jinja2 remote-code-execution payloads detected (100% recall). The payloads are taken verbatim from two widely used public references: PayloadsAllTheThings and HackTricks.
0 / 16 false positives on chat templates pulled live from popular public Hugging Face models (a 0% false-positive rate on that set; with the 23 detections above and no false alarms, precision is 100% across both sets).

These public payloads are largely unobfuscated, where string-matching scanners also do well; GlyphHound's specific advantage on obfuscated payloads is the separate measurement in benchmark/. The full method, the exact payloads, the model list, and the limitations are in VALIDATION.md.

Reproduce it yourself. The script lives in this repository, so run it from a checkout (static analysis only, nothing is rendered or executed):

git clone https://github.com/ydarwish1/glyphhound
cd glyphhound
pip install -e .
python scripts/verify_real_payloads.py

Install

pip install glyphhound

Or from source (for development, the test suite, or the reproduction scripts):

git clone https://github.com/ydarwish1/glyphhound
cd glyphhound
python -m venv .venv
. .venv/bin/activate          # Windows: .venv\Scripts\activate
pip install -e ".[dev]"

The runtime dependency is jinja2 only; pytest and jsonschema are dev-only.

Usage

# Scan a local GGUF, a .gguf URL, a Hugging Face repo, or an Ollama model:
python -m glyphhound scan path/to/model.gguf
python -m glyphhound scan owner/name                      # canonical HF template (no weights)
python -m glyphhound scan owner/name --file auto          # smallest .gguf quant in the repo
python -m glyphhound scan owner/name --file model.Q4.gguf
python -m glyphhound scan ollama-model-name
python -m glyphhound scan template.jinja                  # a local template file
cat template.jinja | python -m glyphhound scan -          # stdin

Options:

--format human|json|sarif      output format (default: human)
--threshold critical|high      minimum severity that gates CI (default: high)
--confirm                      render in the locked-down sandbox to confirm a finding
--revision <sha>               pin a Hugging Face commit for reproducibility

Set HF_TOKEN for gated/private repos and higher Hub rate limits.

Exit codes (for CI): 0 = clean, 1 = a reachable finding gates the build, 2 = the scan could not run.
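A CI step could wrap the scan along these lines. This is a minimal sketch: it uses only the flags and exit codes documented above and assumes the SARIF report is written to standard output, which this README does not state. The helper names are hypothetical.

```python
import json
import subprocess
import sys

# Exit codes as documented above.
EXIT_MEANING = {
    0: "clean",
    1: "a reachable finding gates the build",
    2: "the scan could not run",
}


def describe_exit(code: int) -> str:
    """Translate a scan exit code into the documented meaning."""
    return EXIT_MEANING.get(code, f"unexpected exit code {code}")


def count_sarif_results(sarif_text: str) -> int:
    """Count result objects across all runs of a SARIF log."""
    log = json.loads(sarif_text)
    return sum(len(run.get("results", [])) for run in log.get("runs", []))


def scan(target: str, threshold: str = "high"):
    """Run the scanner and return (exit_code, stdout). Assumes SARIF on stdout."""
    proc = subprocess.run(
        [sys.executable, "-m", "glyphhound", "scan", target,
         "--format", "sarif", "--threshold", threshold],
        capture_output=True, text=True, check=False,
    )
    return proc.returncode, proc.stdout


def gate(target: str) -> int:
    """Scan a target, log a one-line summary, and return the exit code for CI."""
    code, report = scan(target)
    print(f"{target}: {describe_exit(code)}", file=sys.stderr)
    if code in (0, 1) and report.strip():
        print(f"{count_sarif_results(report)} SARIF result(s)", file=sys.stderr)
    return code
```

Calling sys.exit(gate("template.jinja")) from a CI script passes the scanner's exit code through unchanged, so the build fails exactly when the scan gates it.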

Testing and verification

pip install -e ".[dev]"
python -m pytest                       # offline test suite
python scripts/verify_phase2.py        # per-stage verification scripts (verify_phase*.py)

The verify_phase*.py scripts re-prove each stage with real output: a flagged fixture, a schema-valid SARIF file, the measured false-positive rate, the head-to-head benchmark, and the sandbox containment proof. A few require network (verify_phase0/9/14) or a separate ModelAudit environment (verify_phase8); the rest are offline.

Safety

MARKER payloads only. Test fixtures simulate the attack chain, but the "payload" is a harmless sentinel; there is no working exploit or poisoned model in this repository.
Never loads weights. The acquirer fetches only the metadata that holds the template.
The sandbox contains, or stays off. The optional --confirm stage renders a template only inside a locked-down subprocess: a sys.addaudithook policy blocks network access, process spawning, ctypes, and out-of-scratch, symlink, and hardlink writes, and on Linux a seccomp syscall filter, resource limits, and a privilege drop are added. It does not block host-file reads or deletions; blocking network egress is what prevents a read-then-exfiltrate attack. Containment is tested, but this is a best-effort sandbox, not a formally verified jail. See ARCHITECTURE.md.
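The audit-hook idea behind the --confirm stage can be sketched as follows. This is a toy policy, not GlyphHound's: the blocked-event prefixes are an assumption chosen for illustration, and the hook is installed in a throwaway child interpreter because an audit hook cannot be removed once added.

```python
import subprocess
import sys
import textwrap

# Code run in a child interpreter: install a deny policy, then try to spawn
# a shell command. The hook raises before the command is executed.
CHILD = textwrap.dedent('''
    import os
    import sys

    BLOCKED_PREFIXES = ("socket.", "subprocess.", "os.system", "os.exec",
                        "os.fork", "os.posix_spawn", "os.spawn", "ctypes.")

    def policy(event, args):
        # Raising from an audit hook aborts the operation that raised the event.
        if event.startswith(BLOCKED_PREFIXES):
            raise PermissionError("blocked audit event: " + event)

    sys.addaudithook(policy)

    try:
        os.system("echo escaped")
    except PermissionError as exc:
        print(exc)
''')


def run_probe() -> str:
    """Run the child and return what it printed."""
    proc = subprocess.run([sys.executable, "-c", CHILD],
                          capture_output=True, text=True, timeout=30)
    return proc.stdout.strip()


print(run_probe())  # blocked audit event: os.system
```

An audit hook only sees operations that raise audit events, which is one reason the Linux build layers a seccomp filter and resource limits on top of it.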

Documents

ARCHITECTURE.md: the five-stage pipeline, with the exact input/output of each stage.
CHANGELOG.md: the full build history, stage by stage.
benchmark/: the head-to-head methodology, payloads, and the reproducible yardstick (benchmark/RELEASE.md).
study/wider_fp_audit.json: the wider false-positive audit behind the 0/241 figure.
action/: the GitHub Action wrapper that runs a scan in CI and uploads SARIF to code scanning.
SECURITY.md: how to report a vulnerability.
CONTRIBUTING.md: how to build and test.

License

Apache-2.0. See LICENSE. Third-party attribution (Jinja2, the vendored SARIF schema, and the benign template corpus) is in NOTICE.

Project details

This version

0.1.0

Jun 24, 2026

Download files

Source Distribution

glyphhound-0.1.0.tar.gz (106.2 kB)

Uploaded Jun 24, 2026 Source

Built Distribution

glyphhound-0.1.0-py3-none-any.whl (73.8 kB)

Uploaded Jun 24, 2026 Python 3

File details

Details for the file glyphhound-0.1.0.tar.gz.

File metadata

Download URL: glyphhound-0.1.0.tar.gz
Upload date: Jun 24, 2026
Size: 106.2 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.1

File hashes

Hashes for glyphhound-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`cdbf99450096d28f7ba1bfbec6ad21d5751b66c4a8d72ea45a8d49ce0f4a0f71`
MD5	`8c9b8c4247c53ed96a5ce2e830942615`
BLAKE2b-256	`ab49679bace93e956786c68a48138393bfea80b58c9ab594d9c6cbd4b3ad3779`

File details

Details for the file glyphhound-0.1.0-py3-none-any.whl.

File metadata

Download URL: glyphhound-0.1.0-py3-none-any.whl
Upload date: Jun 24, 2026
Size: 73.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.1

File hashes

Hashes for glyphhound-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`1c466b5a03f29ae2175a5b3f2f2fea3566ad71d2f007dada5ee252f38ff9d276`
MD5	`3ed80e1bc1a5c5b1b6dd7502c0f21573`
BLAKE2b-256	`3bbe9e9afdc7efc61bdc7517f04d72c16db7e802dcd91662e158e44898734cca`
