Security evaluation harness for OpenClaw agents - powered by Tinman

Maintainers

oliveskin

Project description

Tinman OpenClaw Eval

Security evaluation harness for OpenClaw agents. Powered by Tinman.

Features

280+ attack probes (currently 288) across 12 categories
Synthetic Gateway for isolated testing
CI integration via SARIF, JUnit, and JSON outputs
Baseline assertions for regression testing
Real-time monitoring via Gateway WebSocket

Attack Categories

Run tinman-eval list-attacks to see exact counts by category.

Category	Description
Prompt Injection	Jailbreaks, instruction override, prompt leaking
Tool Exfiltration	Sensitive file/secret exfiltration attempts
Context Bleed	Cross-session leaks, conversation history extraction
Privilege Escalation	Sandbox escape, elevation bypass attempts
Supply Chain	Malicious skills, dependency and update attacks
Financial Transaction	Wallet/seed phrase theft, transaction/approval attempts
Unauthorized Action	Actions without consent/confirmation
MCP Attacks	MCP tool abuse, server injection, cross-tool exfil
Indirect Injection	Injection via documents, URLs, issues, logs, metadata
Evasion Bypass	Unicode/encoding bypass, obfuscation, injection variants
Memory Poisoning	Persistent instruction poisoning, fabricated history
Platform Specific	OS and cloud-specific payloads (Windows/macOS/Linux/metadata)

Installation

pip install tinman-openclaw-eval

Or from source:

git clone https://github.com/oliveskin/tinman-openclaw-eval
cd tinman-openclaw-eval
pip install -e ".[dev]"

Quick Start

# Run all attacks (mock gateway)
tinman-eval run

# Run specific category
tinman-eval run -c prompt_injection
tinman-eval run -c financial
tinman-eval run -c evasion_bypass

# Run only high severity (S3+)
tinman-eval run -s S3

# Save report
tinman-eval run -o report.md

# List all attacks
tinman-eval list-attacks

# Run single attack
tinman-eval run-single PI-001 -v

Category aliases are supported (e.g. financial, mcp_attacks, supplychain, platform).
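
As a rough illustration of how such aliases could resolve to canonical category names, here is a minimal sketch. The alias table and the `resolve_category` helper are hypothetical, and the canonical names are assumed to be the lowercase forms of the `AttackCategory` members; the real mapping lives inside tinman-eval and may differ.

```python
# Hypothetical alias table (an assumption, not the package's actual mapping).
CATEGORY_ALIASES = {
    "financial": "financial_transaction",
    "supplychain": "supply_chain",
    "platform": "platform_specific",
}


def resolve_category(name: str) -> str:
    """Normalize a user-supplied category name and expand known aliases."""
    key = name.strip().lower().replace("-", "_")
    # Unknown names pass through unchanged so canonical names keep working.
    return CATEGORY_ALIASES.get(key, key)


if __name__ == "__main__":
    for raw in ("financial", "supplychain", "prompt_injection"):
        print(raw, "->", resolve_category(raw))
```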

CI Integration

GitHub Actions

name: Security Eval
on: [push, pull_request]

jobs:
  security:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      security-events: write  # required by upload-sarif
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.11'

      - run: pip install tinman-openclaw-eval

      - name: Run security evaluation
        run: |
          tinman-eval run \
            --output security-report.json \
            --format json

      - name: Assert baseline
        run: |
          tinman-eval assert-cmd \
            security-report.json \
            --baseline expected/baseline.json

      - name: Generate SARIF (always)
        if: always()
        run: |
          tinman-eval run \
            --output security-report.sarif \
            --format sarif

      # Upload after the SARIF file has been generated
      - name: Upload SARIF
        if: always()
        uses: github/codeql-action/upload-sarif@v3
        with:
          sarif_file: security-report.sarif

Generate Baseline

# Create initial baseline
tinman-eval baseline --output expected/baseline.json

# Update after intentional changes
tinman-eval run -f json -o new-results.json
# Review and approve
mv new-results.json expected/baseline.json
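
The "review and approve" step can be supported by a small diff script. The sketch below is only an illustration: it assumes a hypothetical JSON layout with a top-level `results` list whose entries carry `attack_id` and `passed` fields, which may not match the actual report schema. Check a real report before relying on it.

```python
def find_regressions(baseline: dict, current: dict) -> list[str]:
    """Return IDs of probes that passed in the baseline but do not pass now.

    Assumes a hypothetical schema: {"results": [{"attack_id": str, "passed": bool}]}.
    A probe missing from the current report is treated as a regression.
    """
    base = {r["attack_id"]: r["passed"] for r in baseline["results"]}
    now = {r["attack_id"]: r["passed"] for r in current["results"]}
    return sorted(
        attack_id
        for attack_id, passed in base.items()
        if passed and not now.get(attack_id, False)
    )


if __name__ == "__main__":
    old = {"results": [{"attack_id": "PI-001", "passed": True},
                       {"attack_id": "TE-001", "passed": True}]}
    new = {"results": [{"attack_id": "PI-001", "passed": False},
                       {"attack_id": "TE-001", "passed": True}]}
    print(find_regressions(old, new))
```

In practice the two dictionaries would come from `json.load` on expected/baseline.json and new-results.json.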

Output Formats

Format	Use Case
`markdown`	Human-readable reports
`json`	Programmatic analysis
`sarif`	GitHub Code Scanning
`junit`	CI test results
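
For readers unfamiliar with SARIF, the sketch below builds a minimal SARIF 2.1.0 document of the kind GitHub Code Scanning accepts. It does not reproduce the tool's actual output: the finding fields (`id`, `severity`, `message`) and the severity-to-level mapping are assumptions made for illustration.

```python
import json

# Assumed mapping from the S0-S4 scale to SARIF result levels.
SARIF_LEVEL = {"S4": "error", "S3": "error", "S2": "warning", "S1": "note", "S0": "none"}


def to_sarif(findings: list[dict]) -> dict:
    """Wrap findings (hypothetical dicts with id/severity/message) in a SARIF log."""
    return {
        "$schema": "https://json.schemastore.org/sarif-2.1.0.json",
        "version": "2.1.0",
        "runs": [
            {
                "tool": {
                    "driver": {
                        "name": "tinman-eval",
                        "rules": [{"id": f["id"]} for f in findings],
                    }
                },
                "results": [
                    {
                        "ruleId": f["id"],
                        "level": SARIF_LEVEL[f["severity"]],
                        "message": {"text": f["message"]},
                    }
                    for f in findings
                ],
            }
        ],
    }


if __name__ == "__main__":
    sample = [{"id": "PI-001", "severity": "S3", "message": "Instruction override accepted"}]
    print(json.dumps(to_sarif(sample), indent=2))
```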

Custom Attacks

Add a new attack module under src/tinman_openclaw_eval/attacks/:

from tinman_openclaw_eval.attacks.base import Attack, AttackCategory, AttackPayload, ExpectedBehavior, Severity


class MyAttacks(Attack):
    category = AttackCategory.PROMPT_INJECTION
    name = "My Attacks"

    def _load_payloads(self) -> None:
        self.payloads.append(
            AttackPayload(
                id="MY-001",
                name="My probe",
                category=self.category,
                severity=Severity.S2,
                payload="...",
                target="dm_channel",
                expected_behavior=ExpectedBehavior.REJECTED_BY_SOUL,
            )
        )

Then export it from src/tinman_openclaw_eval/attacks/__init__.py and register it in src/tinman_openclaw_eval/harness.py.

Programmatic Usage

import asyncio
from tinman_openclaw_eval import EvalHarness, AttackCategory

async def main():
    harness = EvalHarness()

    # Run all attacks
    result = await harness.run()

    # Check for vulnerabilities
    print(f"Vulnerabilities: {result.vulnerabilities}")

    # Run specific categories
    result = await harness.run(categories=[
        AttackCategory.PROMPT_INJECTION,
        AttackCategory.FINANCIAL_TRANSACTION,
        AttackCategory.EVASION_BYPASS,
    ])

    # Run high severity only
    result = await harness.run(min_severity="S3")

asyncio.run(main())

Testing Against Real Gateway

# Connect to local OpenClaw Gateway
tinman-eval run --no-mock --gateway-url ws://127.0.0.1:18789

# Connect to a Gateway on another host
tinman-eval run --no-mock --gateway-url ws://192.168.1.100:18789
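
Because --no-mock sends attack payloads to a live Gateway, it can be worth checking the target before a run. The helper below is a hypothetical pre-flight check, not part of tinman-eval: it parses the URL with the standard library and reports whether the host is a loopback address.

```python
import ipaddress
from urllib.parse import urlsplit


def check_gateway_url(url: str) -> tuple[str, int, bool]:
    """Return (host, port, is_loopback) for a ws:// or wss:// Gateway URL."""
    parts = urlsplit(url)
    if parts.scheme not in ("ws", "wss"):
        raise ValueError(f"expected ws:// or wss://, got {parts.scheme!r}")
    if parts.hostname is None:
        raise ValueError("gateway URL has no host")
    # Fall back to the default WebSocket ports when none is given.
    port = parts.port or (443 if parts.scheme == "wss" else 80)
    try:
        is_loopback = ipaddress.ip_address(parts.hostname).is_loopback
    except ValueError:
        # Not an IP literal; only treat the conventional name as local.
        is_loopback = parts.hostname == "localhost"
    return parts.hostname, port, is_loopback


if __name__ == "__main__":
    for target in ("ws://127.0.0.1:18789", "ws://192.168.1.100:18789"):
        print(target, check_gateway_url(target))
```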

Attack Probe IDs

Prefix	Category
`PI-*`	Prompt Injection
`TE-*`	Tool Exfiltration
`CB-*`	Context Bleed
`PE-*`	Privilege Escalation
`SC-*`	Supply Chain
`FT-*`	Financial Transaction
`UA-*`	Unauthorized Action
`MCP-*`	MCP Attacks
`II-*`	Indirect Injection
`EB-*`	Evasion Bypass
`MP-*`	Memory Poisoning
`PS-*`	Platform Specific
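
When post-processing reports, the prefix table can be turned into a lookup. This is a small sketch built only from the table above; the `category_for` helper is hypothetical and assumes IDs of the form PREFIX-NUMBER, such as PI-001.

```python
PREFIX_TO_CATEGORY = {
    "PI": "Prompt Injection",
    "TE": "Tool Exfiltration",
    "CB": "Context Bleed",
    "PE": "Privilege Escalation",
    "SC": "Supply Chain",
    "FT": "Financial Transaction",
    "UA": "Unauthorized Action",
    "MCP": "MCP Attacks",
    "II": "Indirect Injection",
    "EB": "Evasion Bypass",
    "MP": "Memory Poisoning",
    "PS": "Platform Specific",
}


def category_for(probe_id: str) -> str:
    """Map a probe ID such as 'PI-001' to its category name."""
    prefix, sep, _ = probe_id.partition("-")
    prefix = prefix.upper()
    if not sep or prefix not in PREFIX_TO_CATEGORY:
        raise ValueError(f"unrecognized probe ID: {probe_id!r}")
    return PREFIX_TO_CATEGORY[prefix]


if __name__ == "__main__":
    print(category_for("PI-001"))
    print(category_for("MCP-012"))
```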

Severity Levels

Level	Description	Action
S4	Critical	Immediate fix required
S3	High	Fix before deploy
S2	Medium	Review recommended
S1	Low	Monitor
S0	Info	Observation only
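
A threshold such as -s S3 selects S3 and everything above it. The sketch below shows one way to express that ordering; the helper names are illustrative and not taken from the package.

```python
# Severity levels from lowest to highest, as listed in the table above.
SEVERITY_ORDER = ["S0", "S1", "S2", "S3", "S4"]


def meets_threshold(severity: str, minimum: str) -> bool:
    """True if `severity` is at least as severe as `minimum`."""
    return SEVERITY_ORDER.index(severity) >= SEVERITY_ORDER.index(minimum)


def filter_by_severity(findings: list[tuple[str, str]], minimum: str) -> list[tuple[str, str]]:
    """Keep (probe_id, severity) pairs at or above the minimum level."""
    return [f for f in findings if meets_threshold(f[1], minimum)]


if __name__ == "__main__":
    sample = [("PI-001", "S2"), ("FT-004", "S4"), ("TE-002", "S3")]
    print(filter_by_severity(sample, "S3"))
```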

Integration with OpenClaw Skill

For continuous monitoring in OpenClaw, use the Tinman Skill:

# In OpenClaw
/tinman sweep                    # Run security sweep
/tinman sweep --category financial
/tinman watch                    # Real-time monitoring

License

Apache-2.0


Release history

Version	Release date
0.3.2 (this version)	Feb 8, 2026
0.3.1	Feb 1, 2026
0.3.0	Feb 1, 2026
0.2.0	Jan 31, 2026
0.1.2	Jan 31, 2026
0.1.1	Jan 31, 2026
0.1.0	Jan 31, 2026

Download files

Source Distribution

tinman_openclaw_eval-0.3.2.tar.gz (58.3 kB view details)

Uploaded Feb 8, 2026 Source

Built Distribution

tinman_openclaw_eval-0.3.2-py3-none-any.whl (67.6 kB view details)

Uploaded Feb 8, 2026 Python 3

File details

Details for the file tinman_openclaw_eval-0.3.2.tar.gz.

File metadata

Download URL: tinman_openclaw_eval-0.3.2.tar.gz
Upload date: Feb 8, 2026
Size: 58.3 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for tinman_openclaw_eval-0.3.2.tar.gz
Algorithm	Hash digest
SHA256	`e8c0af844744b76e1dedba3114cc5f0ff18ad5cce47f4b24b300b8c962eb3ce2`
MD5	`2ef3b8e3eed63fb96d62835904599391`
BLAKE2b-256	`f48394d2c3b62d11650557259d8a31d181ddeb801c7a6f4f811d599a22b0d081`
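
A downloaded archive can be checked against the SHA256 digest listed above with the standard library. This is a generic sketch; the file path in the usage note is simply wherever the archive was saved.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 65536) -> str:
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path: str, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with an expected hex string."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, `matches_digest("tinman_openclaw_eval-0.3.2.tar.gz", "e8c0af844744b76e1dedba3114cc5f0ff18ad5cce47f4b24b300b8c962eb3ce2")` should return True for an intact copy of the source distribution.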

Provenance

The following attestation bundles were made for tinman_openclaw_eval-0.3.2.tar.gz:

Publisher: publish.yml on oliveskin/tinman-openclaw-eval

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: tinman_openclaw_eval-0.3.2.tar.gz
- Subject digest: e8c0af844744b76e1dedba3114cc5f0ff18ad5cce47f4b24b300b8c962eb3ce2
- Sigstore transparency entry: 927244718
- Sigstore integration time: Feb 8, 2026
Source repository:
- Permalink: oliveskin/tinman-openclaw-eval@7e5e9bd853b2338db55007216c96800c40c96dc4
- Branch / Tag: refs/heads/main
- Owner: https://github.com/oliveskin
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@7e5e9bd853b2338db55007216c96800c40c96dc4
- Trigger Event: workflow_dispatch

File details

Details for the file tinman_openclaw_eval-0.3.2-py3-none-any.whl.

File metadata

Download URL: tinman_openclaw_eval-0.3.2-py3-none-any.whl
Upload date: Feb 8, 2026
Size: 67.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for tinman_openclaw_eval-0.3.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`3f3bb0554d74601333937957ab2aeb0172bc1d07117aafe885d83ebd1f42002c`
MD5	`45d71a2e799596dca72dacaa6877ded2`
BLAKE2b-256	`801e9450743801d4b0c57ceed222641561a59d536b5c05e8d443de55b58c6ab0`

Provenance

The following attestation bundles were made for tinman_openclaw_eval-0.3.2-py3-none-any.whl:

Publisher: publish.yml on oliveskin/tinman-openclaw-eval

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: tinman_openclaw_eval-0.3.2-py3-none-any.whl
- Subject digest: 3f3bb0554d74601333937957ab2aeb0172bc1d07117aafe885d83ebd1f42002c
- Sigstore transparency entry: 927244722
- Sigstore integration time: Feb 8, 2026
Source repository:
- Permalink: oliveskin/tinman-openclaw-eval@7e5e9bd853b2338db55007216c96800c40c96dc4
- Branch / Tag: refs/heads/main
- Owner: https://github.com/oliveskin
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@7e5e9bd853b2338db55007216c96800c40c96dc4
- Trigger Event: workflow_dispatch
