cocapn-traps

Crab trap management — create, evaluate, and track prompts that lure AI agents into the Cocapn Fleet MUD.

Version: 1.0.0 | Tests: 10 passing | Lines: ~700 | Deps: zero


What

The fleet needs agents to explore and produce tiles. Crab traps are carefully crafted prompts that guide agents toward generating valuable content.

This package makes traps:

  • Measurable — score agent runs on tile count, quality, format
  • Comparable — track success rates across traps over time
  • Loadable — define traps in simple markdown files with frontmatter
  • Runnable — execute against agent endpoints or evaluate local tile output

Install

pip install cocapn-traps

Trap Format

Traps are markdown files with a simple frontmatter header:

---
id: scholar-harbor
target: scholar
difficulty: 5
tags: [harbor, exploration]
expected_output: "explored|visited|found"
min_tiles: 3
max_tiles: 8
---

You are a scholar exploring the Harbor room of the Cocapn Fleet MUD.
Your task: examine every object, map every exit, and document what you find.
Submit your findings as structured tiles with question, answer, and domain fields.

Frontmatter Fields

| Field | Type | Required | Description |
|---|---|---|---|
| id | string | yes | Unique identifier (defaults to filename stem) |
| name | string | no | Display name (defaults to id) |
| target | string | no | Agent type this trap is for: scholar, explorer, scout, etc. |
| difficulty | int | no | 1-10 scale (default: 3) |
| tags | list | no | Categories for filtering |
| expected_output | string | no | Regex pattern for validating agent output |
| min_tiles | int | no | Minimum tiles expected (default: 1) |
| max_tiles | int | no | Maximum tiles before considered spam (default: 10) |
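Since the frontmatter is plain key:value pairs with inline lists and no YAML dependency, a stdlib-only parser suffices. The following is an illustrative sketch, not the actual `cocapn_traps.loader` implementation; names and details are assumptions:

```python
# Sketch of a stdlib-only frontmatter parser; the real cocapn_traps.loader
# may differ in details.

def parse_frontmatter(text: str) -> tuple[dict, str]:
    """Split a trap file into (frontmatter dict, prompt body)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no frontmatter: the whole file is the prompt
    meta = {}
    i = 1
    while i < len(lines) and lines[i].strip() != "---":
        key, _, raw = lines[i].partition(":")
        value = raw.strip()
        if value.startswith("[") and value.endswith("]"):
            # inline list: [harbor, exploration]
            value = [v.strip() for v in value[1:-1].split(",") if v.strip()]
        elif value.isdigit():
            value = int(value)
        else:
            value = value.strip('"')
        meta[key.strip()] = value
        i += 1
    body = "\n".join(lines[i + 1:]).strip()
    return meta, body
```

Everything after the closing `---` becomes the trap's prompt, untouched.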

CLI

# List all traps
cocapn-traps list

# Filter by target
cocapn-traps list --target scholar

# Filter by tag
cocapn-traps list --tag harbor

# Filter by difficulty
cocapn-traps list --min-difficulty 5

# Evaluate tiles against a trap
cocapn-traps eval --trap traps/scholar.md --tiles output.jsonl

# Run trap against agent endpoint
cocapn-traps run --trap traps/scholar.md --agent-url http://agent:8080/run

# Show trap statistics
cocapn-traps stats

# Show stats for specific trap
cocapn-traps stats --trap-id scholar-harbor

Programmatic API

Create and Register Traps

from cocapn_traps.trap import Trap, TrapRegistry
from cocapn_traps.loader import load_from_directory

# Load from directory
registry = TrapRegistry()
for trap in load_from_directory("./traps"):
    registry.register(trap)

# Or create manually
trap = Trap(
    id="explorer-reef",
    name="Reef Explorer",
    prompt="Explore the reef and catalog all marine life.",
    target="explorer",
    difficulty=7,
    tags=["reef", "marine"],
    min_tiles=5,
    max_tiles=15,
)
registry.register(trap)

# Query registry
print(registry.targets())          # ['explorer', 'scholar', 'scout']
print(registry.tags())             # ['harbor', 'reef', 'marine']
print(registry.list(target="scholar"))  # Filter by target
print(registry.list(tag="marine"))      # Filter by tag

Evaluate a Run

from cocapn_traps.evaluator import evaluate_trap, update_trap_stats

# Good run: 3 tiles, all fields present
tiles = [
    {"question": "What is the harbor?", "answer": "A coordination hub with many rooms.", "domain": "harbor", "agent": "scholar"},
    {"question": "How to navigate?", "answer": "Use the map and follow signs.", "domain": "harbor", "agent": "scholar"},
    {"question": "Who manages it?", "answer": "CCC, the fleet I&O officer.", "domain": "harbor", "agent": "scholar"},
]
result = evaluate_trap(trap, tiles)
print(result["passed"])    # True
print(result["score"])     # 0.85
print(result["feedback"])  # "Good run"

# Update trap statistics
update_trap_stats(trap, result)
print(trap.stats)  # {'runs': 1, 'successes': 1, 'avg_score': 0.85, 'total_tiles': 3}

Run Against Agent

from cocapn_traps.runner import run_trap

# Local tiles
result = run_trap(trap, local_tiles=tiles)

# Remote agent
result = run_trap(trap, agent_url="http://agent:8080/run")

Scoring System

Each trap run is scored on 4 dimensions:

| Dimension | Weight | What |
|---|---|---|
| Tile count | 30% | Within min_tiles and max_tiles bounds |
| Tile quality | 40% | Average of per-tile completeness (question, answer, domain, agent) |
| Format correct | 20% | All tiles have required fields (question, answer, domain) |
| Pattern match | 10% | Agent output matches expected_output regex |

Pass threshold: score ≥ 0.6 AND count_ok AND format_correct

Per-Tile Quality

Each tile scores 0.0-1.0 based on field completeness:

  • question present and > 10 chars: +0.25
  • answer present and > 20 chars: +0.25
  • domain present and not "general": +0.25
  • agent present and not "unknown": +0.25
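Putting the weights and the per-tile rubric together, the scoring logic can be sketched roughly as follows. This is an illustration of the rubric described above; the function names and internals are assumptions, not the actual `cocapn_traps.evaluator` code:

```python
import re

def tile_quality(tile: dict) -> float:
    """Score one tile 0.0-1.0 on field completeness (0.25 per field)."""
    score = 0.0
    if len(tile.get("question", "")) > 10:
        score += 0.25
    if len(tile.get("answer", "")) > 20:
        score += 0.25
    if tile.get("domain") and tile["domain"] != "general":
        score += 0.25
    if tile.get("agent") and tile["agent"] != "unknown":
        score += 0.25
    return score

def score_run(tiles, min_tiles=1, max_tiles=10, pattern=None, output=""):
    """Combine the four weighted dimensions into one composite score."""
    required = ("question", "answer", "domain")
    count_ok = min_tiles <= len(tiles) <= max_tiles
    format_ok = bool(tiles) and all(all(f in t for f in required) for t in tiles)
    quality = sum(tile_quality(t) for t in tiles) / len(tiles) if tiles else 0.0
    pattern_ok = bool(re.search(pattern, output)) if pattern else True
    score = (0.30 * count_ok + 0.40 * quality
             + 0.20 * format_ok + 0.10 * pattern_ok)
    return {
        "score": round(score, 2),
        "passed": score >= 0.6 and count_ok and format_ok,
    }
```

Note how the pass decision is a hard AND on count and format: a run with superb tiles still fails if it produced too few of them.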

Architecture

cocapn_traps/
├── src/cocapn_traps/
│   ├── trap.py       # Trap dataclass + TrapRegistry
│   ├── evaluator.py  # Score runs, update statistics
│   ├── loader.py     # Parse markdown frontmatter
│   ├── runner.py     # Execute against agents
│   └── cli.py        # Command-line interface
└── tests/
    └── test_traps.py # 10 tests

Tests

cd cocapn-traps
PYTHONPATH=src pytest tests/ -v
# 10 passed in 0.07s

| Test | What |
|---|---|
| test_trap_creation | Build Trap objects |
| test_registry | Register, filter, query |
| test_load_from_file | Parse markdown frontmatter |
| test_load_from_directory | Load multiple traps |
| test_evaluate_good_run | Score high-quality tiles |
| test_evaluate_bad_run | Reject insufficient tiles |
| test_evaluate_pattern_match | Regex matching on output |
| test_update_stats | Running averages over multiple runs |
| test_run_trap_local | Local tile evaluation |
| test_run_trap_no_input | Graceful error handling |

Integration with cocapn-plato

from cocapn_plato.sdk.fleet import Fleet
from cocapn_traps.trap import TrapRegistry
from cocapn_traps.loader import load_from_directory
from cocapn_traps.runner import run_trap

fleet = Fleet("http://147.224.38.131:8847")
registry = TrapRegistry()

for trap in load_from_directory("./traps"):
    registry.register(trap)

# Run each registered trap, submit passing tiles to PLATO
for trap in registry.list():
    result = run_trap(trap, agent_url="http://agent:8080/run")
    if result["passed"]:
        for tile in result.get("tiles", []):
            fleet.submit(
                agent=trap.target,
                domain=tile["domain"],
                question=tile["question"],
                answer=tile["answer"],
            )

Design Decisions

| Decision | Rationale |
|---|---|
| Markdown frontmatter | Human-readable, version-controllable, no YAML dependency |
| No external parser | Simple key:value frontmatter, handles lists inline |
| Score dimensions | Separates "did it produce enough" from "was it good" |
| Running averages | Traps self-improve their stats over time |
| Zero dependencies | Same stdlib-only philosophy as rest of fleet |
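The running-averages decision amounts to an incremental mean, so no per-run history needs to be stored. A sketch of the update, using the same stat keys as the `trap.stats` dict shown earlier (the real `update_trap_stats` may differ):

```python
# Illustrative incremental-mean update for trap stats; a sketch,
# not the package's actual update_trap_stats implementation.

def update_stats(stats: dict, result: dict, n_tiles: int) -> dict:
    stats["runs"] += 1
    stats["successes"] += 1 if result["passed"] else 0
    stats["total_tiles"] += n_tiles
    # incremental mean: new_avg = old_avg + (x - old_avg) / n
    stats["avg_score"] += (result["score"] - stats["avg_score"]) / stats["runs"]
    return stats
```

After a 0.85 run and a 0.55 run, `avg_score` lands at 0.70 without either score being retained.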

Fleet

Built by CCC (🦀) for the Cocapn Fleet.

Part of the Cocapn Fleet ecosystem.
