Online deadlock-breaker — injects targeted engineering heuristics from rosclaw-know assets into stalling agents.
ROSClaw-How
Online deadlock-breaker for agents stuck on engineering-optimization tasks, with a feedback loop that lets the assets refine themselves over time.
Sister project: rosclaw-know (offline refinery that produces the
bridge_index.json + code_patterns/ assets this service serves at runtime,
and consumes the outcome JSONL this service exports to drive the next
publish cycle).
What it does
When an agent's verifier score plateaus or a physical safety symptom appears in its error log, this service injects a small, targeted hint into its next prompt. Three strategies, decided server-side:
| Strategy | Trigger | Returned payload |
|---|---|---|
| SAFETY | Error log mentions a safety symptom | Hard-coded constraint (~50–100 tokens) |
| FREE_EXPLORATION | First 3 iterations, or score improving | Empty string (keep exploring) |
| CATALYST | Score plateau / regression | Cross-domain analogy + diff (≤ 400 tokens) |
Runtime is pure rules + a single vector lookup — zero LLM calls. The
CATALYST path returns an injection_id so the agent can later report whether
the hint helped (POST /wiki/v1/prompt/feedback); the resulting outcomes drive
per-pattern uplift statistics and soft-deprecation of under-performing
patterns.
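The three-way decision in the table can be sketched as a pure function. This is only an illustration: the keyword list, the plateau test, and the assumption that iterations are 1-indexed are all hypothetical, and the real rules live in state_router.py.

```python
from typing import Sequence

# Hypothetical safety keywords; the service's actual symptom list is not shown here.
SAFETY_KEYWORDS = ("collision detected", "emergency stop")
WARMUP_ITERATIONS = 3  # "first 3 iterations" from the table, assumed 1-indexed


def choose_strategy(error_log: str,
                    previous_scores: Sequence[float],
                    current_iteration: int) -> str:
    """Return SAFETY, FREE_EXPLORATION, or CATALYST using rules only."""
    log = error_log.lower()
    # Safety symptoms win over everything else.
    if any(keyword in log for keyword in SAFETY_KEYWORDS):
        return "SAFETY"
    # Early iterations, or too little history to judge a trend: keep exploring.
    if current_iteration <= WARMUP_ITERATIONS or len(previous_scores) < 2:
        return "FREE_EXPLORATION"
    # Still improving: no hint needed.
    if previous_scores[-1] > previous_scores[-2]:
        return "FREE_EXPLORATION"
    # Plateau or regression: hand over to the vector lookup.
    return "CATALYST"
```

With the request shown in the API section (scores 0.42, 0.47, 0.47, 0.46 at iteration 4), this sketch returns CATALYST because the last score regressed.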
Feedback loop (the “push + learn” cycle)
┌─────────────────────────────────────────┐
│ rosclaw-know (offline refinery) │
│ awesome_fetcher → new raw corpus │
│ active_learning → autodraft → ingest │
│ feedback_distill.py → pattern_metrics │
│ bridge_reweighter → priority=-1 │
│ │
│ writes bridge_index.json (+priority) │
└───────────────────┬─────────────────────┘
│ asset publish
▼
┌─────────────────────────── rosclaw-how ───────────────────────────┐
│ asset_loader delta-sync bridge_index → SeekDB │
│ SemanticRouter skips clusters with priority < 0 │
│ │
│ POST /wiki/v1/prompt/build → snippet + injection_id │
│ POST /wiki/v1/prompt/feedback → post_score, delta_score │
│ GET /wiki/v1/stats → bucketed uplift / win_rate │
│ GET /wiki/v1/blind_spots → recurring Unknown_Error gaps │
│ GET /wiki/v1/outcomes/export → NDJSON stream for offline pipe │
│ POST /wiki/v1/admin/reload → hot-reload assets │
│ POST /wiki/v1/admin/promote → maturity gate (staging→prod) │
└────────────────────────────────────────────────────────────────────┘
│ NDJSON export
▼
┌─────────────────────────────────────────┐
│ rosclaw-know data/exports/*.jsonl │
│ distill_feedback.py → re-publish ↻ │
└─────────────────────────────────────────┘
Closed-loop validation: 6/6 stuck-rollout scenarios pass the replay benchmark
(scripts/replay_benchmark.py on the rosclaw-know side) — bad patterns get
priority=-1, vanish from the next CATALYST lookup, and good patterns keep
their slot.
Quick start
cd rosclaw-how
python -m venv .venv && source .venv/bin/activate
pip install -e ".[dev]"
cp .env.example .env
# (optional) symlink the assets from a finished rosclaw-know run:
ln -s ../../rosclaw-know/data/assets data/assets
# Run tests
pytest -q
# Start the server
python scripts/run_server.py
# → POST http://localhost:47820/wiki/v1/prompt/build
Deploying on a memory-constrained host
pyseekdb embedded mode boots a small OceanBase-like observer in-process
(~1.5–2 GB RAM after warmup, plus a 4 GB datafile reservation on disk).
If you are running on a host where another embedded SeekDB instance is
already up (e.g., the legacy rosclaw-wiki service), or where the box is
too small to fit a second observer, switch to server mode:
# Or use a remote cluster
SEEKDB_MODE=server
SEEKDB_HOST=10.0.0.5
SEEKDB_PORT=2881
SEEKDB_TENANT=sys # OceanBase tenant name
SEEKDB_USER=root
SEEKDB_PASSWORD=…
The SAFETY and FREE_EXPLORATION paths never touch SeekDB, so they remain
available even when the database is unreachable. The CATALYST path falls
back to InMemoryRouter (numpy cosine over bridge_index.json) when
ROSCLAW_HOW_ROUTER_BACKEND=inmemory is set — useful when SeekDB is down
or absent.
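To make the fallback concrete, here is a minimal sketch of a cosine nearest-neighbour lookup with a similarity floor. It is a pure-Python stand-in for the numpy version, and the index layout (cluster id mapped to an embedding vector) and the function names are assumptions for illustration, not the InMemoryRouter API.

```python
import math
from typing import Dict, Optional, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def find_nearest(query: Sequence[float],
                 index: Dict[str, Sequence[float]],
                 floor: float = 0.5) -> Optional[Tuple[str, float]]:
    """Return (cluster_id, similarity) for the best match, or None below the floor."""
    best = max(((cid, cosine(query, vec)) for cid, vec in index.items()),
               key=lambda pair: pair[1], default=None)
    if best is None or best[1] < floor:
        return None
    return best
```

The default floor of 0.5 mirrors the similarity_floor value that /healthz reports in the example further down.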
Auto-create database
On first boot, seekdb_client._ensure_database_exists runs an idempotent
CREATE DATABASE IF NOT EXISTS <SEEKDB_DATABASE> via pyseekdb.AdminClient
before opening the data-plane client. This means a fresh embedded SeekDB
(which only ships the test database by default) bootstraps cleanly with
SEEKDB_DATABASE=rosclaw_how without manual setup.
API
All endpoints share the /wiki/v1 prefix kept for compatibility with the
legacy rosclaw-wiki API.
POST /wiki/v1/prompt/build
Auth: X-API-Key.
Request:
{
"error_log": "ERROR: torque overflow on joint 2",
"previous_scores": [0.42, 0.47, 0.47, 0.46],
"current_iteration": 4
}
Response on a CATALYST hit (the injection_id is the handle for the
follow-up feedback call):
{
"prompt_snippet": "## 🔧 Engineering Heuristics from ROSClaw-How ...",
"injected": true,
"strategy": "CATALYST",
"symptom": "Oscillation_Divergence",
"matched_symptom": "Commanded velocity diverges to ±∞ ...",
"similarity": 0.5864,
"injection_id": "0fd3eb2bd37c461490c4f43def243512",
"pattern_id": "pattern_output_saturation_clamp",
"latency_ms": 199
}
When the matched cluster is in staging (priority=0), the response
includes is_staging: true so the agent knows the pattern has not yet
been promoted to production:
{
"strategy": "CATALYST",
"is_staging": true,
"pattern_id": "pattern_20260518_1bfb99e13c"
}
Production clusters (priority=1 or unset) omit the key entirely for
backward compatibility.
SAFETY and FREE_EXPLORATION responses omit injection_id / pattern_id.
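A client-side sketch of the build call, using only the standard library. The helper names are hypothetical; the URL, header, and field names come from the examples above. The request object is built separately from sending it so that it can be inspected or tested offline.

```python
import json
import urllib.request
from typing import Optional, Sequence, Tuple

BASE_URL = "http://localhost:47820"  # port taken from the quick start


def make_build_request(api_key: str, error_log: str,
                       previous_scores: Sequence[float],
                       current_iteration: int,
                       base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) the POST /wiki/v1/prompt/build request."""
    body = json.dumps({
        "error_log": error_log,
        "previous_scores": list(previous_scores),
        "current_iteration": current_iteration,
    }).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/wiki/v1/prompt/build",
        data=body,
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def read_build_response(payload: dict) -> Tuple[str, Optional[str], bool]:
    """Return (snippet, injection_id, is_staging), tolerating omitted keys."""
    return (payload.get("prompt_snippet", ""),
            payload.get("injection_id"),       # absent for SAFETY / FREE_EXPLORATION
            bool(payload.get("is_staging", False)))  # absent for production clusters
```

To send it, pass the request to urllib.request.urlopen and decode the JSON body. Keep the injection_id if one comes back, since the feedback call needs it.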
POST /wiki/v1/prompt/feedback
Auth: X-API-Key. Returns 204 No Content on success, 404 if the id is
unknown.
{
"injection_id": "0fd3eb2bd37c461490c4f43def243512",
"post_score": 0.83,
"iterations_to_resolve": 3,
"agent_notes": "anti-windup clamp fixed it"
}
The server computes delta_score = post_score - pre_score (where pre_score
is the last entry from the original previous_scores).
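An agent can predict the delta the server will record before sending feedback. The sketch below assumes the agent still holds the previous_scores it sent to the build call; both helper names are hypothetical.

```python
from typing import Sequence


def expected_delta(previous_scores: Sequence[float], post_score: float) -> float:
    """Mirror the server rule: post_score minus the last pre-injection score."""
    if not previous_scores:
        raise ValueError("previous_scores must not be empty")
    return post_score - previous_scores[-1]


def feedback_body(injection_id: str, post_score: float,
                  iterations_to_resolve: int, agent_notes: str = "") -> dict:
    """Assemble the JSON body for POST /wiki/v1/prompt/feedback."""
    return {
        "injection_id": injection_id,
        "post_score": post_score,
        "iterations_to_resolve": iterations_to_resolve,
        "agent_notes": agent_notes,
    }
```

For the running example, scores ending at 0.46 and a post_score of 0.83 give a delta of about 0.37.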
GET /wiki/v1/stats
Public, no auth. Aggregates finalised outcomes (those that have received feedback) per pattern_id, grouped by maturity bucket:
{
"staging": {
"pattern_20260518_1bfb99e13c": {
"n": 5,
"avg_uplift": 0.142,
"win_rate": 0.8,
"last_seen_iso": "2026-05-18T19:14:22+00:00"
}
},
"production": {
"pattern_output_saturation_clamp": {
"n": 8,
"avg_uplift": 0.157,
"win_rate": 0.875,
"last_seen_iso": "2026-05-18T19:14:22+00:00"
}
},
"demoted": {
"pattern_bad_habit": {
"n": 12,
"avg_uplift": -0.03,
"win_rate": 0.25,
"last_seen_iso": "2026-05-18T19:14:22+00:00"
}
},
"unbucketed": {}
}
win_rate = sum(delta_score > 0.05) / n.
The unbucketed catch-all holds pattern_ids whose owning cluster was
deleted or renamed since the outcome was recorded.
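The per-pattern numbers can be reproduced from a list of (pattern_id, delta_score) pairs. This is a minimal sketch of the aggregation, not the code in outcomes.py. It ignores maturity buckets and last_seen_iso, and the input shape is an assumption.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

WIN_THRESHOLD = 0.05  # a "win" is delta_score > 0.05, per the formula above


def aggregate(outcomes: Iterable[Tuple[str, float]]) -> Dict[str, dict]:
    """Group delta scores by pattern_id and compute n, avg_uplift, win_rate."""
    deltas: Dict[str, List[float]] = defaultdict(list)
    for pattern_id, delta in outcomes:
        deltas[pattern_id].append(delta)
    return {
        pattern_id: {
            "n": len(values),
            "avg_uplift": sum(values) / len(values),
            "win_rate": sum(d > WIN_THRESHOLD for d in values) / len(values),
        }
        for pattern_id, values in deltas.items()
    }
```

Note that a delta of exactly 0.05 does not count as a win, because the comparison is strict.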
GET /wiki/v1/outcomes/export
Auth: X-API-Key. Streams every outcome (including still-pending ones) as
newline-delimited JSON. Query params:
- since: ISO 8601 timestamp; only rows with ts >= since are emitted.
- limit: optional row cap (max 100 000).
curl -H "X-API-Key: $ROSCLAW_HOW_API_KEY" \
"http://127.0.0.1:47820/wiki/v1/outcomes/export?since=2026-05-17T00:00:00+00:00" \
-o outcomes.jsonl
The same content is also produced by scripts/export_outcomes.py, which is
a thin CLI wrapper around this endpoint (it used to read SeekDB directly
but deadlocked against the embedded server's process-exclusive lock —
fixed in Phase 4).
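On the consuming side, the exported file can be read line by line. The sketch below assumes that a still-pending row has no post_score (missing or null). That is a guess about the row schema, which is not documented here.

```python
import json
from typing import Iterable, List, Tuple


def split_outcomes(lines: Iterable[str]) -> Tuple[List[dict], List[dict]]:
    """Parse NDJSON lines into (finalised, pending) lists of rows."""
    finalised: List[dict] = []
    pending: List[dict] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines, e.g. a trailing newline
        row = json.loads(line)
        # Assumption: feedback has arrived if and only if post_score is set.
        if row.get("post_score") is not None:
            finalised.append(row)
        else:
            pending.append(row)
    return finalised, pending
```

Since an open file object yields lines, something like `split_outcomes(open("outcomes.jsonl", encoding="utf-8"))` works without loading the whole export into memory first.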
GET /healthz
Public, no auth. Operational snapshot:
{
"status": "ok",
"version": "0.1.0",
"auth_enabled": true,
"seekdb_mode": "embedded",
"router_backend": "seekdb",
"cluster_count": 349,
"embedding_dim": 384,
"bridge_index_mtime": "2026-05-18T18:28:04+00:00",
"similarity_floor": 0.5,
"blind_spot_count": 0
}
blind_spot_count is the number of Unknown_Error prefix buckets that
have crossed the recurrence threshold within the active sliding window
(see GET /wiki/v1/blind_spots below).
POST /wiki/v1/admin/reload
Auth required (X-API-Key). Re-reads bridge_index.json and
code_patterns/ into SeekDB without bouncing the server. Body is
optional:
curl -X POST http://127.0.0.1:47820/wiki/v1/admin/reload \
-H "X-API-Key: $ROSCLAW_HOW_API_KEYS" \
-H "Content-Type: application/json" \
-d '{}' # incremental (delta) reload
{"rebuild": true} drops both SeekDB collections first; default is an
idempotent incremental upsert. The loader fingerprints each cluster
(standard_name + sorted patterns + sorted keywords + canonical-JSON
analogies + priority) with SHA-256 — unchanged rows skip the
sentence-transformer encode call entirely. On a 350-cluster bundle this
turns a ~4-minute full reload into a ~20-second no-op when nothing has
changed.
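The fingerprinting idea can be sketched as follows. The key names keywords and analogies, and the layout of the stored fingerprints, are assumptions; only the list of hashed fields and the use of SHA-256 come from the description above.

```python
import hashlib
import json
from typing import Dict, List


def cluster_fingerprint(cluster: dict) -> str:
    """SHA-256 over the fields that should trigger a re-encode when they change."""
    canonical = {
        "standard_name": cluster.get("standard_name", ""),
        "patterns": sorted(cluster.get("associated_patterns", [])),
        "keywords": sorted(cluster.get("keywords", [])),  # assumed key name
        "analogies": cluster.get("analogies", []),        # assumed key name
        "priority": cluster.get("priority"),
    }
    # sort_keys + fixed separators give a canonical JSON serialisation.
    blob = json.dumps(canonical, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(blob.encode("utf-8")).hexdigest()


def plan_delta(stored: Dict[str, str],
               clusters: Dict[str, dict]) -> Dict[str, List[str]]:
    """Classify cluster ids by comparing fresh fingerprints with stored ones."""
    plan: Dict[str, List[str]] = {"added": [], "updated": [], "unchanged": [], "deleted": []}
    for cluster_id, cluster in clusters.items():
        fingerprint = cluster_fingerprint(cluster)
        if cluster_id not in stored:
            plan["added"].append(cluster_id)
        elif stored[cluster_id] != fingerprint:
            plan["updated"].append(cluster_id)
        else:
            plan["unchanged"].append(cluster_id)  # no encode call needed
    # Ids that are stored but no longer present in the bundle.
    plan["deleted"] = sorted(set(stored) - set(clusters))
    return plan
```

Only the added and updated ids would need the expensive embedding step, which is where the time saving on an unchanged bundle comes from.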
Rows whose IDs disappeared from the bridge — or whose priority flipped
to -1 (soft-deprecated) — are deleted from SeekDB. The response
exposes both the alive totals and the per-bucket counters so dashboards
can show "what just happened":
{
"symptoms": 349,
"patterns": 352,
"demoted_skipped": 3,
"symptoms_detail": {"added": 16, "updated": 0, "unchanged": 333, "deleted": 0},
"patterns_detail": {"added": 0, "updated": 0, "unchanged": 352, "deleted": 0},
"rebuild": false,
"duration_ms": 23900
}
After loading, the cached SemanticRouter is rebuilt synchronously so
/healthz immediately reports the fresh cluster_count / router_backend
(rather than a null window until the next CATALYST request).
POST /wiki/v1/admin/promote
Auth required (X-API-Key). Bump or set a cluster's maturity priority.
Body accepts exactly one of delta (relative change) or priority
(absolute set). The lookup key is the pattern_id (one of the *.md
files in code_patterns/); the endpoint walks bridge_index.json to
find the owning cluster whose associated_patterns list contains the
given pattern_id.
# Relative bump (capped to [-1, +1])
curl -X POST http://127.0.0.1:47820/wiki/v1/admin/promote \
-H "X-API-Key: $ROSCLAW_HOW_API_KEYS" \
-H "Content-Type: application/json" \
-d '{"pattern_id": "pattern_20260518_1bfb99e13c", "delta": 1}'
# Absolute set (also capped)
curl -X POST http://127.0.0.1:47820/wiki/v1/admin/promote \
-H "X-API-Key: $ROSCLAW_HOW_API_KEYS" \
-H "Content-Type: application/json" \
-d '{"pattern_id": "pattern_20260518_1bfb99e13c", "priority": 1}'
Response:
{
"pattern_id": "pattern_20260518_1bfb99e13c",
"cluster_id": "20260518_1bfb99e13c",
"old_priority": 0,
"new_priority": 1
}
On success, the endpoint:
- Atomically updates bridge_index.json
- Appends one JSONL row to data/audit_log.jsonl
- Re-upserts the cluster's metadata in SeekDB so the router sees the change immediately
- Invalidates the cached router
Priority semantics:
| Value | Meaning |
|---|---|
| -1 | Demoted: runtime skips (soft-deprecated) |
| 0 | Staging: runtime injects with is_staging=true |
| +1 | Production: normal, no flag |
| unset | Backward compat: treated as production |
Returns 404 when pattern_id is not found in any cluster, 422
when both/neither of delta/priority is provided.
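The validation and clamping rules can be sketched as one small function. Treating an unset priority as +1 when applying a relative bump is an assumption drawn from the "treated as production" row. The function name is hypothetical.

```python
from typing import Optional

MIN_PRIORITY, MAX_PRIORITY = -1, 1


def resolve_priority(old: Optional[int],
                     delta: Optional[int] = None,
                     priority: Optional[int] = None) -> int:
    """Return the new priority, capped to [-1, +1].

    Raises ValueError when both or neither of delta/priority is given,
    which is the case the endpoint reports as 422.
    """
    if (delta is None) == (priority is None):
        raise ValueError("provide exactly one of delta or priority")
    current = MAX_PRIORITY if old is None else old  # unset counts as production
    target = current + delta if delta is not None else priority
    return max(MIN_PRIORITY, min(MAX_PRIORITY, target))
```

For the staging example above, `resolve_priority(0, delta=1)` yields 1, matching the old_priority / new_priority pair in the response.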
GET /wiki/v1/blind_spots
Public, no auth. Sliding-window summary of recurring Unknown_Error
prefixes — i.e. errors the catalyst layer keeps seeing but has no
matching cluster for. This is the work-list for the rosclaw-know triage
queue: each entry corresponds to a pattern we should be teaching.
{
"window_seconds": 3600,
"threshold": 3,
"active": [
{
"prefix_hash": "1a3c…",
"count": 7,
"first_seen": "2026-05-18T19:18:00+00:00",
"last_seen": "2026-05-18T19:54:12+00:00",
"sample_excerpt": "RuntimeError: undocumented quirk in controller stage",
"is_blind_spot": true
}
],
"total_unique_prefixes": 4,
"total_events": 13
}
Each crossing event also appends one JSONL row to
data/blind_spots.jsonl (configurable via
ROSCLAW_HOW_BLIND_SPOTS_PATH). A prefix is only emitted once per
window — if it goes quiet for the window length and recurs later, the
next crossing produces a fresh row.
Tuning knobs (env vars, defaults shown):
| Variable | Default | Purpose |
|---|---|---|
| ROSCLAW_HOW_BLIND_SPOT_WINDOW | 3600 | sliding-window length in seconds |
| ROSCLAW_HOW_BLIND_SPOT_THRESHOLD | 3 | events needed to flag a prefix |
| ROSCLAW_HOW_BLIND_SPOTS_PATH | data/blind_spots.jsonl | persistent log |
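A sliding-window tracker with these two knobs could look roughly like the sketch below. It is not the code in blind_spots.py. The prefix length, the hashing scheme, and the class and method names are assumptions made for illustration.

```python
import hashlib
from collections import defaultdict, deque
from typing import Deque, Dict, Set


class BlindSpotTracker:
    """Flags an error prefix once it recurs `threshold` times inside the window."""

    def __init__(self, window_seconds: float = 3600.0, threshold: int = 3,
                 prefix_len: int = 80) -> None:
        self.window = window_seconds
        self.threshold = threshold
        self.prefix_len = prefix_len  # hypothetical: how much of the log is hashed
        self._events: Dict[str, Deque[float]] = defaultdict(deque)
        self._flagged: Set[str] = set()

    def record(self, error_log: str, now: float) -> bool:
        """Record one event; return True only when it causes a new crossing."""
        prefix = error_log[: self.prefix_len].encode("utf-8")
        key = hashlib.sha256(prefix).hexdigest()[:12]
        events = self._events[key]
        # Drop timestamps that have slid out of the window.
        while events and now - events[0] > self.window:
            events.popleft()
        if not events:
            # The prefix went quiet for a full window, so it may be flagged again.
            self._flagged.discard(key)
        events.append(now)
        if len(events) >= self.threshold and key not in self._flagged:
            self._flagged.add(key)
            return True  # the caller would append one JSONL row here
        return False
```

Passing the clock in as `now` keeps the tracker deterministic under test. In a running service the caller would supply something like `time.time()`.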
GET /ui
Public, no auth. Single-page operator dashboard. Vanilla HTML + JS, no
external CDN; polls /healthz, /wiki/v1/stats, and
/wiki/v1/blind_spots every 5 seconds and renders:
- Health KPIs: version, router backend, cluster count, embedding dim, similarity floor, bridge mtime, live blind-spot count.
- Pattern uplift table: sortable by bucket (staging / production / demoted / unbucketed), per-pattern n / avg_uplift / win_rate / last_seen with an inline bar for the uplift magnitude.
- Blind spots: current recurring Unknown_Error prefixes (only those past threshold), with their hash, last-seen timestamp, and a truncated sample excerpt for triage.
Useful as a smoke check during deployments and as a low-friction view into the feedback loop without spinning up a full Grafana stack.
Architecture
rosclaw-know (offline) rosclaw-how (online, this repo)
───────────────── ───────────────────────────────
Reads 6,097 wiki/*.md Reads SeekDB at runtime
Writes data/assets/bridge_index.json Loads assets at startup
data/assets/code_patterns/* Serves build / feedback / stats / export
Reads data/exports/*.jsonl Writes outcome rows on feedback
(closing the loop)
───────────────── ───────────────────────────────
▶ ▶ ▶ assets travel from know → how
◀ ◀ ◀ outcomes travel from how → know
Source layout:
src/rosclaw_how/
__init__.py
api.py FastAPI app: 9 endpoints
asset_loader.py Startup load + --rebuild; delta-sync with content-hash
auth.py API-key header check (single-tenant in v0.1)
blind_spots.py Sliding-window tracker for Unknown_Error prefixes
config.py Typed wrapper around .env
error_normalizer.py Pure regex: error_log → 10 standardized symptom labels
inmemory_router.py RAM-frugal numpy cosine fallback (no SeekDB needed)
outcomes.py injection_outcomes persistence + per-pattern aggregation
semantic_router.py SeekDB vector search + inspiration assembly + priority gate
seekdb_client.py pyseekdb wrapper; auto-creates database; embedded+server
state_router.py SAFETY / FREE_EXPLORATION / CATALYST classifier
Router backends
ROSCLAW_HOW_ROUTER_BACKEND chooses:
- auto (default): seekdb when datafile exists, else inmemory
- seekdb: explicit production path; raises on init failure
- inmemory: explicit RAM fallback; reads bridge_index.json directly
Both routers expose the same find_nearest() contract plus cluster_count
and embedding_dim properties (surfaced on /healthz).
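The selection rule for the three values can be sketched as below. The function name and the way datafile existence is passed in are hypothetical; only the three setting names and their meaning come from the text.

```python
def resolve_backend(setting: str, datafile_exists: bool) -> str:
    """Map a ROSCLAW_HOW_ROUTER_BACKEND value to a concrete backend name."""
    value = (setting or "auto").strip().lower()
    if value == "auto":
        # Prefer SeekDB when its datafile is present, otherwise fall back to RAM.
        return "seekdb" if datafile_exists else "inmemory"
    if value in ("seekdb", "inmemory"):
        return value  # explicit choice: no silent fallback
    raise ValueError(f"unknown router backend: {setting!r}")
```

A caller might invoke it as `resolve_backend(os.environ.get("ROSCLAW_HOW_ROUTER_BACKEND", "auto"), datafile_path.exists())`, where datafile_path is whatever path the deployment uses.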
Runtime priority gate
When rosclaw-know's bridge_reweighter decides a cluster has been hurting
agents (negative aggregate uplift with sufficient n), it writes
"priority": -1 into the cluster entry of bridge_index.json. After the
next asset publish:
- asset_loader carries the field into symptom_index metadata.
- SemanticRouter.find_nearest over-fetches top-3K results and walks them in similarity order, skipping any cluster with priority < 0.
- InMemoryRouter.find_nearest applies the same filter against its in-RAM matrix.
So a soft-deprecated cluster vanishes from CATALYST hits on the next asset-loader cycle without any agent code change.
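The over-fetch-then-filter step can be illustrated on an already ranked hit list. The sketch reads "top-3K" as three times the requested k, which is an interpretation. The hit layout (dicts with an optional priority key) is likewise assumed.

```python
from typing import List

OVERFETCH_FACTOR = 3  # assumed meaning of "top-3K": fetch 3 * k candidates


def apply_priority_gate(ranked_hits: List[dict], k: int = 1) -> List[dict]:
    """Keep the best k hits whose cluster is not soft-deprecated.

    ranked_hits must already be sorted by descending similarity.
    """
    candidates = ranked_hits[: k * OVERFETCH_FACTOR]
    allowed = []
    for hit in candidates:
        priority = hit.get("priority")
        if priority is not None and priority < 0:
            continue  # demoted cluster: skip and try the next-best candidate
        allowed.append(hit)
        if len(allowed) == k:
            break
    return allowed
```

Over-fetching matters because filtering after a plain top-k query could return nothing even when a good non-demoted match sits just below the demoted one.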
What this replaces
The previous rosclaw-wiki cloud API hosted 17 declarative-knowledge endpoints
(search, judgments, code generation, etc.). Empirically, agents in
Frontier-Engineering's optimization loop regressed ~20% when they pulled from
those endpoints — they got encyclopedic context when they needed a poke.
rosclaw-how is the focused replacement: nine endpoints, three strategies,
≤400 tokens per CATALYST snippet, no LLM in the hot path, and a feedback
loop that keeps the asset bundle honest.
Closed-loop verification
Two harnesses, each tuned to a different cost/coverage budget:
- scripts/verify_how_seekdb.py: strict 4-case verifier that pre-flights /healthz (refuses non-seekdb backends), then asserts each case is CATALYST with similarity ≥ similarity_floor and latency_ms < 1500. Writes data/benchmarks/how_ab_seekdb/summary.json.
- scripts/verify_how_lite.py: A/B against DeepSeek for 4 stuck cases (control: FREE_EXPLORATION; treatment: CATALYST). Used by the rosclaw-know side's replay_benchmark.py to drive 50+ synthetic rollouts end-to-end through build → inject → feedback → distill → re-publish.
export ROSCLAW_HOW_API_KEY=rw_sk_dev_local
# A/B against the deployed service
python scripts/verify_how_seekdb.py
# Faster smoke against DeepSeek, no Frontier-Engineering setup needed
python scripts/verify_how_lite.py --no-agent
# Heavy: hits the real Frontier-Engineering eval (needs that repo)
python scripts/verify_how.py --iterations 500
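The checks that the strict verifier is described as making can be written as two small functions over already decoded JSON. This is a sketch of the assertions only, not the contents of scripts/verify_how_seekdb.py. The function names are hypothetical.

```python
from typing import List

MAX_LATENCY_MS = 1500  # latency bound quoted for the strict verifier


def preflight(health: dict) -> None:
    """Refuse to run unless /healthz reports the seekdb backend."""
    backend = health.get("router_backend")
    if backend != "seekdb":
        raise RuntimeError(f"expected router_backend 'seekdb', got {backend!r}")


def case_failures(response: dict, similarity_floor: float) -> List[str]:
    """Return the names of the checks a /prompt/build response fails."""
    failures = []
    if response.get("strategy") != "CATALYST":
        failures.append("strategy")
    if response.get("similarity", 0.0) < similarity_floor:
        failures.append("similarity")
    # A missing latency_ms is treated as a failure rather than a pass.
    if response.get("latency_ms", MAX_LATENCY_MS) >= MAX_LATENCY_MS:
        failures.append("latency_ms")
    return failures
```

Applied to the CATALYST example response shown earlier (similarity 0.5864, latency 199 ms) with a floor of 0.5, `case_failures` returns an empty list.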