
Stop/ship/escalate signal for multi-agent conversations. Measures diminishing returns, not confidence. Includes RPO (Renderable Prompt Object) for structured prompt IR.


Diminishing Returns

Two ideas enter. One decision leaves.

ASCII fallback
   TWO IDEAS ENTER
        │
        ▼
    ┌─────────┐
    │  DR PIT  │   (Thunderdome mode)
    └────┬────┘
         ▼
   ONE DECISION LEAVES

A small utility for measuring diminishing returns in multi-agent / multi-LLM conversations.

┌──────────────────────────────────────────────────────────────┐
│                     DR Scoring Pipeline                       │
│                                                              │
│  Transcript                                                  │
│  ┌────────┐  ┌────────┐  ┌────────┐                         │
│  │Round 1 │  │Round 2 │  │Round 3 │  ...                    │
│  │claims  │  │claims  │  │claims  │                          │
│  └───┬────┘  └───┬────┘  └───┬────┘                         │
│      │           │           │                               │
│      ▼           ▼           ▼                               │
│  ┌─────────────────────────────────┐                         │
│  │  Novelty detection (L0 + L1)   │  new claims / peak      │
│  │  Action readiness scoring      │  specific + unblocked?   │
│  │  K-consecutive stopping rule   │  k=2 low rounds → done   │
│  └────────────────┬────────────────┘                         │
│                   ▼                                          │
│         ┌─────────────────┐                                  │
│         │  SHIP           │  Converged + action-ready        │
│         │  CONTINUE       │  Still producing novelty         │
│         │  ESCALATE       │  Blocked or stalled              │
│         └─────────────────┘                                  │
│                   │                                          │
│                   ▼  (optional)                               │
│         ┌─────────────────┐     ┌────────────────┐           │
│         │  DR attestation │────▶│  RPO warm_state │          │
│         │  (trust signal) │     │  (next round)   │          │
│         └─────────────────┘     └────────────────┘           │
└──────────────────────────────────────────────────────────────┘
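The stopping rule and the three outcomes in the diagram can be sketched roughly as below. This is a minimal illustration, not the package's code: the function names and both thresholds (`LOW_NOVELTY`, `HIGH_READINESS`) are assumptions, and only `k=2` comes from the diagram.

```python
from typing import Sequence

LOW_NOVELTY = 0.2      # assumed threshold, not taken from the package
HIGH_READINESS = 0.7   # assumed threshold, not taken from the package


def consecutive_low_rounds(novelty_by_round: Sequence[float],
                           low: float = LOW_NOVELTY) -> int:
    """Count trailing rounds whose novelty rate is at or below `low`."""
    count = 0
    for rate in reversed(novelty_by_round):
        if rate > low:
            break
        count += 1
    return count


def stop_signal(novelty_by_round: Sequence[float],
                readiness: float,
                k: int = 2) -> str:
    """Map novelty history plus readiness to CONTINUE, SHIP, or ESCALATE."""
    converged = consecutive_low_rounds(novelty_by_round) >= k
    if not converged:
        return "CONTINUE"   # still producing novelty
    if readiness >= HIGH_READINESS:
        return "SHIP"       # converged and action-ready
    return "ESCALATE"       # stalled: no novelty, but not ready to act
```

For example, `stop_signal([1.0, 0.5, 0.1, 0.0], readiness=0.85)` yields `"SHIP"` under these assumed thresholds, because the last two rounds are both low-novelty.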

Calibration finding: DR's recall scales with conversation structure — 0% on freeform debate (IQ2), 4% on Reddit threads (CMV), 32% on group deliberation (DeliData), 100% on structured agent loops (livefire). It's tuned for agents, not humans arguing. See docs/calibration-results.md.

Start here

Docs:

Examples:

  • examples/ — small clean-room transcripts to understand the output

This is not "confidence." It's a stop/ship signal: are we still producing novel, decision-relevant information?


📏 What it measures (v0.2 draft implementation)

A weighted score plus a stop recommendation from observable transcript signals. Currently implemented:

  • Novelty rate (L0 + L1): net-new claims after normalization plus Jaccard fuzzy matching for paraphrase-lite repeats (implemented, no embeddings).
  • 🛠️ Action readiness: weighted readiness from next-action specificity, open-question trend, and blocker detection (implemented).
  • Decision matrix stop signal: CONTINUE | SHIP | ESCALATE from novelty + readiness (implemented).
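To make the L0 + L1 idea concrete, here is a rough sketch of normalization followed by token-set Jaccard matching. The function names, the normalization rules, and the `0.6` similarity threshold are assumptions for illustration. The sketch divides new claims by the number of claims in the round, which may differ from how the package normalizes (the pipeline diagram mentions "new claims / peak").

```python
import re
from typing import List, Sequence


def normalize(claim: str) -> str:
    """L0: lowercase, replace punctuation with spaces, collapse whitespace."""
    return " ".join(re.sub(r"[^a-z0-9\s]", " ", claim.lower()).split())


def jaccard(a: str, b: str) -> float:
    """Token-set Jaccard similarity between two normalized claims."""
    ta, tb = set(a.split()), set(b.split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def novelty_by_round(rounds: Sequence[Sequence[str]],
                     threshold: float = 0.6) -> List[float]:
    """Return, per round, the fraction of claims not matched by earlier ones."""
    seen: List[str] = []
    rates: List[float] = []
    for claims in rounds:
        new = 0
        for claim in claims:
            norm = normalize(claim)
            # L0 catches exact repeats; L1 catches near-repeats via Jaccard.
            repeat = any(norm == s or jaccard(norm, s) >= threshold for s in seen)
            if not repeat:
                new += 1
                seen.append(norm)
        rates.append(new / len(claims) if claims else 0.0)
    return rates
```

With this sketch, "We should set the TTL to 60 seconds." counts as a repeat of "Set TTL to 60 seconds." (5 shared tokens out of 8 total, similarity 0.625), while a claim with no token overlap counts as new.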

Planned next:

  • 🧠 Semantic convergence: are two agents saying the same thing? (requires embeddings)
  • 🧱 Structural agreement: are agents modifying each other or just rephrasing?
  • Novelty L2 embeddings: semantic novelty matching from docs/novelty-and-readiness-spec.md (documented TODO, not implemented).

Design note: a conversation can converge on the wrong answer. DR measures diminishing returns, not truth.

🧭 Why

Teams waste cycles in "one more round" loops.

A diminishing-returns meter nudges you toward the next correct move:

  • name the decision
  • assign the next action
  • run verification (tests, reproduce steps, check evidence)

🚀 Quick start

pip install diminishing-returns              # from PyPI (pre-release)
pip install diminishing-returns==0.1.0a1     # pin version

# or from source
git clone https://github.com/Pro777/diminishing-returns.git
cd diminishing-returns
python -m pip install -e .

# CLI
dr score transcript.json
dr stop transcript.json
dr watch trace.jsonl          # live JSONL tailing
dr attest "cache decision" transcript.json
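
If you want to call the CLI from an agent loop, a thin wrapper along these lines may be enough. It assumes `dr score` writes its JSON object to stdout and exits with status 0 on success; `score_transcript` and its `runner` parameter are hypothetical names, not part of the package.

```python
import json
import subprocess
from typing import Any, Callable, Dict


def score_transcript(path: str,
                     runner: Callable[..., Any] = subprocess.run) -> Dict[str, Any]:
    """Run `dr score <path>` and return the parsed JSON object.

    `runner` is injectable so the wrapper can be tested without the CLI.
    """
    result = runner(
        ["dr", "score", path],
        capture_output=True,  # collect stdout instead of printing it
        text=True,            # decode stdout as str
        check=True,           # raise CalledProcessError on a non-zero exit
    )
    return json.loads(result.stdout)
```

Passing a fake `runner` keeps unit tests independent of the installed CLI.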

🧾 Output

dr score prints a single JSON object with these top-level keys:

  1. score
  2. components
  3. novelty_by_round
  4. readiness_by_round
  5. stop_recommendation
  6. hint
  7. semantic_by_round

dr stop prints a compact stop/ship verdict for loops:

Signal: SHIP
Why:
- Novelty is LOW (k-consecutive low rounds: 2).
- Action readiness is HIGH.
- Classifications: novelty=LOW, readiness=HIGH.
Next action:
- Ship the decision and run verification (tests, repro, or evidence checks).

Example dr score output:

{
  "score": 1.0,
  "components": {
    "semantic_similarity": null,
    "novelty_rate": 0.0,
    "novelty_rate_L0": 0.0,
    "novelty_rate_L1": 0.0,
    "structural_agreement": null,
    "action_readiness": 0.85,
    "action_readiness_detail": {
      "next_actions_score": 0.7,
      "open_questions_score": 1.0,
      "blocker_score": 1.0
    }
  },
  "novelty_by_round": [
    {
      "round": 1,
      "claims": 4,
      "new_claims": 4,
      "new_claims_L0": 4,
      "new_claims_L1": 4,
      "novelty_rate": 1.0,
      "novelty_rate_L0": 1.0,
      "novelty_rate_L1": 1.0
    }
  ],
  "readiness_by_round": [
    {
      "round": 1,
      "action_readiness": 0.85,
      "readiness_classification": "HIGH",
      "next_actions_score": 0.7,
      "open_questions_score": 1.0,
      "blocker_score": 1.0
    }
  ],
  "stop_recommendation": {
    "signal": "SHIP",
    "novelty_classification": "LOW",
    "readiness_classification": "HIGH",
    "k_consecutive_low_novelty": 2,
    "rationale": "Novelty is LOW (k-consecutive low rounds: 2). Action readiness is HIGH."
  },
  "hint": "Converged. Ship the decision and verify."
}

Note: novelty_by_round and readiness_by_round are arrays with one entry per transcript round. The example above is abbreviated: it shows only round 1 of each array and omits semantic_by_round.
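
A consumer of this JSON might route on the stop recommendation roughly as follows. The key names are taken from the example output; the helper names `summarize` and `should_run_another_round` are hypothetical.

```python
from typing import Any, Dict, Optional, Tuple


def summarize(result: Dict[str, Any]) -> Tuple[str, Optional[int], int]:
    """Return (signal, last scored round, low-novelty streak) from a parsed result."""
    rec = result["stop_recommendation"]
    rounds = result.get("novelty_by_round", [])
    last_round = rounds[-1]["round"] if rounds else None
    return rec["signal"], last_round, rec.get("k_consecutive_low_novelty", 0)


def should_run_another_round(result: Dict[str, Any]) -> bool:
    """Only CONTINUE justifies another round; SHIP and ESCALATE both end the loop."""
    return result["stop_recommendation"]["signal"] == "CONTINUE"
```

Treating ESCALATE as a loop exit is a design choice in this sketch: the conversation stops, but the decision goes to a human or another process instead of shipping.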

🧪 Examples

Each example includes a diminishing_returns_note.recommended_stop_round to make expected behavior explicit.

🌐 DR as Protocol

DR started as a scoring library. It's becoming a trust signal for inter-agent communication.

When Agent A sends a recommendation to Agent B, a DR attestation tells B: how much scrutiny did this receive?

Three trust tiers: local (markdown, trusted agents), federated (signed, partially trusted), internet (full evidence audit, untrusted).
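
One way to picture the three tiers is as a receiver-side acceptance check. Everything in this sketch is hypothetical: the `Attestation` fields, the `acceptable` function, and the exact checks per tier are illustrative assumptions, not the package's attestation format.

```python
from dataclasses import dataclass
from enum import Enum


class TrustTier(Enum):
    LOCAL = "local"          # markdown, trusted agents
    FEDERATED = "federated"  # signed, partially trusted
    INTERNET = "internet"    # full evidence audit, untrusted


@dataclass(frozen=True)
class Attestation:
    """Hypothetical shape of a DR attestation passed from Agent A to Agent B."""
    decision: str
    signal: str                     # e.g. "SHIP"
    tier: TrustTier
    signed: bool = False
    evidence_audited: bool = False


def acceptable(att: Attestation) -> bool:
    """Apply the per-tier requirement before trusting the recommendation."""
    if att.tier is TrustTier.LOCAL:
        return True                  # trusted agents: accept as-is
    if att.tier is TrustTier.FEDERATED:
        return att.signed            # partially trusted: require a signature
    return att.evidence_audited      # untrusted: require a full evidence audit
```

The point of the sketch is only that scrutiny requirements grow as trust in the sender shrinks.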

⚠️ Status and Limitations

This project is pre-release (v0.1.0a1). It works, but carries honest caveats:

  • L0 + L1 novelty and readiness are implemented. semantic_similarity and structural_agreement still return null (deterministic-first; embeddings planned for v0.2).
  • No external dependencies, by design: no embeddings, no NLP, no ML. The v0.1 scorer is deliberately simple.
  • Calibrated against 3 external corpora (IQ2, ChangeMyView, DeliData) plus 12 livefire scenarios. Domain boundary is validated: structured agent loops are the sweet spot. See docs/calibration-results.md.
  • 67/67 tests passing, including cross-repo integration with RPO (12/12 e2e).

For a deeper critique, see docs/devils-advocate.md.

📎 References (receipts)

If you want the nerdy provenance: see

License

MIT.

