governed-rank

Governed reranking for any domain — steer ranked lists toward policy objectives without breaking accuracy.

The Problem

You have a ranked list (search results, recommendations, content feed) and a policy objective (reduce toxicity, increase fairness, promote margin). Naively injecting policy scores breaks your base ranker's accuracy. You need a principled way to steer without regressing.

Install

pip install governed-rank

Quick Start

from mosaic import govern

result = govern(
    base_scores={"doc1": 0.9, "doc2": 0.8, "doc3": 0.7, "doc4": 0.6, "doc5": 0.5},
    steering_scores={"doc1": -0.5, "doc2": 0.3, "doc3": 0.8, "doc4": 0.1, "doc5": 0.6},
    budget=0.3,
)

print(result.ranked_items)   # reranked order
print(result.receipts)       # per-item audit trail

The budget parameter controls how much of the original ordering is protected. A value of 0.3 locks the 30% of base ordering decisions in which the base ranker is most confident, so steering can only move items where the base ranker is less sure.
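
To make the budget idea concrete, here is a minimal sketch of one way such locking could work. It is not the library's implementation: it assumes confidence is approximated by the score gap between adjacent items in the base ranking, and that the locked count is rounded up. The helper name protected_edges and the toy scores are hypothetical.

```python
import math

def protected_edges(base_scores, budget):
    """Return the adjacent (higher, lower) pairs with the largest base-score gaps."""
    order = sorted(base_scores, key=base_scores.get, reverse=True)
    edges = [(order[i], order[i + 1]) for i in range(len(order) - 1)]
    # Assumption: a larger gap means the base ranker is more confident in that ordering.
    by_gap = sorted(
        edges,
        key=lambda e: base_scores[e[0]] - base_scores[e[1]],
        reverse=True,
    )
    # Assumption: round the locked count up so a small budget still locks one edge.
    n_locked = math.ceil(budget * len(edges))
    return by_gap[:n_locked]

scores = {"a": 0.95, "b": 0.60, "c": 0.58, "d": 0.30, "e": 0.29}
locked = protected_edges(scores, budget=0.3)
print(locked)
```

With these toy scores the pairs (a, b) and (c, d) are locked, while the near-ties b/c and d/e remain free for steering to reorder.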

Domain Examples

Content Moderation — demote toxic content without hurting engagement

result = govern(
    base_scores=engagement_scores,       # relevance / predicted engagement
    steering_scores=toxicity_penalties,   # negative = toxic, from your classifier
    budget=0.3,
)
# Toxic items drop in ranking. Engagement-critical ordering preserved.
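
One way to build the toxicity_penalties mapping used above is to negate classifier probabilities that exceed a threshold. This is only a sketch: the to_penalties helper, the 0.5 threshold, and the sample probabilities are hypothetical, and the scale that works best for your data may differ.

```python
def to_penalties(toxicity_prob, threshold=0.5):
    """Map classifier probabilities to steering scores (negative = toxic)."""
    return {
        doc: -p if p >= threshold else 0.0  # leave low-risk items unsteered
        for doc, p in toxicity_prob.items()
    }

# Hypothetical classifier output: probability that each document is toxic.
toxicity_prob = {"doc1": 0.92, "doc2": 0.10, "doc3": 0.55}
toxicity_penalties = to_penalties(toxicity_prob)
print(toxicity_penalties)
```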

Fairness — boost underrepresented groups without sacrificing quality

result = govern(
    base_scores=quality_scores,          # hiring model / credit scores / quality
    steering_scores=fairness_boosts,     # positive for underrepresented candidates
    budget=0.3,
)
# Fair reranking with auditable receipts. Adverse impact ratio 0.963 on COMPAS.
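
The adverse impact ratio quoted above is conventionally the selection rate of the protected group divided by that of the reference group. The sketch below computes it on made-up data; the function name and the toy labels are hypothetical, and this README does not state how selection was defined for the COMPAS figure.

```python
def adverse_impact_ratio(selected, groups, protected, reference):
    """Selection rate of the protected group divided by that of the reference group."""
    def rate(name):
        picks = [s for s, g in zip(selected, groups) if g == name]
        return sum(picks) / len(picks)
    return rate(protected) / rate(reference)

# Toy data: 1 = selected (e.g. ranked in the top-k), 0 = not selected.
selected = [1, 1, 1, 1, 0, 1, 1, 1, 0, 0]
groups = ["A"] * 5 + ["B"] * 5
air = adverse_impact_ratio(selected, groups, protected="B", reference="A")
print(round(air, 3))
```

Here group A is selected at a rate of 0.8 and group B at 0.6, so the ratio is 0.6 / 0.8 = 0.75. A value close to 1.0 indicates similar selection rates across groups.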

RAG Safety — steer retrieval toward grounded, policy-safe documents

result = govern(
    base_scores=retrieval_scores,        # embedding similarity from your vector DB
    steering_scores=groundedness_scores,  # factuality / policy compliance signal
    budget=0.5,
)
# Grounded docs promoted. Top retrieval results still relevant.

How It Works

Three steps, fully automatic:

1. Orthogonalize    Remove the component of your policy signal that correlates
                    with the base ranker (so steering can't accidentally hurt accuracy)

2. Protect edges    Lock the most confident base ordering decisions (controlled by budget)

3. Project          Isotonic regression on the remaining items — maximize policy
                    effect while respecting constraints
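
Step 1 can be illustrated with a plain least-squares residual. This is a minimal sketch, not the code in mosaic.orthogonalization: it assumes both score vectors are mean-centered before the component of the steering signal along the base scores is removed, and the orthogonalize helper is hypothetical.

```python
def orthogonalize(base, steering):
    """Remove from `steering` the component that is linearly aligned with `base`."""
    n = len(base)
    mean_b = sum(base) / n
    mean_s = sum(steering) / n
    b = [x - mean_b for x in base]       # centered base scores
    s = [x - mean_s for x in steering]   # centered steering scores
    bb = sum(x * x for x in b)
    if bb == 0.0:
        return s  # constant base scores: nothing to remove
    coef = sum(x * y for x, y in zip(b, s)) / bb
    return [y - coef * x for x, y in zip(b, s)]

# Scores from the Quick Start example, in doc1..doc5 order.
base = [0.9, 0.8, 0.7, 0.6, 0.5]
steering = [-0.5, 0.3, 0.8, 0.1, 0.6]
residual = orthogonalize(base, steering)
print([round(r, 2) for r in residual])
```

The residual has zero covariance with the base scores, so adding it to them cannot systematically push items up or down along the direction the base ranker already uses.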

Mathematically: the steering signal is projected onto the orthogonal complement (the null space) of the base score direction, and a constrained isotonic projection then enforces the protected ordering decisions. The result is Pareto-optimal: you cannot get more policy effect without giving up more accuracy.
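
For readers unfamiliar with isotonic projection, the following is a minimal sketch of the classic pool-adjacent-violators (PAV) algorithm. It fits a non-decreasing sequence over every adjacent pair, whereas the library's constrained variant enforces order only on protected edges; that variant is not reproduced here, and the function name is hypothetical.

```python
def pav_non_decreasing(values):
    """Least-squares non-decreasing fit via pool-adjacent-violators."""
    blocks = []  # each block is [sum, count]; the block mean is sum / count
    for v in values:
        blocks.append([v, 1])
        # Merge backwards while the previous block's mean exceeds the last one's.
        while (
            len(blocks) > 1
            and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fitted = []
    for s, c in blocks:
        fitted.extend([s / c] * c)  # every item in a block gets the block mean
    return fitted

print(pav_non_decreasing([1, 3, 2, 4]))
```

The violating pair (3, 2) is pooled to its mean, 2.5, which is the closest monotone sequence in the least-squares sense.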

Full API — MOSAICScorer

For production pipelines with calibrated confidence, moment activation, satiation, and per-item receipts:

from mosaic import MOSAICScorer, MOSAICConfig, CalibrationResult

calibration = CalibrationResult.load("models/gap_calibration.json")

scorer = MOSAICScorer(
    moment_affinities=moment2vec,        # (n_items, K) affinity matrix
    calibration=calibration,
    config=MOSAICConfig(
        lambda_m=0.03,
        rho=0.90,
        protection_mode="budget",
        budget_pct=0.30,
    ),
)

result = scorer.rank(
    candidates=[1, 2, 3, 4, 5],
    base_scores={1: 0.9, 2: 0.8, 3: 0.7, 4: 0.6, 5: 0.5},
    activation_p=activation_probabilities,
    activation_confidence="high",
)

print(result.ranked_items)
print(result.receipts)           # MOSAICReceipt per item
print(result.n_protected_edges)

Discovery Engine

Find which policies are naturally aligned with your users before deploying:

from mosaic.discovery import DiscoveryEngine

engine = DiscoveryEngine()
report = engine.discover(sessions, catalog)

for opp in report.top_opportunities(5):
    print(f"{opp.category}: {opp.preference_lift:.1f}x lift — {opp.action.value}")

Validated On

17 datasets across 6 domains:

Domain                Datasets                                                                   Key Metric
Recommendations       Ta Feng, Instacart, RetailRocket, Criteo, MovieLens, Amazon Reviews, Yelp  0.890 stability @ 0.344 exposure
Fairness              COMPAS, Adult Income, German Credit                                        adverse_impact_ratio = 0.963
Healthcare            MIMIC-IV, SynPUF                                                           71.6% HIGH tier, 5.0x NMF lift
Content / NLP         AG News, BBC News, Mind News                                               cross-domain discovery
Fraud                 IEEE-CIS Fraud                                                             policy-steered detection
Cookieless Targeting  RetailRocket, Criteo                                                       4.65x CVR lift

Core Modules

Module                      Purpose
mosaic.govern               Simple govern() entry point
mosaic.mosaic_scorer        Full pipeline with moments, calibration, receipts
mosaic.orthogonalization    Score-space interference removal
mosaic.gap_calibration      Gap → confidence mapping, edge protection
mosaic.isotonic_projection  Constrained isotonic regression (PAV)
mosaic.discovery            Objective discovery from behavioral data

License

Apache 2.0 — see LICENSE.
