Multi-agent collaboration through pairwise comparisons

Arbitron ⚖️

Arbitron is an agentic pairwise comparison engine. Multiple jurors, each with a unique value system, evaluate items head-to-head and produce a set of pairwise comparisons from which item ranks and weights can be derived.

  • Why pairwise? It's easier to compare two items than to assign absolute scores.
  • Why multi-juror? Different models with different perspectives (instructions) lead to more balanced, less biased outcomes.
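To make "derive ranks and weights" concrete, here is a minimal sketch of one common way to turn pairwise outcomes into weights: a Bradley-Terry fit using the standard iterative (MM) update. This is not Arbitron's own API or necessarily the method it uses. The `(winner, loser)` tuple format, the `bradley_terry` function, and the sample outcomes are all assumptions made for illustration.

```python
from collections import defaultdict


def bradley_terry(comparisons, iterations=100):
    """Estimate normalized item weights from (winner, loser) pairs.

    Hypothetical helper, not part of Arbitron. Uses the classic
    minorization-maximization update for the Bradley-Terry model.
    """
    if not comparisons:
        return {}

    items = sorted({item for pair in comparisons for item in pair})
    wins = defaultdict(int)   # total wins per item
    games = defaultdict(int)  # how often each unordered pair met
    for winner, loser in comparisons:
        wins[winner] += 1
        games[frozenset((winner, loser))] += 1

    weights = {item: 1.0 for item in items}
    for _ in range(iterations):
        updated = {}
        for i in items:
            denom = 0.0
            for j in items:
                n = games.get(frozenset((i, j)), 0)
                if j != i and n:
                    denom += n / (weights[i] + weights[j])
            # Items that never played keep their current weight.
            updated[i] = wins[i] / denom if denom else weights[i]
        total = sum(updated.values())
        weights = {item: w / total for item, w in updated.items()}
    return weights


# Made-up outcomes, purely illustrative: each tuple is (winner, loser).
comparisons = [
    ("arrival", "interstellar"),
    ("arrival", "inception"),
    ("interstellar", "inception"),
    ("interstellar", "arrival"),
    ("arrival", "inception"),
    ("inception", "interstellar"),
]

weights = bradley_terry(comparisons)
ranking = sorted(weights, key=weights.get, reverse=True)
print(ranking)
```

The weights sum to 1, so they can be read as relative shares, and sorting by weight gives the ranking. Simpler aggregations such as raw win counts would also work when every pair is compared equally often.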

✨ Features

  • 🎯 Arbitrary Sets. Evaluate text, code, products, ideas
  • 🤖 Customizable Jurors. Specify custom instructions, tools, providers
  • 🛡️ Bias Reduction. Ensemble decision-making
  • 🧩 Remixable. Join data with human labels and apply personalized heuristics
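As an illustration of ensemble decision-making, the sketch below combines several jurors' verdicts on each pair by simple majority vote. The source does not say how Arbitron itself aggregates jurors, so treat this as one possible rule. The vote tuple layout, the `majority_winners` function, and the juror names other than "SciFi Purist" are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical votes: (juror, item_a, item_b, winner).
votes = [
    ("SciFi Purist", "arrival", "interstellar", "interstellar"),
    ("Casual Viewer", "arrival", "interstellar", "arrival"),
    ("Film Composer", "arrival", "interstellar", "interstellar"),
    ("SciFi Purist", "arrival", "inception", "arrival"),
    ("Casual Viewer", "arrival", "inception", "inception"),
]


def majority_winners(votes):
    """Return {(item_a, item_b): winner or None} with pairs in sorted order.

    A pair maps to None when the top two vote counts are tied.
    """
    tallies = defaultdict(Counter)
    for _juror, a, b, winner in votes:
        tallies[frozenset((a, b))][winner] += 1

    results = {}
    for pair, counter in tallies.items():
        (top, top_count), *rest = counter.most_common()
        tied = bool(rest) and rest[0][1] == top_count
        results[tuple(sorted(pair))] = None if tied else top
    return results


results = majority_winners(votes)
print(results)
```

A single juror's idiosyncratic preference is outvoted when the others disagree, and ties are surfaced explicitly rather than resolved arbitrarily.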

🚀 Quickstart

Running your first Arbitron "contest" is easy!

pip install arbitron

Set up your favorite LLM provider's API key in the environment (e.g., OPENAI_API_KEY), then run the following code.

from arbitron import Competition, Item, Juror

# The items to compare head-to-head.
items = [
    Item(id="arrival"),
    Item(id="interstellar"),
    Item(id="inception"),
]

# One or more jurors, each backed by an LLM.
jurors = [
    Juror(id="SciFi Purist", model="openai:gpt-5-nano"),
]

# The description poses the question for the contest.
competition = Competition(
    id="sci-fi-soundtracks",
    description="Which movie has the better soundtrack?",
    jurors=jurors,
    items=items,
)

# Print each pairwise comparison from the run.
for comparison in competition.run():
    print(comparison)

print(f"Total cost: {competition.cost}")

🏛️ License

MIT License. See the LICENSE file for details.

🙌 Acknowledgments

  • DeepGov and their use of AI for Democratic Capital Allocation and Governance.
  • Daniel Kronovet for his many writings on the power of pairwise comparisons.

Margur veit það sem einn veit ekki. Many know what one does not know.
