Multi-agent collaboration through pairwise comparisons

Arbitron ⚖️

Arbitron is an agentic pairwise comparison engine. Multiple jurors, each with its own value system, evaluate items head-to-head and produce a set of pairwise comparisons that can be used to derive item ranks and weights.

  • Why pairwise? It's easier to compare two items than to assign absolute scores.
  • Why multi-juror? Different models with different perspectives (instructions) lead to more balanced, less biased outcomes.
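
As an illustration of how a set of pairwise comparisons can be turned into ranks and weights, here is a minimal sketch that fits a Bradley-Terry model with a simple iterative update. This is one common technique, not necessarily what Arbitron does internally. The `(winner, loser)` tuple format, the `bradley_terry` function, and the sample data are all assumptions made for this example and are not part of Arbitron's API.

```python
from collections import Counter


def bradley_terry(comparisons, iterations=200):
    """Estimate normalized strengths from hypothetical (winner, loser) pairs."""
    items = sorted({item for pair in comparisons for item in pair})
    wins = Counter(winner for winner, _ in comparisons)
    # How many times each unordered pair of items was compared.
    meetings = Counter(frozenset(pair) for pair in comparisons)

    strength = {item: 1.0 / len(items) for item in items}
    for _ in range(iterations):
        updated = {}
        for i in items:
            denom = 0.0
            for j in items:
                n = meetings[frozenset((i, j))]
                if j != i and n:
                    denom += n / (strength[i] + strength[j])
            # Items that were never compared keep their current strength.
            updated[i] = wins[i] / denom if denom else strength[i]
        total = sum(updated.values())
        strength = {item: value / total for item, value in updated.items()}
    return strength


# Made-up outcomes for illustration only (not real juror output).
comparisons = (
    [("arrival", "interstellar")] * 2
    + [("arrival", "inception")] * 2
    + [("interstellar", "inception")] * 2
    + [("inception", "arrival"), ("interstellar", "arrival"), ("inception", "interstellar")]
)

weights = bradley_terry(comparisons)
ranking = sorted(weights, key=weights.get, reverse=True)
print(ranking)
print(weights)
```

The weights sum to 1, so they can be read as relative shares as well as used for ordering. Note that in this sketch an item that never wins ends up with a weight of 0.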

✨ Features

  • 🎯 Arbitrary Sets. Evaluate text, code, products, ideas
  • 🤖 Customizable Jurors. Specify custom instructions, tools, providers
  • 🛡️ Bias Reduction. Ensemble decision-making
  • 🧩 Remixable. Join data with human labels and apply personalized heuristics
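
To make the ensemble idea concrete, the sketch below takes a majority vote across jurors for each pair of items. It is plain Python over made-up records. The `(juror, item_a, item_b, winner)` tuple layout, the juror names, and the `majority_winners` helper are hypothetical and do not reflect Arbitron's actual output format.

```python
from collections import Counter, defaultdict


def majority_winners(votes):
    """Return the most frequently chosen winner for each unordered pair."""
    tallies = defaultdict(Counter)
    for _juror, item_a, item_b, winner in votes:
        # Sort the pair so (a, b) and (b, a) are counted together.
        pair = tuple(sorted((item_a, item_b)))
        tallies[pair][winner] += 1
    return {pair: counts.most_common(1)[0][0] for pair, counts in tallies.items()}


# Hypothetical votes from three jurors (illustrative data only).
votes = [
    ("purist", "arrival", "inception", "arrival"),
    ("critic", "arrival", "inception", "inception"),
    ("composer", "inception", "arrival", "arrival"),
    ("purist", "arrival", "interstellar", "interstellar"),
    ("critic", "interstellar", "arrival", "interstellar"),
    ("composer", "arrival", "interstellar", "arrival"),
]

majority = majority_winners(votes)
print(majority)
```

With an even number of jurors a pair can tie. This sketch then keeps whichever winner was recorded first, so a real pipeline would likely want an explicit tie-breaking rule.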

🚀 Quickstart

Running your first Arbitron "contest" is easy!

pip install arbitron

Set up your preferred LLM provider's API keys in the environment (e.g., OPENAI_API_KEY), then run the following code.

from arbitron import Competition, Item, Juror

# The items to compare head-to-head.
items = [
    Item(id="arrival"),
    Item(id="interstellar"),
    Item(id="inception"),
]

# A single juror backed by an OpenAI model.
jurors = [
    Juror(id="SciFi Purist", model="openai:gpt-5-nano"),
]

# The description is the question jurors answer for each pair.
competition = Competition(
    id="sci-fi-soundtracks",
    description="Which movie has the better soundtrack?",
    jurors=jurors,
    items=items,
)

# Print each pairwise comparison produced by the run.
for comparison in competition.run():
    print(comparison)

print(f"Total cost: {competition.cost}")

🏛️ License

MIT License - see LICENSE file for details.

🙌 Acknowledgments

  • DeepGov and their use of AI for Democratic Capital Allocation and Governance.
  • Daniel Kronovet for his many writings on the power of pairwise comparisons.

Margur veit það sem einn veit ekki. Many know what one does not know.
