Multi-agent collaboration through pairwise comparisons

Arbitron ⚖️

Arbitron is an agentic pairwise comparison engine. Multiple jurors, each with a unique value system, evaluate items head-to-head and produce a set of pairwise comparisons that can be used to derive item ranks and weights.

  • Why pairwise? It's easier to compare two items than to assign absolute scores.
  • Why multi-juror? Different models with different perspectives (instructions) lead to more balanced, less biased outcomes.
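
To make "derive item ranks and weights" concrete, here is a minimal sketch that fits a Bradley-Terry model to a list of comparisons. It is an illustration only, not Arbitron's own method or API: the `(winner, loser)` tuple format and the `bradley_terry` helper are assumptions made for this example.

```python
from collections import defaultdict


def bradley_terry(comparisons, iterations=100):
    """Estimate item weights from (winner, loser) pairs.

    Hypothetical helper, not part of Arbitron. Uses the standard
    iterative update for the Bradley-Terry model.
    """
    items = sorted({item for pair in comparisons for item in pair})
    wins = defaultdict(int)     # total wins per item
    matches = defaultdict(int)  # comparisons per unordered pair
    for winner, loser in comparisons:
        wins[winner] += 1
        matches[frozenset((winner, loser))] += 1

    weights = {item: 1.0 for item in items}
    for _ in range(iterations):
        updated = {}
        for i in items:
            # Only pairs that were actually compared contribute.
            denom = sum(
                matches[frozenset((i, j))] / (weights[i] + weights[j])
                for j in items
                if j != i and matches.get(frozenset((i, j)), 0) > 0
            )
            updated[i] = wins[i] / denom if denom else weights[i]
        total = sum(updated.values())
        weights = {item: w / total for item, w in updated.items()}
    return weights


# Assumed input format: one (winner, loser) tuple per comparison.
comparisons = [
    ("arrival", "interstellar"),
    ("arrival", "inception"),
    ("interstellar", "inception"),
    ("interstellar", "arrival"),
    ("arrival", "interstellar"),
]
weights = bradley_terry(comparisons)
ranking = sorted(weights, key=weights.get, reverse=True)
```

The weights sum to 1, so they can be read as relative shares. An item that never wins ends up with weight 0 in this plain version; adding a small prior would be one way to soften that.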

✨ Features

  • 🎯 Arbitrary Sets. Evaluate text, code, products, ideas
  • 🤖 Customizable Jurors. Specify custom instructions, tools, providers
  • 🛡️ Bias Reduction. Ensemble decision-making
  • 🧩 Remixable. Join data with human labels and apply personalized heuristics
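
As an illustration of ensemble decision-making, the sketch below combines several jurors' votes on one pair by simple majority. This is an assumed aggregation rule for the sake of the example, not necessarily what Arbitron does. The `majority_winner` helper and the juror-to-item dictionary are hypothetical.

```python
from collections import Counter


def majority_winner(votes):
    """Return the item most jurors preferred for one pair, or None on a tie.

    Hypothetical helper: `votes` maps a juror id to the item id it preferred.
    """
    counts = Counter(votes.values()).most_common()
    if not counts:
        return None  # no votes cast
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None  # top two items are tied
    return counts[0][0]


# Three jurors vote on the pair (arrival, interstellar).
votes = {
    "SciFi Purist": "interstellar",
    "Film Critic": "arrival",
    "Composer": "interstellar",
}
winner = majority_winner(votes)
```

A single juror's quirk is outvoted when the others disagree, which is the intuition behind reducing bias with an ensemble.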

🚀 Quickstart

Running your first Arbitron "contest" is easy!

pip install arbitron

Set up your favorite LLM provider's API keys in the environment (e.g., OPENAI_API_KEY) and then run the following code.

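from arbitron import Competition, Item, Juror

# Items to compare, each identified by an id.
items = [
    Item(id="arrival"),
    Item(id="interstellar"),
    Item(id="inception"),
]

# A single juror backed by an OpenAI model.
jurors = [
    Juror(id="SciFi Purist", model="openai:gpt-5-nano"),
]

# The description is the question jurors answer for each pair.
competition = Competition(
    id="sci-fi-soundtracks",
    description="Which movie has the better soundtrack?",
    jurors=jurors,
    items=items,
)

# Run the competition and print each pairwise comparison.
for comparison in competition.run():
    print(comparison)

print(f"Total cost: {competition.cost}")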

🏛️ License

MIT License - see LICENSE file for details.

🙌 Acknowledgments

  • DeepGov and their use of AI for Democratic Capital Allocation and Governance.
  • Daniel Kronovet for his many writings on the power of pairwise comparisons.

Margur veit það sem einn veit ekki. Many know what one does not know.
