Multi-agent consensus rankings through pairwise comparisons

Arbitron ⚖️

Arbitron is a multi-agent consensus ranking system that determines winners through pairwise comparisons. Instead of relying on individual ratings or single perspectives, multiple agents—each with unique value systems—evaluate items head-to-head to produce robust, fair rankings.

Why pairwise? It's easier to compare two items than to assign absolute scores.

Why multi-agent? Different perspectives lead to more balanced, less biased outcomes.

Why consensus? Aggregated judgments are more reliable than individual ones.
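
To make the aggregation idea concrete, here is a minimal sketch that turns pairwise judgments from several agents into one ranking by win rate. It is not Arbitron's actual ranking method, which this page does not describe. The `(agent, winner, loser)` tuple shape, the `win_rate_ranking` helper, and the sample duels are all hypothetical and made up for illustration.

```python
from collections import Counter


def win_rate_ranking(duels):
    """Rank items by the fraction of their duels they won.

    `duels` is an iterable of (agent, winner, loser) tuples
    (a hypothetical shape, not Arbitron's data model).
    """
    wins = Counter()
    games = Counter()
    for _agent, winner, loser in duels:
        wins[winner] += 1
        games[winner] += 1
        games[loser] += 1
    # Highest win rate first; ties are broken alphabetically for determinism.
    return sorted(games, key=lambda item: (-wins[item] / games[item], item))


# Made-up judgments from three agents.
duels = [
    ("SciFi Purist", "arrival", "dune"),
    ("Nolan Fan", "inception", "arrival"),
    ("Critics Choice", "arrival", "inception"),
    ("Critics Choice", "arrival", "dune"),
]

ranking = win_rate_ranking(duels)
print(ranking)  # ['arrival', 'inception', 'dune']
```

No single agent decides the outcome here: "arrival" loses one duel but still ranks first because it wins three of its four comparisons overall.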

✨ Features

  • 🎯 Arbitrary Sets — text, code, products, ideas
  • 🤖 Customizable Agents — unique personas, tools, providers
  • 🗳️ Consistent Aggregation — diverse perspectives into rankings
  • 🛡️ Bias Reduction — ensemble decision-making
  • 🧩 Remixable — join data with human labels or other heuristics

🚀 Quickstart

Running your first Arbitron contest is easy!

pip install arbitron

Set up your favorite LLM provider's API keys in the environment, then run the following code.

import arbitron

movies = [
    arbitron.Item(id="arrival"),
    arbitron.Item(id="blade_runner"),
    arbitron.Item(id="interstellar"),
    arbitron.Item(id="inception"),
    arbitron.Item(id="the_dark_knight"),
    arbitron.Item(id="dune"),
    arbitron.Item(id="the_matrix"),
    arbitron.Item(id="2001_space_odyssey"),
    arbitron.Item(id="the_fifth_element"),
    arbitron.Item(id="the_martian"),
]

agents = [
    arbitron.Agent(
        id="SciFi Purist",
        prompt="Compare based on scientific accuracy and hard sci-fi concepts.",
    ),
    arbitron.Agent(
        id="Nolan Fan",
        prompt="Compare based on complex narratives and emotional depth.",
    ),
    arbitron.Agent(
        id="Critics Choice",
        prompt="Compare based on artistic merit and cinematic excellence.",
    ),
]

description = "Rank the movies based on their soundtrack quality."

comparisons = arbitron.run(description, agents, movies)
ranking = arbitron.rank(comparisons)

YAML-Based Configuration

Arbitron can read the configuration from YAML files for easier management and scalability.

description: "Rank the movies based on their soundtrack quality."
comparisons_per_agent: 20
agents:
  - id: "SciFi Purist"
    prompt: "Compare based on scientific accuracy and hard sci-fi concepts."
    model: "google-gla:gemini-2.5-flash-lite"
  - id: "Nolan Fan"
    prompt: "Compare based on complex narratives and emotional depth."
    model: "openai:gpt-4.1-nano"
  - id: "Critics Choice"
    prompt: "Compare based on artistic merit and cinematic excellence."
    model: "groq:moonshotai/kimi-k2-instruct"
items:
  - id: "arrival"
  - id: "blade_runner"
  - id: "interstellar"
  - id: "inception"
  - id: "the_dark_knight"
  - id: "dune"
  - id: "the_matrix"
  - id: "2001_space_odyssey"
  - id: "the_fifth_element"
  - id: "the_martian"
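
To get a feel for what `comparisons_per_agent: 20` means for ten items, the sketch below counts the possible unordered pairs and samples a budget of 20 from them. It assumes each agent judges distinct, randomly chosen pairs; how Arbitron actually selects pairs is not stated here, so treat the sampling step as an illustration only.

```python
import itertools
import random

item_ids = [
    "arrival", "blade_runner", "interstellar", "inception", "the_dark_knight",
    "dune", "the_matrix", "2001_space_odyssey", "the_fifth_element", "the_martian",
]
comparisons_per_agent = 20

# Every unordered pair of distinct items: n * (n - 1) / 2 = 45 for n = 10.
all_pairs = list(itertools.combinations(item_ids, 2))
print(len(all_pairs))  # 45

# Assumption: one agent's budget is a random sample of distinct pairs.
rng = random.Random(0)
sampled = rng.sample(all_pairs, comparisons_per_agent)

# Fraction of all possible pairs that a single agent would see.
coverage = len(sampled) / len(all_pairs)
print(f"{coverage:.0%} of pairs per agent")
```

Under that assumption, a single agent sees fewer than half of the 45 possible pairs, so coverage comes from combining the three agents' 60 comparisons.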

Save the configuration as competition.yaml, then run the pairwise comparisons with the following command:

arbitron run competition.yaml --output duels.csv

This creates a file, duels.csv, containing the results of the pairwise comparisons. To turn those results into a ranking, run:

arbitron rank duels.csv
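
If you want to experiment with the comparison data yourself, one common way to score pairwise outcomes is the Bradley-Terry model. The sketch below is not necessarily what `arbitron rank` does. The `winner` and `loser` column names are hypothetical (check the real header of duels.csv), and the sample rows are made up.

```python
import csv
import io
from collections import defaultdict


def bradley_terry(duels, iterations=100):
    """Estimate Bradley-Terry strengths with the iterative MM update.

    `duels` is a list of (winner, loser) pairs. This sketch assumes every
    item has at least one win and one loss; it does not handle items that
    never win.
    """
    items = sorted({name for pair in duels for name in pair})
    wins = defaultdict(int)
    pair_counts = defaultdict(int)
    for winner, loser in duels:
        wins[winner] += 1
        pair_counts[frozenset((winner, loser))] += 1

    strength = {item: 1.0 for item in items}
    for _ in range(iterations):
        updated = {}
        for i in items:
            # Sum over opponents of games played / combined strength.
            denom = sum(
                pair_counts[frozenset((i, j))] / (strength[i] + strength[j])
                for j in items
                if j != i
            )
            updated[i] = wins[i] / denom
        total = sum(updated.values())
        strength = {item: value / total for item, value in updated.items()}
    return strength


# Made-up data in a hypothetical two-column layout.
sample_csv = """winner,loser
arrival,dune
arrival,dune
dune,arrival
arrival,inception
arrival,inception
inception,arrival
inception,dune
inception,dune
dune,inception
"""

rows = csv.DictReader(io.StringIO(sample_csv))
duels = [(row["winner"], row["loser"]) for row in rows]
strength = bradley_terry(duels)
ranking = sorted(strength, key=strength.get, reverse=True)
print(ranking)  # ['arrival', 'inception', 'dune']
```

To try it on real output, replace `io.StringIO(sample_csv)` with an opened file and adjust the column names to match.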

❓ Why "Arbitron"?

The name comes from the Old Icelandic word "val" meaning choice or selection. In Norse culture, having many choices among stories was a sign of abundance. Arbitron embodies this spirit—providing rich choice through diverse agent perspectives.

Also inspired by Philip K. Dick's novel VALIS (Vast Active Living Intelligence System), reflecting the distributed, intelligent nature of multi-agent consensus.

🏛️ License

MIT License - see LICENSE file for details.

🙌 Acknowledgments

  • DeepGov and their use of AI for Democratic Capital Allocation and Governance.
  • Daniel Kronovet for his many writings on the power of pairwise comparisons.

Margur veit það sem einn veit ekki. Many know what one does not know.

Download files

Source Distribution

arbitron-0.3.0.tar.gz (6.6 kB)

Built Distribution

arbitron-0.3.0-py3-none-any.whl (9.5 kB)