MCP server exposing principled social-choice aggregation rules (Borda, Copeland, Condorcet, approval, STV, opinion pool), with a reproducible benchmark measuring their accuracy vs majority vote over an LLM ensemble.

These details have not been verified by PyPI

Project links

Project description

voting-mcp

Principled social-choice aggregation as MCP tools — with a benchmark that measures the accuracy lift over naive majority vote.

Almost every multi-agent system aggregates votes with Counter(votes).most_common(1), throwing away preference order and confidence. voting-mcp ships the real rules (Borda, Copeland, Condorcet, approval, STV, linear opinion pool) as callable MCP tools — each with its known axiomatic behavior and explicit, documented tie-breaking — plus a reproducible benchmark that aggregates a diverse ensemble of LLMs on a reasoning set and reports accuracy with bootstrap confidence intervals.

The server is pure compute: stdio transport, no network, no file writes, no secrets — clean against the OWASP MCP Top 10 by construction.

Install

# run the server directly (once published)
uvx voting-mcp

# or from source
git clone https://github.com/HrishiKabra/voting-mcp && cd voting-mcp
uv sync
uv run python -m voting_mcp.server

Add it to an MCP client (e.g. Claude Desktop claude_desktop_config.json):

{
  "mcpServers": {
    "voting": { "command": "uvx", "args": ["voting-mcp"] }
  }
}

Tools

Every tool takes a profile ({candidates, ballots}) and returns a Result with the full co-winner set (winners, so ties are never hidden), the single tie-broken winner (or null when none exists), a ranking, per-candidate scores, and a note.

Tool	Ballots	Notes
`borda`	rankings	positional; Condorcet-inconsistent, clone-sensitive
`copeland`	rankings	Condorcet-consistent pairwise (+1 win, +0.5 tie)
`condorcet`	rankings	returns the pairwise winner or an explicit no-winner on a cycle
`approval`	approval sets	most-approved wins
`stv`	rankings	single-winner instant-runoff; clone-resistant
`opinion_pool`	distributions	linear pool — preserves confidence, not an argmax vote
`plurality`	rankings	baseline (most first choices)
`majority`	rankings	strict >50% or no winner
`aggregate_rule`	any	dispatch by a `rule` enum

Tie-breaking is an explicit parameter (lexicographic default, none, or seeded random).

Benchmark

Aggregate an ensemble of 5 models (one OpenAI-compatible client via OpenRouter) on ARC-Challenge and compare each rule to the naive majority vote:

uv sync --extra bench
uv run python -m bench.fetch_arc --limit 200
# prints a cost estimate and STOPS; add --yes to actually call the API, --mock for a free dry run
uv run python -m bench.run_ensemble --dataset bench/datasets/arc_challenge.jsonl --limit 200 --yes
uv run python -m bench.compare --dataset bench/datasets/arc_challenge.jsonl --limit 200

Every raw response is cached under bench/results/raw/; re-runs never re-call the API, so aggregation tweaks are free.

Results

5-model ensemble (gpt-4o-mini · gemini-2.5-flash-lite · deepseek-v3 · claude-haiku-4.5 · glm-4.7), n = 200, bootstrap 95% CI. Two datasets of different difficulty; full write-up and both plots in RESULTS.md.

MMLU-Pro (hard, baseline 73.5%) — the informative case:

Rule	Accuracy	95% CI	paired Δ vs majority	p
opinion_pool	0.755	[0.695, 0.815]	+0.020 [−0.011, +0.052]	0.225
majority_vote (baseline)	0.735	[0.679, 0.788]	—	—
approval	0.701	[0.640, 0.757]	−0.035 [−0.063, −0.006]	0.014
stv	0.693	[0.630, 0.750]	−0.043 [−0.072, −0.015]	0.002
copeland	0.647	[0.580, 0.710]	−0.088 [−0.127, −0.052]	<0.001
condorcet	0.620	[0.550, 0.685]	−0.115 [−0.155, −0.079]	<0.001
majority (strict)	0.590	[0.520, 0.655]	−0.145 [−0.189, −0.105]	<0.001
borda	0.472	[0.405, 0.540]	−0.263 [−0.323, −0.206]	<0.001

Δ is tested with a paired bootstrap on the per-question accuracy difference (same questions, so shared difficulty cancels), not by eyeballing the independent CIs.

MMLU-Pro

The finding (honest): the value isn't "fancy voting beats majority." It's that the confidence-preserving rule (opinion_pool) wins when the crowd is uncertain (+2.0pp, the only rule above baseline — suggestive but not significant at n=200, paired p=0.225), while forcing the distributions into full rankings actively hurts, significantly — every ranking rule is below baseline at paired p≤0.014, and borda collapses to 0.472 because with 10 options the tail of the ranking is mostly noise. Aggregate the confidence; don't throw it away. On ARC-Challenge (baseline 96.8%, near-ceiling) nothing separates — no rule differs significantly. See RESULTS.md.

Develop

uv run pytest -q
uv run ruff check .
uv run mypy --strict src
# exercise the tools in the MCP Inspector:
npx @modelcontextprotocol/inspector uv run python -m voting_mcp.server

Note: if you keep this repo under an iCloud-synced folder (e.g. ~/Desktop), iCloud can spawn duplicate .pth files that intermittently break the editable install. Tests use pythonpath=src; run the server with PYTHONPATH=src if an import fails, or move the repo off the synced folder.

Related research

The choice of rules here grows out of the author's work on voting-rule design: Optimizing Voting Rules for Social Welfare and Beyond (AAMAS). That line of work asks which aggregation rules maximize welfare given how voters actually express preferences; this project applies the same lens to LLM ensembles — where the benchmark's answer is that confidence-preserving aggregation (the linear opinion pool) is what pays off, and forcing cardinal beliefs into ordinal rankings destroys signal.

License

MIT

Project details

These details have not been verified by PyPI

Project links

Release history Release notifications | RSS feed

0.1.3

Jul 2, 2026

This version

0.1.2

Jul 2, 2026

0.1.1

Jul 2, 2026

0.1.0

Jun 30, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

voting_mcp-0.1.2.tar.gz (208.2 kB view details)

Uploaded Jul 2, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

voting_mcp-0.1.2-py3-none-any.whl (18.7 kB view details)

Uploaded Jul 2, 2026 Python 3

File details

Details for the file voting_mcp-0.1.2.tar.gz.

File metadata

Download URL: voting_mcp-0.1.2.tar.gz
Upload date: Jul 2, 2026
Size: 208.2 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.11.8 {"installer":{"name":"uv","version":"0.11.8","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"macOS","version":null,"id":null,"libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}

File hashes

Hashes for voting_mcp-0.1.2.tar.gz
Algorithm	Hash digest
SHA256	`44feb1bc2ec1331a3f7b59314acb418bfcf04757ffa6aaf83fa78a00871151ba`
MD5	`111913d7e778f31e19bdd99316d4cdd3`
BLAKE2b-256	`1aa6a6ab2942b747d5d74fcf786964d8bd06d6cfa48393521d572dca4e6c4d5b`

See more details on using hashes here.

File details

Details for the file voting_mcp-0.1.2-py3-none-any.whl.

File metadata

Download URL: voting_mcp-0.1.2-py3-none-any.whl
Upload date: Jul 2, 2026
Size: 18.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.11.8 {"installer":{"name":"uv","version":"0.11.8","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"macOS","version":null,"id":null,"libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}

File hashes

Hashes for voting_mcp-0.1.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`5bfbec45d22b5a4707931e5b00bf553480cf0b9b98836863a95f366500af7059`
MD5	`bdae20a21686b682fe1f0e686202fbae`
BLAKE2b-256	`19143fd30c2354b8ab86720947f5059c1d4c09b9eb47fdadb9c131be6c1105af`

See more details on using hashes here.

voting-mcp 0.1.2

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

voting-mcp

Install

Tools

Benchmark

Results

Develop

Related research

License

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes