Skip to main content

A comprehensive Python package for calculating Shapley values in cooperative game theory

Project description

Shapley Value Calculator

PyPI Downloads License: MIT Python 3.8+ Project Page

A comprehensive Python package for calculating Shapley values in cooperative game theory. Shapley values provide a mathematically principled method for fairly distributing payoffs among players based on their marginal contributions to coalitions.

📖 Table of Contents

🎯 Overview

The Shapley value is a solution concept from cooperative game theory that assigns a unique distribution (among the players) of a total surplus generated by the coalition of all players. This package provides four approaches:

Class When to use Scales to
ShapleyCombinations Pre-defined coalition values (exact) ~15 players
ShapleyValue Pre-defined coalition values, alt algorithm ~15 players
ShapleyValueCalculator Dynamic evaluation function (exact, parallel) ~20 players
MonteCarloShapleyValue Dynamic evaluation function (approximate, parallel) 100+ players

🚀 Installation

pip install shapley-value

⚡ Quick Start

Exact calculation from pre-defined coalition values

from shapley_value import ShapleyCombinations

players = ['Alice', 'Bob', 'Charlie']
coalition_values = {
    (): 0,
    ('Alice',): 10,  ('Bob',): 20,  ('Charlie',): 30,
    ('Alice', 'Bob'): 50,  ('Alice', 'Charlie'): 60,  ('Bob', 'Charlie'): 70,
    ('Alice', 'Bob', 'Charlie'): 100,
}

calculator = ShapleyCombinations(players)
shapley_values = calculator.calculate_shapley_values(coalition_values)
# {'Alice': 16.67, 'Bob': 33.33, 'Charlie': 50.0}

Monte Carlo approximation for large games

from shapley_value import MonteCarloShapleyValue

def my_model(coalition):
    """Any callable – ML model, simulation, or analytic function."""
    return sum(coalition) ** 0.9 if coalition else 0.0

mc = MonteCarloShapleyValue(
    my_model,
    players=list(range(50)),   # 50 players – exact is infeasible
    num_samples=5000,
    random_seed=42,
    n_jobs=-1,                 # use all CPU cores
)
values = mc.calculate_shapley_values()   # Dict[player, float]

📋 Usage Examples

Method 1 – Pre-defined coalition values (ShapleyCombinations)

Use this when every coalition value is already known:

from shapley_value import ShapleyCombinations

players = ['Player1', 'Player2', 'Player3']
coalition_values = {
    (): 0,
    ('Player1',): 100,  ('Player2',): 200,  ('Player3',): 300,
    ('Player1', 'Player2'): 450,
    ('Player1', 'Player3'): 500,
    ('Player2', 'Player3'): 600,
    ('Player1', 'Player2', 'Player3'): 900,
}

calculator = ShapleyCombinations(players)
shapley_values = calculator.calculate_shapley_values(coalition_values)
print(shapley_values)

Method 2 – Exact evaluation function (ShapleyValueCalculator)

Use this when a callable computes the coalition value and you have ≤ ~20 players:

from shapley_value import ShapleyValueCalculator

def profit_function(coalition):
    if not coalition:
        return 0
    return sum(coalition) + len(coalition) * 10  # synergy bonus

players = [100, 200, 300]

calculator = ShapleyValueCalculator(profit_function, players, n_jobs=-1)
shapley_values = calculator.calculate_shapley_values()
print(shapley_values)

raw_data = calculator.get_raw_data()           # detailed per-coalition data
calculator.save_raw_data('analysis.csv')       # export to CSV

n_jobs follows the scikit-learn convention: 1 = sequential, -1 = all cores, k = exactly k cores.

Method 3 – Monte Carlo approximation (MonteCarloShapleyValue)

Use this for large games (20+ players) where exact computation is infeasible. Complexity is O(m × n) where m = num_samples and n = number of players, versus O(2ⁿ) for exact methods.

from shapley_value import MonteCarloShapleyValue

def complex_model(coalition):
    """Expensive callable – e.g. a trained ML model."""
    if not coalition:
        return 0.0
    return float(sum(p ** 1.2 for p in coalition))

players = list(range(1, 51))  # 50 players

mc = MonteCarloShapleyValue(
    complex_model,
    players=players,
    num_samples=5000,   # more samples → lower error
    random_seed=0,      # set for reproducibility
    n_jobs=-1,          # parallelise across all CPU cores
)

# Core result
values = mc.calculate_shapley_values()

# Diagnose how many samples you need
convergence_df = mc.get_convergence_data()  # DataFrame shape (5000, 50)

# Inspect per-permutation marginal contributions
raw_df = mc.get_raw_data()  # columns: iteration, permutation, player, marginal_contribution

Choosing num_samples

Players Suggested num_samples Typical error
≤ 10 1 000 < 0.5 %
10–30 5 000 < 1 %
30–100 10 000–50 000 1–3 %

Use get_convergence_data() to plot running estimates and verify convergence for your specific game.

n_jobs quick reference

# Sequential (default – no overhead, good for small games / debugging)
mc = MonteCarloShapleyValue(f, players, n_jobs=1)

# All available CPU cores
mc = MonteCarloShapleyValue(f, players, n_jobs=-1)

# Exactly 4 cores
mc = MonteCarloShapleyValue(f, players, n_jobs=4)

Note: Permutations are generated in a fixed order before the parallel step, so random_seed guarantees bit-identical results regardless of n_jobs.

📚 API Reference

ShapleyCombinations

class ShapleyCombinations:
    def __init__(self, players: List[Any])
    def calculate_shapley_values(
        self, coalition_values: Dict[Tuple, float]
    ) -> Dict[Any, float]
    def get_all_coalitions(self) -> List[Tuple]

ShapleyValueCalculator

class ShapleyValueCalculator:
    def __init__(
        self,
        evaluation_function: Callable[[List[Any]], float],
        players: List[Any],
        n_jobs: int = 1,   # sklearn-style: 1=seq (default), -1=all cores, k=k cores
    )
    def calculate_shapley_values(self) -> Dict[Any, float]
    def get_raw_data(self) -> pd.DataFrame
    def save_raw_data(self, file_path: str) -> None

MonteCarloShapleyValue

class MonteCarloShapleyValue:
    def __init__(
        self,
        evaluation_function: Callable[[List[Any]], float],
        players: List[Any],
        num_samples: int = 1000,     # permutations to sample
        random_seed: Optional[int] = None,
        n_jobs: int = 1,             # sklearn-style parallelism
    )
    def calculate_shapley_values(self) -> Dict[Any, float]
    def get_convergence_data(self) -> pd.DataFrame   # running estimates per iteration
    def get_raw_data(self) -> pd.DataFrame           # per-permutation detail

ShapleyValue

Low-level calculator using the weighted-marginal-contribution formula directly.

class ShapleyValue:
    def __init__(
        self,
        players: List[Any],
        coalition_values: Dict[Tuple[Any, ...], float],
    )
    def calculate_shapley_values(self) -> Dict[Any, float]

✨ Features

  • Four calculation methods – exact (pre-defined values, evaluation function) and Monte Carlo approximation for large games
  • sklearn-style n_jobs on both ShapleyValueCalculator and MonteCarloShapleyValue: 1 = sequential, -1 = all cores, k = exactly k cores
  • Reproducible parallel resultsrandom_seed in MonteCarloShapleyValue gives bit-identical output regardless of n_jobs
  • Convergence diagnosticsget_convergence_data() lets you plot running estimates and tune num_samples for your accuracy target
  • Data exportget_raw_data() and save_raw_data() for downstream analysis
  • Type flexibility – any hashable player type (strings, ints, tuples, …)
  • Comprehensive test suite – 53 tests covering correctness, reproducibility, parallelism, edge cases, and stress / performance

⚡ Performance

Exact methods (ShapleyCombinations / ShapleyValueCalculator)

Exact computation enumerates all 2ⁿ coalitions and becomes impractical beyond ~20 players.

Players Coalitions Sequential Parallel (n_jobs=-1) Speedup
5 32 < 0.001 s < 0.001 s
10 1 024 ~ 6.6 s ~ 1.1 s ~ 6×
12 4 096 ~ 31 s ~ 2.6 s ~ 12×
15 32 768 ~ 4 min ~ 20 s ~ 12×

Monte Carlo (MonteCarloShapleyValue)

O(m × n) cost – scales to 100+ players. Parallelism is beneficial when the evaluation function is expensive (e.g. ML model inference, simulations).

Players num_samples n_jobs=1 n_jobs=-1 Notes
20 5 000 < 1 s < 1 s cheap eval function
50 5 000 < 1 s < 1 s cheap eval function
50 10 000 ~ 2 s ~ 0.7 s expensive eval (~1 ms/call)
100 10 000 ~ 5 s ~ 1.5 s expensive eval (~1 ms/call)

Rule of thumb: if your evaluation function is cheap (< 10 µs), use n_jobs=1 to avoid joblib process-spawn overhead. If it's expensive (ML model, simulation), set n_jobs=-1 for maximum throughput.

🧪 Testing

# Install dev dependencies
pip install -e ".[dev,examples,performance]"

# Run the full test suite (53 tests)
python -m pytest tests/ -v

# Run only stress / performance tests (with printed benchmark table)
python -m pytest tests/test_montecarlo_stress.py -v -s

The suite is organised into:

File Tests Coverage
test_calculator.py 11 ShapleyValue, ShapleyCombinations, ShapleyValueCalculator
test_montecarlo.py 29 Correctness, reproducibility, parallelism, edge cases
test_montecarlo_stress.py 13 20–100 player stress, timing bounds, benchmark table

🤝 Contributing

We welcome contributions!

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Make your changes and add tests
  4. Commit your changes (git commit -m 'Add amazing feature')
  5. Push to the branch (git push origin feature/amazing-feature)
  6. Open a Pull Request

Development Setup

git clone https://github.com/Bowenislandsong/shapley-value.git
cd shapley-value
pip install -e ".[dev,examples,performance]"
python -m pytest tests/

📄 License

This project is licensed under the MIT License – see the LICENSE file for details.

📝 Citation

If you use this package in your research or project, please cite it as:

@software{song2026shapley,
  author = {Song, Bowen},
  title = {Shapley Value Calculator},
  year = {2026},
  publisher = {GitHub},
  url = {https://github.com/Bowenislandsong/shapley-value},
  version = {0.0.9}
}

APA Format:

Song, B. (2026). Shapley Value Calculator (Version 0.0.9) [Computer software]. https://github.com/Bowenislandsong/shapley-value

For more citation formats see CITATION.cff.

🏷️ Version History

  • 0.0.9: PyPI publish on v* tag push; citation / version metadata
  • 0.0.8: ShapleyValueCalculator uses n_jobs (sklearn-style); doc and landing-page sync; example runner includes Monte Carlo example
  • 0.0.7: MonteCarloShapleyValue with sklearn-style n_jobs; comprehensive stress tests; convergence and raw-data diagnostics
  • 0.0.6: Citation information for academic use
  • 0.0.5: CI improvements and Python 3.13 support
  • 0.0.4: Comprehensive examples, parallel processing, performance optimizations
  • 0.0.3: Enhanced parallel processing
  • 0.0.2: Function-based evaluation and data export
  • 0.0.1: Initial release

👥 Authors

🔗 Links


For more information about Shapley values and cooperative game theory, see the Wikipedia article.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

shapley_value-0.0.9.tar.gz (52.0 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

shapley_value-0.0.9-py3-none-any.whl (15.8 kB view details)

Uploaded Python 3

File details

Details for the file shapley_value-0.0.9.tar.gz.

File metadata

  • Download URL: shapley_value-0.0.9.tar.gz
  • Upload date:
  • Size: 52.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for shapley_value-0.0.9.tar.gz
Algorithm Hash digest
SHA256 8f973beb943c8fdbc08009b7e8b6955a9a4e29e901578f48a0f134acd62a2059
MD5 76ca1728bc8b0eacb0800f3096e16277
BLAKE2b-256 70b90686b55f70b46e443c5538bebf497534e16ac1955b6c871f85469ca16893

See more details on using hashes here.

File details

Details for the file shapley_value-0.0.9-py3-none-any.whl.

File metadata

  • Download URL: shapley_value-0.0.9-py3-none-any.whl
  • Upload date:
  • Size: 15.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for shapley_value-0.0.9-py3-none-any.whl
Algorithm Hash digest
SHA256 c0d59f34c0ec9598fa8893ed3103ed1ee567797b98ecc3a07c9dc0f6d9314c55
MD5 aa4f550e53c11cb29845c57c0906f128
BLAKE2b-256 cda7d765ab952bb47a504f21ad728872c16c2719d7489819b49767f2dce15fe2

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page