fedimpute is a benchmarking tool for federated imputation

FedImpute: a benchmarking and evaluation tool for federated imputation across various missing data scenarios.

License: GPL v3

FedImpute is a benchmarking tool for the evaluation of federated imputation algorithms over various missing data scenarios under horizontally partitioned data.

Installation

First, install Python >= 3.10.0. FedImpute can then be installed in either of two ways.

Install from pip:

pip install fedimpute

Install from package repo:

git clone https://github.com/idsla/FedImpute
cd FedImpute

python -m venv ./venv

# Windows (Git Bash)
source ./venv/Scripts/activate

# linux/unix
source ./venv/bin/activate

# Install the required packages
pip install -r requirements.txt

Basic Usage

Step 1. Prepare Data

import numpy as np
data = np.random.rand(10000, 10)
data_config = {
    'task_type': 'regression',
    'clf_type': None,
    'num_cols': 9,
}

Step 2. Simulate Federated Missing Data Scenario

from fedimpute.simulator import Simulator
simulator = Simulator()
simulation_results = simulator.simulate_scenario(
    data, data_config, num_clients=10, dp_strategy='iid-even', ms_mech_type='mcar', verbose=1
)
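To make the two scenario options above concrete, here is a minimal NumPy sketch of what an even IID partition and an MCAR (missing completely at random) mask mean conceptually. It is not FedImpute's implementation: the helper names `partition_iid_even` and `mask_mcar`, the 30% missing rate, and the choice to treat the last column as an unmasked target are all assumptions made for illustration.

```python
import numpy as np

def partition_iid_even(data, num_clients, rng):
    """Shuffle rows, then split them into near-equal client shards."""
    shuffled = data[rng.permutation(data.shape[0])]
    return np.array_split(shuffled, num_clients)

def mask_mcar(features, missing_rate, rng):
    """Set each entry to NaN independently with probability missing_rate."""
    mask = rng.random(features.shape) < missing_rate
    masked = features.copy()
    masked[mask] = np.nan
    return masked, mask

rng = np.random.default_rng(0)
data = rng.random((10000, 10))
shards = partition_iid_even(data, num_clients=10, rng=rng)

# Assumption: the last column is the target and stays fully observed.
client_data = []
for shard in shards:
    masked_features, _ = mask_mcar(shard[:, :-1], missing_rate=0.3, rng=rng)
    client_data.append(np.column_stack([masked_features, shard[:, -1]]))
```

Because the mask is drawn independently of the data values, every client ends up with roughly the same missing rate, which is what distinguishes MCAR from value-dependent mechanisms.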

Step 3. Execute Federated Imputation Algorithms

Note: if you use a CUDA build of torch, set the CUBLAS_WORKSPACE_CONFIG environment variable before running so that CUDA operations behave deterministically.

# bash (linux)
export CUBLAS_WORKSPACE_CONFIG=:4096:8
# powershell (windows)
$Env:CUBLAS_WORKSPACE_CONFIG = ":4096:8"

from fedimpute.execution_environment import FedImputeEnv
env = FedImputeEnv()
env.configuration(imputer='fed_ice', fed_strategy='fedavg', fit_mode='fed')
env.setup_from_simulator(simulator = simulator, verbose=1)

env.run_fed_imputation(run_type='sequential')
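The environment variable from the note in this step can also be set from inside Python rather than from the shell. The sketch below assumes it runs at the very top of the script, before torch is imported or any CUDA work starts; setting it later may have no effect.

```python
import os

# Set the cuBLAS workspace configuration for deterministic CUDA behavior.
# setdefault keeps any value already exported in the shell.
os.environ.setdefault("CUBLAS_WORKSPACE_CONFIG", ":4096:8")
```

Using `setdefault` rather than plain assignment is a design choice here: it lets a value exported in the shell take precedence over the one hard-coded in the script.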

Step 4. Evaluate imputation outcomes

from fedimpute.evaluation import Evaluator

evaluator = Evaluator()
evaluator.evaluate(env, ['imp_quality', 'pred_downstream_local', 'pred_downstream_fed'])
evaluator.show_results()
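This page does not say which metrics FedImpute reports under 'imp_quality'. As a rough illustration of how imputation quality is commonly scored, the sketch below computes RMSE only over the entries that were missing before imputation. The function name `masked_rmse` and the toy arrays are hypothetical and are not part of the FedImpute API.

```python
import numpy as np

def masked_rmse(original, imputed, missing_mask):
    """RMSE computed only over entries that were missing before imputation."""
    diff = original[missing_mask] - imputed[missing_mask]
    return float(np.sqrt(np.mean(diff ** 2)))

original = np.array([[1.0, 2.0], [3.0, 4.0]])
missing_mask = np.array([[False, True], [True, False]])
# The missing 2.0 was imputed as 3.0; the missing 3.0 was recovered exactly.
imputed = np.array([[1.0, 3.0], [3.0, 4.0]])

score = masked_rmse(original, imputed, missing_mask)
print(score)  # sqrt(0.5), about 0.7071
```

Restricting the error to masked entries matters because observed entries are copied through unchanged and would otherwise dilute the score.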

Supported Federated Imputation Algorithms

Federated Imputation Algorithms:

| Method | Type | Fed Strategy | Imputer (code) | Reference |
|---|---|---|---|---|
| Fed-Mean | Non-NN | - | fed_mean | - |
| Fed-EM | Non-NN | - | fed_em | EM, FedEM |
| Fed-ICE | Non-NN | - | fed_ice | FedICE |
| Fed-MissForest | Non-NN | - | fed_missforest | MissForest, Fed Randomforest |
| MIWAE | NN | fedavg, fedprox, fedavg_ft | miwae | MIWAE |
| GAIN | NN | fedavg, fedprox, fedavg_ft | gain | GAIN |
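As an illustration of the simplest entry in the table, Fed-Mean, the sketch below shows one plausible way to compute column means across clients without sharing raw rows: each client sends per-column sums and counts of its observed values, and the server pools them. This is an assumption about how such a method can work, not FedImpute's actual code, and all function names are hypothetical. It also assumes every column is observed at least once somewhere.

```python
import numpy as np

def local_stats(client_data):
    """Per-column sum and count of observed (non-NaN) values on one client."""
    sums = np.nansum(client_data, axis=0)
    counts = np.sum(~np.isnan(client_data), axis=0)
    return sums, counts

def global_means(stats):
    """Server side: pool the sums and counts from all clients, then divide."""
    total_sum = np.sum([s for s, _ in stats], axis=0)
    total_count = np.sum([c for _, c in stats], axis=0)
    return total_sum / total_count

def impute_with_means(client_data, means):
    """Client side: fill each NaN with the global mean of its column."""
    return np.where(np.isnan(client_data), means, client_data)

client_a = np.array([[1.0, np.nan], [3.0, 4.0]])
client_b = np.array([[np.nan, 8.0], [5.0, np.nan]])

means = global_means([local_stats(client_a), local_stats(client_b)])
imputed_a = impute_with_means(client_a, means)
```

In this toy case the pooled means are 3.0 and 6.0, so the single gap on client A is filled with 6.0, a value that client A could not have estimated as well from its own one observation in that column.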

Federated Strategies:

| Method | Type | Fed_strategy (code) | Reference |
|---|---|---|---|
| FedAvg | global FL | fedavg | FedAvg |
| FedProx | global FL | fedprox | FedProx |
| Scaffold | global FL | scaffold | Scaffold |
| FedAdam | global FL | fedadam | FedAdam |
| FedAdagrad | global FL | fedadagrad | FedAdaGrad |
| FedYogi | global FL | fedyogi | FedYogi |
| FedAvg-FT | personalized FL | fedavg_ft | FedAvg-FT |
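For readers unfamiliar with the strategies listed above, the aggregation step of FedAvg can be sketched in a few lines: the server averages client parameters, weighting each client by its local sample count. This is a generic illustration of that step only, not FedImpute's implementation; the function name `fedavg` and the toy parameter vectors are hypothetical.

```python
import numpy as np

def fedavg(client_params, client_sizes):
    """Average client parameter vectors, weighted by local sample counts."""
    weights = np.asarray(client_sizes, dtype=float)
    weights /= weights.sum()
    # Contract the client axis: sum_k weights[k] * client_params[k]
    return np.tensordot(weights, np.stack(client_params), axes=1)

params = [np.array([1.0, 2.0]), np.array([3.0, 6.0])]
global_params = fedavg(params, client_sizes=[100, 300])
print(global_params)  # [2.5 5. ]
```

The second client holds three times as much data, so the result sits three quarters of the way toward its parameters.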

FedImputeBench - Benchmarking Analysis Using FedImpute

We use FedImpute to carry out a benchmarking analysis of federated imputation algorithms. The analysis is maintained in the separate FedImputeBench repository.

Contact

For any questions, please contact Sitao Min.

Download files

Source Distribution

fedimpute-0.0.4.tar.gz (111.8 kB)

Built Distribution

fedimpute-0.0.4-py3-none-any.whl (172.9 kB, Python 3)
