FedImpute: a benchmarking and evaluation tool for federated imputation across various missing data scenarios.
FedImpute is a benchmarking tool for the evaluation of federated imputation algorithms over various missing data scenarios under horizontally partitioned data.
- Documentation: Documentation
- Paper: FedImpute
- Source Code: Source Code
- Benchmarking Analysis: FedImputeBench
Installation
First, install Python >= 3.10.0. FedImpute can then be installed in either of two ways.
Install from pip:
pip install fedimpute
Install from package repo:
git clone https://github.com/idsla/FedImpute
cd FedImpute
python -m venv ./venv
# Windows (Git Bash)
source ./venv/Scripts/activate
# Linux/Unix
source ./venv/bin/activate
# Install the required packages
pip install -r requirements.txt
Basic Usage
Step 1. Prepare Data
import numpy as np
data = np.random.rand(10000, 10)
data_config = {
'task_type': 'regression',
'clf_type': None,
'num_cols': 9,
}
Step 2. Simulate Federated Missing Data Scenario
from fedimpute.simulator import Simulator
simulator = Simulator()
simulation_results = simulator.simulate_scenario(
data, data_config, num_clients = 10, dp_strategy='iid-even', ms_mech_type='mcar', verbose=1
)
Step 3. Execute Federated Imputation Algorithms
Note: if you use a CUDA build of torch, set the following environment variable so that CUDA operations behave deterministically.
# bash (linux)
export CUBLAS_WORKSPACE_CONFIG=:4096:8
# powershell (windows)
$Env:CUBLAS_WORKSPACE_CONFIG = ":4096:8"
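When the run is started from a Python script or notebook rather than a shell, the same variable can be set in-process. The snippet below is a minimal sketch using only the standard library. It assumes the assignment happens before torch initializes CUDA, and placing it ahead of `import torch` is the conservative choice.

```python
import os

# Same value as in the shell examples above.
# Assumption: this line runs before torch touches CUDA (ideally before `import torch`).
os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"

print(os.environ["CUBLAS_WORKSPACE_CONFIG"])
```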
from fedimpute.execution_environment import FedImputeEnv
env = FedImputeEnv()
env.configuration(imputer = 'fed_ice', fed_strategy='fedavg', fit_mode = 'fed')
env.setup_from_simulator(simulator = simulator, verbose=1)
env.run_fed_imputation(run_type='sequential')
Step 4. Evaluate imputation outcomes
from fedimpute.evaluation import Evaluator
evaluator = Evaluator()
evaluator.evaluate(env, ['imp_quality', 'pred_downstream_local', 'pred_downstream_fed'])
evaluator.show_results()
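The exact metrics behind `'imp_quality'` are not listed here. As an illustration of what an imputation-quality score can look like, the sketch below computes RMSE over the cells that were originally missing. The function name `masked_rmse` and the choice of RMSE are assumptions made for this example, not FedImpute's API.

```python
import math

def masked_rmse(true_rows, imputed_rows, missing_mask):
    """RMSE over the cells where missing_mask is True (the cells that were imputed)."""
    sq_errors = [
        (t - p) ** 2
        for t_row, p_row, m_row in zip(true_rows, imputed_rows, missing_mask)
        for t, p, m in zip(t_row, p_row, m_row)
        if m
    ]
    if not sq_errors:
        raise ValueError("mask selects no cells")
    return math.sqrt(sum(sq_errors) / len(sq_errors))

# Toy data: two cells were missing and have been filled in by some imputer.
true_rows = [[1.0, 2.0], [3.0, 4.0]]
imputed_rows = [[1.0, 2.5], [2.0, 4.0]]
missing_mask = [[False, True], [True, False]]

rmse = masked_rmse(true_rows, imputed_rows, missing_mask)
print(rmse)
```

Observed cells are excluded on purpose: they are copied through unchanged, so including them would make every imputer look better than it is.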
Supported Data Partition Strategies
- Natural Partition: done by reading a list of datasets; see the "Dataset and Preprocessing" section in the documentation
- Artificial Partition
  - column: partition based on discrete values of a column in the dataset
  - iid-even: iid partition with even sample sizes
  - iid-dir: iid partition with sample sizes following a Dirichlet distribution
  - niid-dir: non-iid partition based on some columns with a Dirichlet distribution
  - niid-path: non-iid partition based on some columns with a pathological distribution (shard partition)
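To make the difference between even and Dirichlet-distributed sample sizes concrete, here is a small standard-library sketch. The helper names (`iid_even_sizes`, `dirichlet_sizes`, `partition`) and the `alpha` parameter are hypothetical and only illustrate the idea; they are not taken from FedImpute.

```python
import random

def iid_even_sizes(n_samples, num_clients):
    """Split n_samples into num_clients sizes that differ by at most one."""
    base, extra = divmod(n_samples, num_clients)
    return [base + (1 if i < extra else 0) for i in range(num_clients)]

def dirichlet_sizes(n_samples, num_clients, alpha, seed=0):
    """Draw client proportions from a symmetric Dirichlet(alpha), then convert to sizes."""
    rng = random.Random(seed)
    # A Dirichlet draw is a vector of independent Gamma(alpha, 1) draws, normalized.
    gammas = [rng.gammavariate(alpha, 1.0) for _ in range(num_clients)]
    total = sum(gammas)
    sizes = [int(n_samples * g / total) for g in gammas]
    # Give the rounding remainder to the largest client so sizes sum to n_samples.
    sizes[sizes.index(max(sizes))] += n_samples - sum(sizes)
    return sizes

def partition(rows, sizes, seed=0):
    """Shuffle rows, then cut consecutive chunks of the requested sizes."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    chunks, start = [], 0
    for size in sizes:
        chunks.append(shuffled[start:start + size])
        start += size
    return chunks

rows = list(range(1000))
even_clients = partition(rows, iid_even_sizes(len(rows), 10))
dir_clients = partition(rows, dirichlet_sizes(len(rows), 10, alpha=0.5))

print([len(c) for c in even_clients])
print([len(c) for c in dir_clients])
```

Both variants are iid in the sense that rows are shuffled before being assigned; only the client sizes differ. A smaller `alpha` gives more unequal sizes.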
Supported Missing Data Mechanisms
- mcar: MCAR (missing completely at random) mechanism
- mar-homo: homogeneous MAR (missing at random) mechanism
- mar-heter: heterogeneous MAR mechanism
- mnar-homo: homogeneous MNAR (missing not at random) mechanism
- mnar-heter: heterogeneous MNAR mechanism
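The sketch below contrasts the textbook definitions of MCAR and MAR on a single column. It is a generic illustration using only the standard library: the function names, the median threshold, and the rates are assumptions for this example and do not describe how FedImpute generates its masks.

```python
import random

def mcar_mask(n_rows, rate, rng):
    """MCAR: every cell goes missing with the same probability, whatever the data."""
    return [rng.random() < rate for _ in range(n_rows)]

def mar_mask(observed_col, low_rate, high_rate, rng):
    """MAR: missingness of a target column depends on another, fully observed column."""
    threshold = sorted(observed_col)[len(observed_col) // 2]  # upper median
    return [
        rng.random() < (high_rate if x >= threshold else low_rate)
        for x in observed_col
    ]

rng = random.Random(0)
x0 = [rng.random() for _ in range(2000)]  # column that is always observed
mcar = mcar_mask(len(x0), rate=0.3, rng=rng)
mar = mar_mask(x0, low_rate=0.1, high_rate=0.5, rng=rng)

# Compare missing rates for rows below and above the median of x0.
threshold = sorted(x0)[len(x0) // 2]
high = [m for x, m in zip(x0, mar) if x >= threshold]
low = [m for x, m in zip(x0, mar) if x < threshold]
print(sum(mcar) / len(mcar), sum(low) / len(low), sum(high) / len(high))
```

Under MNAR the missingness would instead depend on the value that goes missing, which cannot be checked from the observed data alone.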
Supported Federated Imputation Algorithms
Federated Imputation Algorithms:
Method | Type | Fed Strategy | Imputer (code) | Reference |
---|---|---|---|---|
Fed-Mean | Non-NN | - | fed_mean | - |
Fed-EM | Non-NN | - | fed_em | EM, FedEM |
Fed-ICE | Non-NN | - | fed_ice | FedICE |
Fed-MissForest | Non-NN | - | fed_missforest | MissForest, Fed Randomforest |
MIWAE | NN | fedavg, fedprox, fedavg_ft, ... | miwae | MIWAE |
GAIN | NN | fedavg, fedprox, fedavg_ft, ... | gain | GAIN |
Not-MIWAE | NN | fedavg, fedprox, fedavg_ft, ... | notmiwae | Not-MIWAE |
GNR | NN | fedavg, fedprox, fedavg_ft, ... | gnr | GNR |
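As a rough intuition for the simplest entry in the table, the sketch below shows one plausible way a federated mean imputer could work: each client shares only per-column sums and counts of its observed values, the server pools them into global means, and each client fills its own gaps. This is an assumption made for illustration; the function names are hypothetical and the code is not FedImpute's `fed_mean` implementation.

```python
def local_stats(rows):
    """Client side: per-column (sum, count) over observed (non-None) values."""
    n_cols = len(rows[0])
    sums, counts = [0.0] * n_cols, [0] * n_cols
    for row in rows:
        for j, v in enumerate(row):
            if v is not None:
                sums[j] += v
                counts[j] += 1
    return sums, counts

def global_means(all_stats):
    """Server side: pool the clients' sums and counts, then divide."""
    n_cols = len(all_stats[0][0])
    total_s, total_c = [0.0] * n_cols, [0] * n_cols
    for sums, counts in all_stats:
        for j in range(n_cols):
            total_s[j] += sums[j]
            total_c[j] += counts[j]
    # Fall back to 0.0 for a column that no client observed.
    return [s / c if c else 0.0 for s, c in zip(total_s, total_c)]

def impute(rows, means):
    """Client side: replace each missing cell with the global column mean."""
    return [[means[j] if v is None else v for j, v in enumerate(row)] for row in rows]

client_a = [[1.0, None], [3.0, 4.0]]
client_b = [[None, 8.0], [5.0, None]]

means = global_means([local_stats(client_a), local_stats(client_b)])
imputed_a = impute(client_a, means)
print(means, imputed_a)
```

Only aggregate statistics leave each client in this sketch; the raw rows stay local.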
Federated Strategies:
Method | Type | Fed_strategy (code) | Reference |
---|---|---|---|
FedAvg | global FL | fedavg | FedAvg |
FedProx | global FL | fedprox | FedProx |
Scaffold | global FL | scaffold | Scaffold |
FedAdam | global FL | fedadam | FedAdam |
FedAdagrad | global FL | fedadagrad | FedAdaGrad |
FedYogi | global FL | fedyogi | FedYogi |
FedAvg-FT | personalized FL | fedavg_ft | FedAvg-FT |
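For readers new to these strategies, the aggregation step of FedAvg can be sketched in a few lines: the server averages client parameters, weighting each client by its number of samples. The snippet is a standalone illustration with a hypothetical `fedavg` helper operating on plain lists. It is not FedImpute's implementation, which works on model weights.

```python
def fedavg(client_params, client_sizes):
    """Sample-size-weighted average of client parameter vectors."""
    total = sum(client_sizes)
    n_params = len(client_params[0])
    return [
        sum(p[i] * n for p, n in zip(client_params, client_sizes)) / total
        for i in range(n_params)
    ]

# Two clients with two parameters each; the second client holds three times as much data.
params = [[1.0, 0.0], [3.0, 2.0]]
sizes = [100, 300]

global_params = fedavg(params, sizes)
print(global_params)
```

The other global strategies in the table change how clients train locally or how the server applies the aggregated update, while FedAvg-FT adds a local fine-tuning step after aggregation.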
FedImputeBench - Benchmarking Analysis Using FedImpute
We use FedImpute to initialize a benchmarking analysis of federated imputation algorithms; see the FedImputeBench repository.
Contact
For any questions, please contact Sitao Min