Federated learning for Artificial Intelligence and Machine Learning library
FedArtML
Federated Learning for Artificial Intelligence and Machine Learning (FedArtML) is a Python library publicly available on PyPI. It aims to facilitate Federated Learning (FL) research and to simplify the comparison of centralized Machine Learning results with FL results, because it partitions centralized datasets in a systematic and controlled way with respect to label, feature, and quantity skew. The library includes existing state-of-the-art techniques for generating federated datasets, as well as others proposed by the authors. It also provides several metrics for quantifying the degree of non-IID-ness of the data distributed across the participating entities.
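To make the idea of quantifying non-IID-ness concrete, here is a minimal sketch of one common choice, the Jensen-Shannon distance between the label distributions of two clients. It uses only the Python standard library, and the helper names (label_distribution, jensen_shannon_distance) are illustrative assumptions, not part of the FedArtML API; the metrics shipped with the library may be defined differently.

```python
import math
from collections import Counter


def label_distribution(labels, classes):
    """Return the relative frequency of each class in `labels`."""
    counts = Counter(labels)
    total = len(labels)
    return [counts.get(c, 0) / total for c in classes]


def jensen_shannon_distance(p, q):
    """Jensen-Shannon distance (base 2) between two discrete distributions.

    The result lies in [0, 1]: 0 for identical distributions,
    1 for distributions with no class in common.
    """
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Kullback-Leibler divergence; terms with x_i == 0 contribute 0.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    divergence = (kl(p, m) + kl(q, m)) / 2
    return math.sqrt(max(divergence, 0.0))


# Hypothetical toy labels for two clients (illustration only).
classes = [0, 1, 2, 3]
client_a = label_distribution([0, 0, 1, 1], classes)
client_b = label_distribution([2, 2, 3, 3], classes)

print(jensen_shannon_distance(client_a, client_a))  # 0.0 (same distribution)
print(jensen_shannon_distance(client_a, client_b))  # 1.0 (disjoint classes)
```

Values near 0 indicate that the clients hold similarly distributed labels; values near 1 indicate strong label skew.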
In this repository, you can find the library's source code, the installation command, some getting-started examples (including Jupyter Notebooks), and documentation regarding its use.
Enjoy it!
Installation
pip install fedartml
Get started
The following examples show how to use FedArtML to partition centralized data into federated data under label, feature, and quantity skew. Broader guides to the tool are available in the examples folder.
Label skew
Plotting an interactive stacked bar chart (with sliders) of the label class distribution on each local node (client), using the Dirichlet method.
from fedartml import InteractivePlots
from keras.datasets import cifar10
# Load CIFAR-10 data
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
# Define (centralized) labels to use
CIFAR10_labels = y_train
# Instantiate an InteractivePlots object
my_plot = InteractivePlots(labels = CIFAR10_labels)
# Show plot
my_plot.show_stacked_distr_dirichlet()
Creating federated data from centralized data using the Dirichlet method.
from fedartml import SplitAsFederatedData
from keras.datasets import cifar10
import numpy as np
# Define random state for reproducibility
random_state = 0
# Load data
(x_train_glob, y_train_glob), (x_test_glob, y_test_glob) = cifar10.load_data()
y_train_glob = np.reshape(y_train_glob, (y_train_glob.shape[0],))
y_test_glob = np.reshape(y_test_glob, (y_test_glob.shape[0],))
# Normalize pixel values to be between 0 and 1
x_train_glob, x_test_glob = x_train_glob / 255.0, x_test_glob / 255.0
# Instantiate a SplitAsFederatedData object
my_federater = SplitAsFederatedData(random_state = random_state)
# Get federated dataset from centralized dataset
clients_glob_dic, list_ids_sampled_dic, miss_class_per_node, distances = my_federater.create_clients(
    image_list=x_train_glob, label_list=y_train_glob, num_clients=2,
    prefix_cli='Local_node', method="dirichlet", alpha=1)
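For readers unfamiliar with Dirichlet-based label skew, the following is a minimal standard-library sketch of the general technique: for each class, client shares are drawn from a Dirichlet distribution with concentration alpha, and that class's samples are split accordingly. The function names are hypothetical and this is not FedArtML's implementation; the library's internals may differ.

```python
import random


def dirichlet_sample(alpha, size, rng):
    """Draw one sample from a symmetric Dirichlet(alpha) of dimension `size`.

    Uses the standard construction: normalize independent Gamma(alpha, 1) draws.
    """
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(size)]
    total = sum(draws)
    return [d / total for d in draws]


def dirichlet_label_split(labels, num_clients, alpha, seed=0):
    """Assign sample indices to clients, class by class, using Dirichlet shares.

    Returns a list with one list of sample indices per client.
    """
    rng = random.Random(seed)
    client_indices = [[] for _ in range(num_clients)]

    # Group sample indices by class label.
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        shares = dirichlet_sample(alpha, num_clients, rng)

        # Turn cumulative shares into cut points over this class's indices.
        start, cumulative = 0, 0.0
        for client in range(num_clients):
            cumulative += shares[client]
            if client == num_clients - 1:
                end = len(indices)  # last client takes the remainder
            else:
                end = round(cumulative * len(indices))
            client_indices[client].extend(indices[start:end])
            start = end
    return client_indices


# Toy labels: 1000 samples over 10 balanced classes (illustration only).
toy_labels = [i % 10 for i in range(1000)]
parts = dirichlet_label_split(toy_labels, num_clients=3, alpha=1.0)
print([len(p) for p in parts])
```

Smaller alpha values concentrate each class on fewer clients (stronger label skew), while larger values make the per-client class shares more even.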
Feature skew
Creating federated data from centralized data using the Hist-Dirichlet-based method.
from fedartml import SplitAsFederatedData
from keras.datasets import cifar10
import numpy as np
# Define random state for reproducibility
random_state = 0
# Load data
(x_train_glob, y_train_glob), (x_test_glob, y_test_glob) = cifar10.load_data()
y_train_glob = np.reshape(y_train_glob, (y_train_glob.shape[0],))
y_test_glob = np.reshape(y_test_glob, (y_test_glob.shape[0],))
# Normalize pixel values to be between 0 and 1
x_train_glob, x_test_glob = x_train_glob / 255.0, x_test_glob / 255.0
# Instantiate a SplitAsFederatedData object
my_federater = SplitAsFederatedData(random_state = random_state)
# Get federated dataset from centralized dataset
clients_glob_dic, list_ids_sampled_dic, miss_class_per_node, distances = my_federater.create_clients(
    image_list=x_train_glob, label_list=y_train_glob, num_clients=2,
    prefix_cli='Local_node', method="no-label-skew",
    feat_skew_method="hist-dirichlet", alpha_feat_split=1)
Quantity skew
Creating federated data from centralized data using the MinSize-Dirichlet method.
from fedartml import SplitAsFederatedData
from keras.datasets import cifar10
import numpy as np
# Define random state for reproducibility
random_state = 0
# Load data
(x_train_glob, y_train_glob), (x_test_glob, y_test_glob) = cifar10.load_data()
y_train_glob = np.reshape(y_train_glob, (y_train_glob.shape[0],))
y_test_glob = np.reshape(y_test_glob, (y_test_glob.shape[0],))
# Normalize pixel values to be between 0 and 1
x_train_glob, x_test_glob = x_train_glob / 255.0, x_test_glob / 255.0
# Instantiate a SplitAsFederatedData object
my_federater = SplitAsFederatedData(random_state = random_state)
# Get federated dataset from centralized dataset
clients_glob_dic, list_ids_sampled_dic, miss_class_per_node, distances = my_federater.create_clients(
    image_list=x_train_glob, label_list=y_train_glob, num_clients=2,
    prefix_cli='Local_node', method="no-label-skew",
    quant_skew_method="minsize-dirichlet", alpha_quant_split=1)
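As a rough illustration of Dirichlet-based quantity skew, the sketch below draws client size shares from a Dirichlet distribution and redraws until every client holds at least a minimum number of samples. This is only one plausible reading of the name "MinSize-Dirichlet"; the function name, the min_size and max_tries parameters, and the redraw strategy are assumptions for illustration and do not describe FedArtML's actual implementation.

```python
import random


def minsize_dirichlet_sizes(num_samples, num_clients, alpha,
                            min_size=10, seed=0, max_tries=1000):
    """Draw per-client sample counts from a Dirichlet(alpha) distribution.

    Redraws until every client has at least `min_size` samples
    (an assumed interpretation, see the note above).
    """
    if num_clients * min_size > num_samples:
        raise ValueError("min_size is too large for this many clients")

    rng = random.Random(seed)
    for _ in range(max_tries):
        # Dirichlet draw via normalized Gamma(alpha, 1) variates.
        draws = [rng.gammavariate(alpha, 1.0) for _ in range(num_clients)]
        total = sum(draws)
        sizes = [int(d / total * num_samples) for d in draws]
        # Flooring loses a few samples; give the remainder to the last client.
        sizes[-1] += num_samples - sum(sizes)
        if min(sizes) >= min_size:
            return sizes
    raise RuntimeError("no valid split found; try a larger alpha")


def quantity_split(num_samples, sizes, seed=0):
    """Shuffle sample indices and cut them into chunks of the given sizes."""
    rng = random.Random(seed)
    indices = list(range(num_samples))
    rng.shuffle(indices)
    parts, start = [], 0
    for size in sizes:
        parts.append(indices[start:start + size])
        start += size
    return parts


sizes = minsize_dirichlet_sizes(num_samples=1000, num_clients=4, alpha=1.0)
client_parts = quantity_split(1000, sizes)
print(sizes)
```

Because indices are shuffled before cutting, each client receives a roughly label-balanced sample; only the amount of data per client is skewed.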
Documentation
Find the documentation of the library on: