
Federated learning for Artificial Intelligence and Machine Learning library


FedArtML

Federated Learning for Artificial Intelligence and Machine Learning (FedArtML) is a Python library publicly available on PyPI. It aims to facilitate Federated Learning (FL) research and to simplify comparisons between centralized Machine Learning and FL results by partitioning centralized datasets in a systematic, controlled way with respect to label, feature, and quantity skewness. The library includes existing state-of-the-art techniques for generating federated datasets, along with others proposed by the authors. It also provides various metrics for quantifying the degree of non-IID-ness of the data distributed across the participating entities.
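To give a feel for what a non-IID-ness metric measures, here is a minimal standard-library sketch that compares the label distributions of two clients with the Jensen-Shannon distance. The choice of metric and the helper names `label_distribution` and `jensen_shannon_distance` are assumptions made for illustration; the description above does not say which metrics FedArtML implements or how.

```python
import math
from collections import Counter


def label_distribution(labels, classes):
    """Return the relative frequency of each class in `labels`."""
    counts = Counter(labels)
    total = len(labels)
    return [counts[c] / total for c in classes]


def jensen_shannon_distance(p, q):
    """Square root of the base-2 Jensen-Shannon divergence (0 = identical, 1 = disjoint)."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with zero probability in x contribute nothing to the sum.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return math.sqrt(max(0.0, 0.5 * kl(p, m) + 0.5 * kl(q, m)))


classes = [0, 1]
client_a = [0, 0, 1, 1]   # balanced labels
client_b = [0, 0, 0, 1]   # skewed towards class 0
p = label_distribution(client_a, classes)
q = label_distribution(client_b, classes)
print(jensen_shannon_distance(p, q))
```

A value near 0 means the two clients hold similarly distributed labels, while a value near 1 means their label sets barely overlap.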

In this repository, you can find the library's source code, the installation command, some getting-started examples (including Jupyter Notebooks), and documentation regarding its use.

Enjoy it!

Installation


pip install fedartml

Get started

The following examples show how to use FedArtML to partition centralized data into federated data under label, feature, and quantity skewness. Broader guides are available in the examples folder.

Label skew

Plotting an interactive stacked bar plot (with sliders) of the label classes held by each local node (client), using the Dirichlet method.

from fedartml import InteractivePlots

from keras.datasets import cifar10



# Load CIFAR-10 data

(x_train, y_train), (x_test, y_test) = cifar10.load_data()



# Define (centralized) labels to use 

CIFAR10_labels = y_train



# Instantiate an InteractivePlots object

my_plot = InteractivePlots(labels = CIFAR10_labels)



# Show plot

my_plot.show_stacked_distr_dirichlet()

Creating federated data from centralized data using the Dirichlet method.

from fedartml import SplitAsFederatedData

from keras.datasets import cifar10

import numpy as np



# Define random state for reproducibility

random_state = 0



# Load data

(x_train_glob, y_train_glob), (x_test_glob, y_test_glob) = cifar10.load_data()

y_train_glob = np.reshape(y_train_glob, (y_train_glob.shape[0],))

y_test_glob = np.reshape(y_test_glob, (y_test_glob.shape[0],))



# Normalize pixel values to be between 0 and 1

x_train_glob, x_test_glob = x_train_glob / 255.0, x_test_glob / 255.0



# Instantiate a SplitAsFederatedData object

my_federater = SplitAsFederatedData(random_state = random_state)



# Get federated dataset from centralized dataset

clients_glob_dic, list_ids_sampled_dic, miss_class_per_node, distances = my_federater.create_clients(
    image_list=x_train_glob, label_list=y_train_glob, num_clients=2,
    prefix_cli='Local_node', method="dirichlet", alpha=1)
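For intuition about what a Dirichlet-based label split does, the following standard-library sketch assigns example indices to clients class by class, using proportions drawn from a symmetric Dirichlet distribution. It illustrates the general technique only and is not FedArtML's implementation; the function names `dirichlet_sample` and `dirichlet_label_split` are hypothetical.

```python
import random
from collections import Counter


def dirichlet_sample(alpha, size, rng):
    """Draw one symmetric Dirichlet(alpha) sample by normalizing Gamma draws."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(size)]
    total = sum(draws)
    return [d / total for d in draws]


def dirichlet_label_split(labels, num_clients, alpha, seed=0):
    """Assign example indices to clients, class by class, with Dirichlet proportions."""
    rng = random.Random(seed)
    clients = [[] for _ in range(num_clients)]
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        proportions = dirichlet_sample(alpha, num_clients, rng)
        # Convert cumulative proportions into cut points over this class's indices.
        start, cumulative = 0, 0.0
        for client_id, p in enumerate(proportions):
            cumulative += p
            if client_id == num_clients - 1:
                end = len(indices)  # the last client takes whatever remains
            else:
                end = round(cumulative * len(indices))
            clients[client_id].extend(indices[start:end])
            start = end
    return clients


# Toy labels: 10 classes, 100 examples each.
labels = [i % 10 for i in range(1000)]
clients = dirichlet_label_split(labels, num_clients=2, alpha=1.0)
for client_id, idxs in enumerate(clients):
    print(client_id, len(idxs), sorted(Counter(labels[i] for i in idxs).items()))
```

In this sketch, smaller values of `alpha` concentrate each class on fewer clients (stronger label skew), while larger values spread each class more evenly.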

Feature skew

Creating federated data from centralized data using the Hist-Dirichlet-based method.

from fedartml import SplitAsFederatedData

from keras.datasets import cifar10

import numpy as np



# Define random state for reproducibility

random_state = 0



# Load data

(x_train_glob, y_train_glob), (x_test_glob, y_test_glob) = cifar10.load_data()

y_train_glob = np.reshape(y_train_glob, (y_train_glob.shape[0],))

y_test_glob = np.reshape(y_test_glob, (y_test_glob.shape[0],))



# Normalize pixel values to be between 0 and 1

x_train_glob, x_test_glob = x_train_glob / 255.0, x_test_glob / 255.0



# Instantiate a SplitAsFederatedData object

my_federater = SplitAsFederatedData(random_state = random_state)



# Get federated dataset from centralized dataset

clients_glob_dic, list_ids_sampled_dic, miss_class_per_node, distances = my_federater.create_clients(
    image_list=x_train_glob, label_list=y_train_glob, num_clients=2,
    prefix_cli='Local_node', method="no-label-skew",
    feat_skew_method="hist-dirichlet", alpha_feat_split=1)
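The example above does not describe how the Hist-Dirichlet-based method works internally. One plausible reading, assumed here purely for illustration, is to bucket examples by a histogram of a feature summary and then split each bucket across clients with Dirichlet proportions. The sketch below follows that assumption using only the standard library; `feature_bins` and `dirichlet_split` are hypothetical names, and the mean feature value is an arbitrary choice of summary.

```python
import random


def feature_bins(features, num_bins):
    """Label each example with the quantile bin of its mean feature value."""
    means = [sum(row) / len(row) for row in features]
    order = sorted(range(len(means)), key=means.__getitem__)
    bins = [0] * len(means)
    for rank, idx in enumerate(order):
        bins[idx] = rank * num_bins // len(means)
    return bins


def dirichlet_split(groups, num_clients, alpha, rng):
    """Split indices across clients, group by group, with Dirichlet(alpha) proportions."""
    clients = [[] for _ in range(num_clients)]
    for group in sorted(set(groups)):
        indices = [i for i, g in enumerate(groups) if g == group]
        rng.shuffle(indices)
        # Normalized Gamma draws give one symmetric Dirichlet sample.
        weights = [rng.gammavariate(alpha, 1.0) for _ in range(num_clients)]
        total = sum(weights)
        start, cumulative = 0, 0.0
        for client_id, w in enumerate(weights):
            cumulative += w / total
            if client_id == num_clients - 1:
                end = len(indices)  # the last client takes whatever remains
            else:
                end = round(cumulative * len(indices))
            clients[client_id].extend(indices[start:end])
            start = end
    return clients


rng = random.Random(0)
# Toy data: 200 examples with 8 features each, values in [0, 1).
features = [[rng.random() for _ in range(8)] for _ in range(200)]
bins = feature_bins(features, num_bins=4)
clients = dirichlet_split(bins, num_clients=2, alpha=1.0, rng=rng)
print([len(c) for c in clients])
```

Under this reading, the bins play the role that class labels play in a label-skew split, so clients end up with different feature distributions even when labels are ignored.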

Quantity skew

Creating federated data from centralized data using the MinSize-Dirichlet method.

from fedartml import SplitAsFederatedData

from keras.datasets import cifar10

import numpy as np



# Define random state for reproducibility

random_state = 0



# Load data

(x_train_glob, y_train_glob), (x_test_glob, y_test_glob) = cifar10.load_data()

y_train_glob = np.reshape(y_train_glob, (y_train_glob.shape[0],))

y_test_glob = np.reshape(y_test_glob, (y_test_glob.shape[0],))



# Normalize pixel values to be between 0 and 1

x_train_glob, x_test_glob = x_train_glob / 255.0, x_test_glob / 255.0



# Instantiate a SplitAsFederatedData object

my_federater = SplitAsFederatedData(random_state = random_state)



# Get federated dataset from centralized dataset

clients_glob_dic, list_ids_sampled_dic, miss_class_per_node, distances = my_federater.create_clients(
    image_list=x_train_glob, label_list=y_train_glob, num_clients=2,
    prefix_cli='Local_node', method="no-label-skew",
    quant_skew_method="minsize-dirichlet", alpha_quant_split=1)
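The example above does not spell out the MinSize-Dirichlet method. Going by its name, a reasonable guess is that client sizes are drawn from a Dirichlet distribution and the draw is repeated until every client reaches a minimum size. The standard-library sketch below implements that guess; the function name `minsize_dirichlet_sizes` and the parameters `min_size` and `max_tries` are hypothetical and do not come from the library.

```python
import random


def minsize_dirichlet_sizes(num_examples, num_clients, alpha, min_size,
                            seed=0, max_tries=1000):
    """Draw client sizes from Dirichlet(alpha), retrying until all reach min_size."""
    if min_size * num_clients > num_examples:
        raise ValueError("min_size is too large for the number of examples")
    rng = random.Random(seed)
    for _ in range(max_tries):
        # Normalized Gamma draws give one symmetric Dirichlet sample.
        weights = [rng.gammavariate(alpha, 1.0) for _ in range(num_clients)]
        total = sum(weights)
        sizes = [int(num_examples * w / total) for w in weights]
        sizes[-1] += num_examples - sum(sizes)  # rounding remainder goes to the last client
        if min(sizes) >= min_size:
            return sizes
    raise RuntimeError("no valid split found; try a larger alpha or a smaller min_size")


# 50,000 examples (the size of the CIFAR-10 training set) across 2 clients.
sizes = minsize_dirichlet_sizes(num_examples=50000, num_clients=2, alpha=1.0, min_size=10)

# Turn the sizes into disjoint index lists by slicing a shuffled index range.
indices = list(range(50000))
random.Random(0).shuffle(indices)
clients, start = [], 0
for size in sizes:
    clients.append(indices[start:start + size])
    start += size
print(sizes)
```

In this sketch, a small `alpha` produces very unequal client sizes, and the minimum-size check prevents any client from ending up with too few examples to train on.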

Documentation

The library's documentation is available at:

https://fedartml.readthedocs.io/en/latest/index.html#
