
A tool for benchmarking vertical federated learning algorithms, providing synthetic data splitting and split evaluation.

Project description

VertiBench: Vertical Federated Learning Benchmark

Introduction

VertiBench is a benchmark for federated learning, split learning, and assisted learning on vertically partitioned data. It provides tools to synthesize vertically partitioned data from a given global dataset. VertiBench supports partitioning under various imbalance and correlation levels, effectively simulating a wide range of real-world vertical federated learning scenarios.

[Figure: data-dist-full.png]

Installation

VertiBench is published on PyPI and requires Python >= 3.9. To install VertiBench, run the following command:

pip install vertibench
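
A quick way to verify the installation is to import the main classes used in the example below (only names shown in this document are assumed):

# minimal import check; these classes are used in the Getting Started example
from vertibench.Splitter import ImportanceSplitter, CorrelationSplitter
from vertibench.Evaluator import ImportanceEvaluator, CorrelationEvaluator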

Getting Started

This example walks through the split-and-evaluate pipeline. First, load your dataset or generate a synthetic one.

from sklearn.datasets import make_classification

# Generate a large dataset
X, y = make_classification(n_samples=10000, n_features=10)
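
To use your own tabular dataset instead, load it as NumPy arrays. A sketch assuming a hypothetical data.csv with a header row and the label in the last column:

import numpy as np

# hypothetical CSV: feature columns followed by a label column
data = np.loadtxt("data.csv", delimiter=",", skiprows=1)
X, y = data[:, :-1], data[:, -1]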

To split the dataset by importance,

from vertibench.Splitter import ImportanceSplitter

imp_splitter = ImportanceSplitter(num_parties=4, weights=[1, 1, 1, 3])  # per-party importance weights
Xs = imp_splitter.split(X)
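
The splitter returns one feature block per party. A quick way to check how the 10 features were allocated (assuming each block is a NumPy array, as in the evaluation example below):

# print the shape of each party's local feature block
for party_id, X_party in enumerate(Xs):
    print(f"Party {party_id}: {X_party.shape[1]} features, {X_party.shape[0]} samples")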

To split the dataset by correlation,

from vertibench.Splitter import CorrelationSplitter

corr_splitter = CorrelationSplitter(num_parties=4)
Xs = corr_splitter.fit_split(X)  # fit to the dataset's correlation structure, then split
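
Each party's feature block can then be stored separately to mimic data held at different sites. A minimal sketch using NumPy; the file names are illustrative:

import numpy as np

# persist each party's local features; in VFL the labels usually stay with one (active) party
for party_id, X_party in enumerate(Xs):
    np.save(f"party_{party_id}_features.npy", X_party)
np.save("labels.npy", y)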

To evaluate a feature split Xs in terms of party importance,

from vertibench.Evaluator import ImportanceEvaluator
from sklearn.linear_model import LogisticRegression
import numpy as np

model = LogisticRegression()
X = np.concatenate(Xs, axis=1)  # re-join the party feature blocks into one matrix
model.fit(X, y)
imp_evaluator = ImportanceEvaluator()
imp_scores = imp_evaluator.evaluate(Xs, model.predict)  # per-party importance scores
alpha = imp_evaluator.evaluate_alpha(scores=imp_scores)  # imbalance level implied by the scores
print(f"Importance scores: {imp_scores}, alpha: {alpha}")

To evaluate a feature split in terms of correlation,

from vertibench.Evaluator import CorrelationEvaluator

corr_evaluator = CorrelationEvaluator()
corr_scores = corr_evaluator.fit_evaluate(Xs)  # inter-party correlation of the split
beta = corr_evaluator.evaluate_beta()  # correlation level implied by the scores
print(f"Correlation scores: {corr_scores}, beta: {beta}")

Project details


Download files

Download the file for your platform.

Source Distribution

vertibench-0.1.1.tar.gz (28.3 kB)

Uploaded Source

Built Distribution

vertibench-0.1.1-py3-none-any.whl (22.4 kB)

Uploaded Python 3

File details

Details for the file vertibench-0.1.1.tar.gz.

File metadata

  • Download URL: vertibench-0.1.1.tar.gz
  • Upload date:
  • Size: 28.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.8

File hashes

Hashes for vertibench-0.1.1.tar.gz

  • SHA256: ee1523c9289baa96e8c9f0b6be9eb3da5a1215c41b6857330b76d3b20d0f4996
  • MD5: 59d8c95656006810dc5a580ba808934a
  • BLAKE2b-256: 46153371a3438264a83e969138e257d547ecd5401d97c3564995478fae28dcb2



File details

Details for the file vertibench-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: vertibench-0.1.1-py3-none-any.whl
  • Upload date:
  • Size: 22.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.8

File hashes

Hashes for vertibench-0.1.1-py3-none-any.whl

  • SHA256: 5a5e74a24aa0a3d41be7dfe7cea8ca850b669d87eda6ae7f66a34eb5275794e2
  • MD5: 4399fa322ef958dc92deef0a8ce14720
  • BLAKE2b-256: a905f394119ac3ad9b944cb996a134e748a360d1100980ec2ebbce8ca6e5bf4e


