Skip to main content

NetFL is a framework for running Federated Learning (FL) experiments in simulated IoT and Fog/Edge computing environments, with support for heterogeneous devices and configurable network conditions.

Project description

NetFL

NetFL is a framework for executing Federated Learning (FL) experiments in simulated IoT and Fog/Edge computing environments. It enables the modeling of heterogeneous and resource-constrained scenarios, incorporating factors such as computational disparities among devices, limited bandwidth, latency, packet loss, and diverse network topologies.

Through its native abstractions for tasks, devices, and networks, NetFL allows researchers to configure and execute FL experiments in a declarative and reproducible manner, providing more realistic evaluations of algorithms under non-ideal, real-world conditions.

Under the hood, NetFL leverages Fogbed for distributed network emulation and Flower for federated learning orchestration. These libraries provide robust foundations for virtualization and FL training, and NetFL integrates and extends them into a unified framework designed specifically for FL research in IoT and Edge Computing.

Installation

Requirements: Ubuntu 22.04 LTS or later, Python 3.9.

1. Set up Containernet

Refer to the Containernet documentation for further details.

Install Ansible:

sudo apt-get install ansible

Clone the Containernet repository:

git clone https://github.com/containernet/containernet.git

Run the installation playbook:

sudo ansible-playbook -i "localhost," -c local containernet/ansible/install.yml

Create and activate a virtual environment:

python3 -m venv venv
source venv/bin/activate

Note: The virtual environment must be activated before installing or using any Python packages, including Containernet and NetFL.

Install Containernet into the active virtual environment:

pip install containernet/.

2. Install NetFL

While the virtual environment is still active, run:

pip install netfl

Running an Experiment with NetFL and Fogbed

Follow the steps below to set up and run an experiment using NetFL. This is an example using the MNIST dataset. You can find more examples in the examples folder:

1. Define the Dataset, Model, and Training Configurations

from typing import Any

import tensorflow as tf
from keras import models, optimizers
from flwr.server.strategy import Strategy, FedAvg

from netfl.core.task import Task, Dataset, DatasetInfo, DatasetPartitioner, TrainConfigs
from netfl.core.models import cnn3
from netfl.core.partitioners import IidPartitioner


class MNIST(Task):
    def dataset_info(self) -> DatasetInfo:
        return DatasetInfo(
            huggingface_path="ylecun/mnist",
            input_key="image",
            label_key="label",
            input_dtype=tf.float32,
            label_dtype=tf.int32,
        )

    def dataset_partitioner(self) -> DatasetPartitioner:
        return IidPartitioner()

    def preprocess_dataset(self, dataset: Dataset, training: bool) -> Dataset:
        return Dataset(x=tf.divide(dataset.x, 255.0), y=dataset.y)

    def model(self) -> models.Model:
        return cnn3(
            input_shape=(28, 28, 1),
            output_classes=10,
            optimizer=optimizers.SGD(learning_rate=0.01),
        )

    def aggregation_strategy(self) -> tuple[type[Strategy], dict[str, Any]]:
        return FedAvg, {}

    def train_configs(self) -> TrainConfigs:
        return TrainConfigs(
            batch_size=16,
            epochs=2,
            num_devices=4,
            num_partitions=4,
            num_rounds=10,
            seed_data=42,
            shuffle_data=True,
        )


class MainTask(MNIST):
    pass

2. Define the Experiment

Network Topology

from netfl.core.experiment import NetflExperiment
from netfl.utils.resources import (
    Host,
    NetworkResource,
    Resource,
    ClusterResource,
    ClusterResourceType,
    BASE_COMPUTE_UNIT,
)

from task import MainTask


task = MainTask()
train_configs = task.train_configs()

host = Host(cpu_clock=BASE_COMPUTE_UNIT)

server_resource = Resource(
    name="server",
    cpu_cores=1,
    cpu_clock=1.0,
    memory=1024,
    network=NetworkResource(bw=1000),
    host=host,
)

device_0_resource = Resource(
    name="device_0",
    cpu_cores=1,
    cpu_clock=0.25,
    memory=512,
    network=NetworkResource(bw=100),
    host=host,
)

device_1_resource = Resource(
    name="device_1",
    cpu_cores=1,
    cpu_clock=0.25,
    memory=512,
    network=NetworkResource(bw=50),
    host=host,
)

cloud_resource = ClusterResource(
    name="cloud", type=ClusterResourceType.CLOUD, resources=[server_resource]
)

edge_0_resource = ClusterResource(
    name="edge_0",
    type=ClusterResourceType.EDGE,
    resources=(train_configs.num_devices // 2) * [device_0_resource],
)

edge_1_resource = ClusterResource(
    name="edge_1",
    type=ClusterResourceType.EDGE,
    resources=(train_configs.num_devices // 2) * [device_1_resource],
)

exp = NetflExperiment(
    name="mnist-exp",
    task=task,
    resources=[cloud_resource, edge_0_resource, edge_1_resource],
)

cloud = exp.create_cluster(cloud_resource)
edge_0 = exp.create_cluster(edge_0_resource)
edge_1 = exp.create_cluster(edge_1_resource)

server = exp.create_server(server_resource)
edge_0_devices = exp.create_devices(device_0_resource, edge_0_resource.num_resources)
edge_1_devices = exp.create_devices(device_1_resource, edge_1_resource.num_resources)

exp.add_to_cluster(server, cloud)
for device in edge_0_devices:
    exp.add_to_cluster(device, edge_0)
for device in edge_1_devices:
    exp.add_to_cluster(device, edge_1)

worker = exp.add_worker("127.0.0.1")
worker.add(cloud)
worker.add(edge_0)
worker.add(edge_1)
worker.add_link(cloud, edge_0, **NetworkResource(bw=10).link_params)
worker.add_link(cloud, edge_1, **NetworkResource(bw=5).link_params)

try:
    exp.start()
except Exception as ex:
    print(ex)
finally:
    exp.stop()

When the host cpu_clock is set to BASE_COMPUTE_UNIT, all resource cpu_clock values are interpreted in Docker CPU units (e.g., millicores) instead of GHz.

3. Start Fogbed Worker and Run the Experiment

RunWorker -p=5000
python3 experiment.py

Refer to the Fogbed documentation for detailed instructions on starting workers.

Running a Simple Example with a Basic Network Topology Using Docker Compose

1. Clone the repository

https://github.com/larsid/netfl.git

2. Create the Task

In the project root directory, create or modify a NetFL Task and name the file task.py. Refer to the examples in the examples folder for guidance on task creation.

3. Create the Infrastructure

Use Docker Compose to set up the infrastructure, including the server and clients:

docker compose up -d

4. View Training Results

To check the server logs, run:

docker logs server

Training logs are also stored in the logs folder within the project root directory.

5. Shut Down the Infrastructure

To stop and remove all running containers, use the following command:

docker compose down

More information

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

netfl-1.2.0.tar.gz (116.8 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

netfl-1.2.0-py3-none-any.whl (24.2 kB view details)

Uploaded Python 3

File details

Details for the file netfl-1.2.0.tar.gz.

File metadata

  • Download URL: netfl-1.2.0.tar.gz
  • Upload date:
  • Size: 116.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for netfl-1.2.0.tar.gz
Algorithm Hash digest
SHA256 1b9a9f6e24f8adf4a7694906aa1a416b6bea9fa52f10788b195c4f43fdabc37e
MD5 25da0db72572a4eabea793518736b511
BLAKE2b-256 da480c9888c6dbfe79dd0a749c5cec89cc35c622848572d3523e06241891583b

See more details on using hashes here.

Provenance

The following attestation bundles were made for netfl-1.2.0.tar.gz:

Publisher: pypi.yml on larsid/netfl

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file netfl-1.2.0-py3-none-any.whl.

File metadata

  • Download URL: netfl-1.2.0-py3-none-any.whl
  • Upload date:
  • Size: 24.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for netfl-1.2.0-py3-none-any.whl
Algorithm Hash digest
SHA256 03491d347fc94f5f51ed9cc987833bc95225ffc710636ed2b797a8a9678ceefa
MD5 4bd794a606d17de2e21b87dfd2d4c91f
BLAKE2b-256 90f81f67d6aca8ec844c2202f093c454a2a4e6bdcfdcc208028f08d2de1d1827

See more details on using hashes here.

Provenance

The following attestation bundles were made for netfl-1.2.0-py3-none-any.whl:

Publisher: pypi.yml on larsid/netfl

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page