A Python library for rapid prototyping, experimenting, and logging of federated learning using state-of-the-art models and datasets. Built using PyTorch and PyTorch Lightning.
Table of Contents
- Key Features
- Installation
- Examples and Usage
- Available Models
- Available Datasets
- Contributing
- Citation
Features
- Python 3.6+ support. Built using torch-1.10.1, torchvision-0.11.2, and pytorch-lightning-1.5.7.
- Customizable implementations for state-of-the-art deep learning models which can be trained in federated or non-federated settings.
- Supports finetuning of the pre-trained deep learning models, allowing for faster training using transfer learning.
- PyTorch LightningDataModule wrappers for the most commonly used datasets to reduce the boilerplate code before experiments.
- Built using the bottom-up approach for the datamodules and models which ensures abstractions while allowing for customization.
- Provides implementation of the federated learning (FL) samplers, aggregators, and wrappers, to prototype FL experiments on-the-go.
- Backwards compatible with the PyTorch LightningDataModule, LightningModule, loggers, and DevOps tools.
- More details about the examples and usage can be found below.
Installation
Stable Release
As of now, candlefl is available on PyPI and can be installed using the following command in your terminal:
$ pip install candlefl
This is the preferred way to install the most recent stable release of candlefl.
If you don't have pip installed, this Python installation guide can guide you through the process.
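After installing, it can help to confirm which versions of the package and its dependencies ended up in your environment. The following is a minimal sketch that uses only the standard library. The helper name `installed_versions` is hypothetical and not part of candlefl, and `importlib.metadata` assumes Python 3.8 or newer.

```python
from importlib import metadata  # standard library, Python 3.8+


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in this environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Distribution names taken from the feature list in this README.
    names = ["candlefl", "torch", "torchvision", "pytorch-lightning"]
    for name, version in installed_versions(names).items():
        print(f"{name}: {version or 'not installed'}")
```

Comparing the printed versions against the ones listed under Features is a quick way to spot a mismatched environment before starting an experiment.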
Examples and Usage
Although candlefl is primarily built for quick prototyping of federated learning experiments, its models, datasets, and abstractions can also speed up non-federated learning experiments. This section walks through examples and usage in both settings.
Non-Federated Learning
At a high level, training a non-federated learning experiment involves the following steps. This example uses the EMNIST (MNIST) dataset and densenet121.
- Import the relevant modules.

      import os

      from candlefl.datamodules.emnist import EMNISTDataModule
      from candlefl.models.wrapper.emnist import MNISTEMNIST

      import pytorch_lightning as pl
      from pytorch_lightning.loggers import TensorBoardLogger
      from pytorch_lightning.callbacks import (
          ModelCheckpoint,
          LearningRateMonitor,
          DeviceStatsMonitor,
          ModelSummary,
          ProgressBar,
          # ...
      )

For more details, view the full list of PyTorch Lightning callbacks and loggers on the official website.
- Set up the PyTorch Lightning trainer.

      trainer = pl.Trainer(
          # ...
          logger=[
              TensorBoardLogger(
                  name=experiment_name,
                  save_dir=os.path.join(checkpoint_save_path, experiment_name),
              )
          ],
          callbacks=[
              ModelCheckpoint(save_weights_only=True, mode="max", monitor="val_acc"),
              LearningRateMonitor("epoch"),
              DeviceStatsMonitor(),
              ModelSummary(),
              ProgressBar(),
          ],
          # ...
      )

More details about the PyTorch Lightning Trainer API can be found on their official website.
- Prepare the dataset using the wrappers provided by candlefl.datamodules.

      datamodule = EMNISTDataModule(dataset_name="mnist")
      datamodule.prepare_data()
      datamodule.setup()

- Initialize the model using the wrappers provided by candlefl.models.wrappers.

      # check if the model can be loaded from a given checkpoint
      if (checkpoint_load_path) and os.path.isfile(checkpoint_load_path):
          model = MNISTEMNIST(
              "densenet121", "adam", {"lr": 0.001}
          ).load_from_checkpoint(checkpoint_load_path)
      else:
          pl.seed_everything(42)
          model = MNISTEMNIST("densenet121", "adam", {"lr": 0.001})
          trainer.fit(model, datamodule.train_dataloader(), datamodule.val_dataloader())

- Collect the results.

      # `dataloaders` replaces the deprecated `test_dataloaders` argument
      # (pytorch-lightning 1.4 and later).
      val_result = trainer.test(
          model, dataloaders=datamodule.val_dataloader(), verbose=True
      )
      test_result = trainer.test(
          model, dataloaders=datamodule.test_dataloader(), verbose=True
      )

- The corresponding files for the experiment (model checkpoints and logger metadata) will be stored under the path given by the default_root_dir argument passed to the PyTorch Lightning Trainer object in the trainer setup step (Step 2). This experiment uses the TensorBoard logger. To view the logs (and the related plots and metrics), go to the default_root_dir path and find the TensorBoard log files. Upload the files to the TensorBoard Development portal following the instructions here. Once the log files are uploaded, a unique URL for your experiment is generated, which can be shared with ease! An example can be found here.
- Note that candlefl is compatible with all the loggers supported by PyTorch Lightning. More information about the PyTorch Lightning loggers can be found here.
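Before uploading, you need to locate the TensorBoard log files on disk. The sketch below walks a directory and collects them, relying on the fact that TensorBoard event files are named with the prefix `events.out.tfevents.`. The helper name `find_tensorboard_event_files` is hypothetical and not part of candlefl; pass it whichever directory your logger writes to.

```python
import os


def find_tensorboard_event_files(root_dir):
    """Return sorted paths of TensorBoard event files found under root_dir."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root_dir):
        for filename in filenames:
            # TensorBoard writers name their files "events.out.tfevents.<suffix>".
            if filename.startswith("events.out.tfevents."):
                matches.append(os.path.join(dirpath, filename))
    return sorted(matches)


if __name__ == "__main__":
    # "." is a placeholder; use your own log directory here.
    for path in find_tensorboard_event_files("."):
        print(path)
```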
Federated Learning
At a high level, training a federated learning experiment involves the following steps.
- Pick a dataset and use the datamodules to create federated data shards with iid or non-iid distribution.

      def get_datamodule() -> EMNISTDataModule:
          datamodule: EMNISTDataModule = EMNISTDataModule(
              dataset_name=SUPPORTED_DATASETS_TYPE.MNIST, train_batch_size=10
          )
          datamodule.prepare_data()
          datamodule.setup()
          return datamodule


      # fl_params is the FLParams object created in the last step of this list.
      agent_data_shard_map = get_datamodule().federated_iid_dataloader(
          num_workers=fl_params.num_agents,
          workers_batch_size=fl_params.local_train_batch_size,
      )

- Use the TorchFL agents module and the models module to initialize the global model, agents, and distribute their models.

      from typing import Dict, List

      from torch.utils.data import DataLoader


      def initialize_agents(
          fl_params: FLParams, agent_data_shard_map: Dict[int, DataLoader]
      ) -> List[V1Agent]:
          """Initialize agents."""
          agents = []
          for agent_id in range(fl_params.num_agents):
              agent = V1Agent(
                  id=agent_id,
                  model=MNISTEMNIST(
                      model_name=EMNIST_MODELS_ENUM.MOBILENETV3SMALL,
                      optimizer_name=OPTIMIZERS_TYPE.ADAM,
                      optimizer_hparams={"lr": 0.001},
                      model_hparams={"pre_trained": True, "feature_extract": True},
                      fl_hparams=fl_params,
                  ),
                  data_shard=agent_data_shard_map[agent_id],
              )
              agents.append(agent)
          return agents


      global_model = MNISTEMNIST(
          model_name=EMNIST_MODELS_ENUM.MOBILENETV3SMALL,
          optimizer_name=OPTIMIZERS_TYPE.ADAM,
          optimizer_hparams={"lr": 0.001},
          model_hparams={"pre_trained": True, "feature_extract": True},
          fl_hparams=fl_params,
      )
      all_agents = initialize_agents(fl_params, agent_data_shard_map)

- Initialize an FLParams object with the desired FL hyperparameters and pass it on to the Entrypoint object, which will abstract the training.

      fl_params = FLParams(
          experiment_name="iid_mnist_fedavg_10_agents_5_sampled_50_epochs_mobilenetv3small_latest",
          num_agents=10,
          global_epochs=10,
          local_epochs=2,
          sampling_ratio=0.5,
      )
      entrypoint = Entrypoint(
          global_model=global_model,
          global_datamodule=get_datamodule(),
          fl_hparams=fl_params,
          agents=all_agents,
          aggregator=FedAvgAggregator(all_agents=all_agents),
          sampler=RandomSampler(all_agents=all_agents),
      )
      entrypoint.run()

Available Models
For the initial release, candlefl will only support state-of-the-art computer vision models. The following table summarizes the available models, support for pre-training, and the possibility of feature-extracting. Please note that the models have been tested with all the available datasets. Therefore, the link to the tests will be provided in the next section.
Available Datasets
The following datasets have been wrapped inside a LightningDataModule and made available for the initial release of candlefl. To add a new dataset, check the source code in candlefl.datamodules, add tests, and create a PR with the Features tag.
Contributing
Contributions are welcome, and they are greatly appreciated! Every little bit helps, and credit will always be given.
You can contribute in many ways:
Types of Contributions
Report Bugs
If you are reporting a bug, please include:
- Your operating system name and version.
- Any details about your local setup that might be helpful in troubleshooting.
- Detailed steps to reproduce the bug.
Fix Bugs
Look through the GitHub issues for bugs. Anything tagged with "bug" and "help wanted" is open to whoever wants to implement it.
Implement Features
Look through the GitHub issues for features. Anything tagged with "enhancement", "help wanted", or "feature" is open to whoever wants to implement it.
Write Documentation
candlefl could always use more documentation, whether as part of the official candlefl docs, in docstrings, or even on the web in blog posts, articles, and such.
Submit Feedback
If you are proposing a feature:
- Explain in detail how it would work.
- Keep the scope as narrow as possible, to make it easier to implement.
- Remember that this is a volunteer-driven project, and that contributions are welcome :)
Get Started
Ready to contribute? Here's how to set up candlefl for local development.
- Fork the candlefl repo on GitHub.
- Clone your fork locally:

      $ git clone git@github.com:<your_username_here>/candlefl.git

- Install Poetry to manage dependencies and virtual environments from https://python-poetry.org/docs/.
- Install the project dependencies using:

      $ poetry install

- To add a new dependency to the project, use:

      $ poetry add <dependency_name>

- Create a branch for local development:

      $ git checkout -b name-of-your-bugfix-or-feature

  Now you can make your changes locally and maintain them on your own branch.
- When you're done making changes, check that your changes pass the tests:

      $ poetry run pytest tests

  If you want to run a specific test file, use:

      $ poetry run pytest <path-to-the-file>

  If your changes are not covered by the tests, please add tests.
- The pre-commit hooks will be run before every commit. If you want to run them manually, use:

      $ pre-commit run --all-files

- Commit your changes and push your branch to GitHub:

      $ git add --all
      $ git commit -m "Your detailed description of your changes."
      $ git push origin <name-of-your-bugfix-or-feature>

- Submit a pull request through the GitHub web interface.
- Once the pull request has been submitted, the continuous integration pipelines on GitHub Actions will be triggered. Ensure that all of them pass so that one of the maintainers can review the request.
Pull Request Guidelines
Before you submit a pull request, check that it meets these guidelines:
- The pull request should include tests.
- Try adding new test cases for new features or enhancements and make changes to the CI pipelines accordingly.
- Modify the existing tests (if required) for the bug fixes.
- If the pull request adds functionality, the docs should be updated. Put your new functionality into a function with a docstring, and add the feature to the list in README.md.
- The pull request should pass all the existing CI pipelines (GitHub Actions) and the new/modified workflows should be added as required.
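As an illustration of the guidelines above, the following sketch shows a small piece of functionality written as a function with a docstring, together with a pytest-style test. The function `split_indices_iid` and its test are hypothetical examples and do not exist in candlefl; they only show the shape a contribution might take.

```python
import random


def split_indices_iid(num_samples, num_agents, seed=0):
    """Shuffle sample indices and deal them into near-equal shards.

    Args:
        num_samples: Total number of samples in the dataset.
        num_agents: Number of shards (one per agent) to create.
        seed: Seed for the shuffle, so that the split is reproducible.

    Returns:
        A list of ``num_agents`` lists of indices. Every index appears in
        exactly one shard, and shard sizes differ by at most one.
    """
    if num_agents <= 0:
        raise ValueError("num_agents must be positive")
    indices = list(range(num_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices round-robin, like cards, to each agent.
    return [indices[i::num_agents] for i in range(num_agents)]


def test_split_indices_iid_covers_every_sample_once():
    shards = split_indices_iid(num_samples=103, num_agents=10)
    assert len(shards) == 10
    assert sorted(i for shard in shards for i in shard) == list(range(103))
    assert max(map(len, shards)) - min(map(len, shards)) <= 1
```

Placed in a file under the tests directory, a test like this would be collected by the pytest command shown in the Get Started steps.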