FLIP - Federated Learning for Imaging Platform library built on NVIDIA FLARE
flip-fl-base
This repository contains the FLIP federated learning base application built on NVIDIA FLARE (NVFLARE). It includes the FL services (server, clients, admin API) and the base application code that users extend with their own training logic.
Quick Start
Prerequisites
- Docker and Docker Compose
- uv (Python package manager)
- AWS CLI configured (for downloading test data)
1. Provision an FL Network
Before running anything, you need to provision a federated learning network. This generates the required certificates, keys, and configuration files:
make nvflare-provision NET_NUMBER=1
This creates:
- Network-specific compose file: deploy/compose-net-1.yml
- Service secrets in workspace/net-1/services/ (gitignored)
You can provision multiple networks with different ports:
make nvflare-provision NET_NUMBER=2 FL_PORT=8004 ADMIN_PORT=8005
Warning: Provisioned files contain cryptographic signatures. Any modification will cause errors. Always re-run provisioning if changes are needed.
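When several networks are provisioned from a script, the command line can be assembled programmatically. The sketch below is a hypothetical helper (`provision_command` is not part of this repository) that builds the `make nvflare-provision` invocation; it assumes only the `NET_NUMBER`, `FL_PORT`, and `ADMIN_PORT` variables that appear in this README.

```python
from typing import List, Optional


def provision_command(net_number: int,
                      fl_port: Optional[int] = None,
                      admin_port: Optional[int] = None) -> List[str]:
    """Build the argument list for `make nvflare-provision`.

    Hypothetical helper: only the Make variables named in this README are used.
    """
    args = ["make", "nvflare-provision", f"NET_NUMBER={net_number}"]
    # Ports are appended only when given, so the Makefile defaults still apply.
    if fl_port is not None:
        args.append(f"FL_PORT={fl_port}")
    if admin_port is not None:
        args.append(f"ADMIN_PORT={admin_port}")
    return args


if __name__ == "__main__":
    # Print the command instead of running it, so nothing is provisioned by accident.
    print(" ".join(provision_command(2, fl_port=8004, admin_port=8005)))
```

The returned list can be passed to `subprocess.run(args, check=True)` from the repository root if the script should actually run the provisioning step.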
2. Build Docker Images
make build NET_NUMBER=1
3. Start the FL Network
make up NET_NUMBER=1
This starts:
- fl-server-net-1: Aggregation server
- fl-client-1-net-1, fl-client-2-net-1: Training clients
- flip-fl-api-net-1: FastAPI admin interface
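The service names appear to follow a pattern based on the network number. As a rough sketch, assuming the same pattern holds for other values of NET_NUMBER (the README only shows network 1), the expected names could be generated like this; `service_names` and its `num_clients` parameter are hypothetical.

```python
from typing import List


def service_names(net_number: int, num_clients: int = 2) -> List[str]:
    """List the service names expected for one FL network.

    Assumption: names for other networks follow the net-1 pattern shown above.
    """
    names = [f"fl-server-net-{net_number}"]
    # Clients are numbered from 1, matching fl-client-1 and fl-client-2.
    names += [f"fl-client-{i}-net-{net_number}" for i in range(1, num_clients + 1)]
    names.append(f"flip-fl-api-net-{net_number}")
    return names


if __name__ == "__main__":
    for name in service_names(1):
        print(name)
```

A list like this can be compared against the output of `docker ps --format "{{.Names}}"` to confirm that every service came up.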
To stop the network:
make down NET_NUMBER=1
To clean up images and containers:
make clean NET_NUMBER=1
Development Mode
DEV mode lets you test your FL applications locally before deploying to production.
Configure Environment
Edit .env.development:
LOCAL_DEV=true
DEV_IMAGES_DIR=../data/accession-resources # Path to your images
DEV_DATAFRAME=../data/sample_get_dataframe.csv # Path to your dataframe
JOB_TYPE=standard
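Before starting the simulator it can help to confirm that .env.development defines the four variables shown above. The following is a minimal sketch using a simplified KEY=VALUE parser; the helper names are hypothetical, and the way Make or Docker Compose treats inline comments may differ from this parser.

```python
from typing import Dict, List

EXPECTED_KEYS = ("LOCAL_DEV", "DEV_IMAGES_DIR", "DEV_DATAFRAME", "JOB_TYPE")


def parse_env(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and `#` comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        # Drop a trailing inline comment such as "value # Path to your images".
        line = raw.split(" #", 1)[0].strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        values[key.strip()] = value.strip()
    return values


def missing_keys(values: Dict[str, str]) -> List[str]:
    """Return the expected variables that are not defined."""
    return [key for key in EXPECTED_KEYS if key not in values]


if __name__ == "__main__":
    with open(".env.development", encoding="utf-8") as handle:
        env = parse_env(handle.read())
    print("Missing variables:", missing_keys(env) or "none")
```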
Add Your Application Files
Place your files in src/<JOB_TYPE>/app/custom/:
- trainer.py - Training logic (FLIP_TRAINER executor)
- validator.py - Validation logic (FLIP_VALIDATOR executor)
- models.py - Model definitions (get_model function)
- config.json - Hyperparameters (requires LOCAL_ROUNDS and LEARNING_RATE)
- transforms.py - Data transforms (optional)
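Since config.json must define LOCAL_ROUNDS and LEARNING_RATE, a small check can catch a missing key before a run starts. This is only a sketch: `load_hyperparameters` is a hypothetical helper, and the values in the usage example are illustrative rather than recommended settings.

```python
import json

REQUIRED_KEYS = ("LOCAL_ROUNDS", "LEARNING_RATE")


def load_hyperparameters(text: str) -> dict:
    """Parse config.json content and verify the two required keys exist."""
    config = json.loads(text)
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise ValueError("config.json is missing: " + ", ".join(missing))
    return config


if __name__ == "__main__":
    # Illustrative values only; pick settings that suit your model.
    example = '{"LOCAL_ROUNDS": 1, "LEARNING_RATE": 0.001}'
    print(load_hyperparameters(example))
```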
Run the Simulator
make run-container
This runs the NVFLARE simulator in Docker with 2 clients, mounting your app folder for live changes.
Testing
Download Test Data
Download x-ray classification test data (requires AWS S3 access):
make download-xrays-data
Download spleen segmentation test data (requires AWS S3 access):
make download-spleen-data
Download model checkpoints for evaluation tests:
make download-checkpoints
Run Integration Tests
Test different job types with the x-ray and spleen datasets:
# Standard federated training (classification task)
make test-xrays-standard
# Standard federated training (segmentation task)
make test-spleen-standard
# Model evaluation pipeline (requires model checkpoint file)
make test-spleen-evaluation
# Diffusion model training
make test-spleen-diffusion
# Run all integration tests
make test
Run Unit Tests
make unit-test
Manage Test Applications
Copy the spleen test app to your dev folder:
make copy-spleen-app
Save your changes back to the test folder:
make save-spleen-app
Project Structure
├── src/ # FL application types
│ ├── standard/ # Standard FedAvg training
│ ├── evaluation/ # Distributed model evaluation
│ ├── diffusion_model/ # Two-stage VAE + diffusion training
│ └── fed_opt/ # Custom federated optimization
├── fl_services/ # NVFLARE service definitions
│ ├── fl-base/ # Base Docker image
│ ├── fl-api-base/ # FastAPI admin service
│ ├── fl-client/ # Base FL client service
│ └── fl-server/ # Base FL server service
├── deploy/ # Docker compose files and templates
├── workspace/ # Provisioned secrets (gitignored)
├── tests/ # Integration test applications
│ ├── examples/ # Example applications for integration testing
│ └── unit/ # Unit tests
└── .env.development # Local environment configuration
Job Types
Set via JOB_TYPE environment variable:
| Type | Description |
|---|---|
| standard | Federated training with FedAvg aggregation (default) |
| evaluation | Distributed model evaluation without training |
| diffusion_model | Two-stage training (VAE encoder + diffusion) |
| fed_opt | Custom federated optimization |
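To illustrate how JOB_TYPE relates to the src/<JOB_TYPE>/app/custom/ folder mentioned under Development Mode, here is a minimal sketch that resolves the folder from the environment and falls back to standard, the documented default. The `custom_dir` helper is hypothetical and is not how the repository itself reads the variable.

```python
import os
from pathlib import PurePosixPath
from typing import Mapping, Optional

JOB_TYPES = ("standard", "evaluation", "diffusion_model", "fed_opt")


def custom_dir(env: Optional[Mapping[str, str]] = None) -> PurePosixPath:
    """Return src/<JOB_TYPE>/app/custom for the configured job type."""
    env = os.environ if env is None else env
    job_type = env.get("JOB_TYPE", "standard")  # standard is the default type
    if job_type not in JOB_TYPES:
        raise ValueError(f"Unknown JOB_TYPE {job_type!r}; expected one of {JOB_TYPES}")
    return PurePosixPath("src") / job_type / "app" / "custom"


if __name__ == "__main__":
    print(custom_dir())
```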
NVFLARE App Structure
An NVFLARE app requires this structure:
app/
├── config/
│ ├── config_fed_server.json
│ └── config_fed_client.json
└── custom/
├── trainer.py
├── validator.py
├── models.py
└── config.json
For different configurations per client/server, use multiple app folders with a meta.json containing a deploy_map. See NVFLARE documentation.
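As a rough illustration of a deploy_map, the sketch below writes out a meta.json that assigns one app folder to the server and another to the clients. The job name, folder names, and site names are placeholders rather than names used by FLIP, and a real NVFLARE job may need further fields, so check the NVFLARE documentation before relying on it.

```python
import json
from typing import Dict, List


def build_meta(job_name: str, deploy_map: Dict[str, List[str]]) -> str:
    """Return meta.json text mapping each app folder to the sites that run it."""
    if not deploy_map:
        raise ValueError("deploy_map must name at least one app folder")
    meta = {"name": job_name, "deploy_map": deploy_map}
    return json.dumps(meta, indent=2)


if __name__ == "__main__":
    # Placeholder folder and site names; replace with those of your network.
    print(build_meta("example_job", {
        "app_server": ["server"],
        "app_client": ["site-1", "site-2"],
    }))
```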
Application and tutorials
Applications that run on FLIP combine files from the chosen app (the custom and config folders described above) with files that the user uploads through the UI. The uploaded files are customisable by the user, and examples compatible with each type of app will be available in the tutorials.
The apps are compatible with the following tutorials:
| App | Tutorial |
|---|---|
| standard | image_segmentation/3d_spleen_segmentation |
| diffusion_model | image_synthesis/latent_diffusion_model |
| fed_opt | image_segmentation/3d_spleen_segmentation |
| evaluation | image_evaluation/3d_spleen_segmentation |
| standard | image_classification/xray_classification |
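The compatibility table can also be read in the other direction, from tutorial to app. A small sketch of that lookup, with the table transcribed into a dictionary (the names `APP_TUTORIALS` and `apps_for_tutorial` are hypothetical):

```python
from typing import Dict, List

# Transcribed from the app / tutorial table in this README.
APP_TUTORIALS: Dict[str, List[str]] = {
    "standard": [
        "image_segmentation/3d_spleen_segmentation",
        "image_classification/xray_classification",
    ],
    "diffusion_model": ["image_synthesis/latent_diffusion_model"],
    "fed_opt": ["image_segmentation/3d_spleen_segmentation"],
    "evaluation": ["image_evaluation/3d_spleen_segmentation"],
}


def apps_for_tutorial(tutorial: str) -> List[str]:
    """Return the apps listed as compatible with a tutorial, sorted by name."""
    return sorted(app for app, tutorials in APP_TUTORIALS.items() if tutorial in tutorials)


if __name__ == "__main__":
    print(apps_for_tutorial("image_segmentation/3d_spleen_segmentation"))
```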
User Application Requirements
The standard application requires:
| File | Description |
|---|---|
| trainer.py | Training logic with FLIP_TRAINER class inheriting from Executor |
| validator.py | Validation logic with FLIP_VALIDATOR class inheriting from Executor |
| models.py | Model definitions with get_model() function |
| config.json | Must include LOCAL_ROUNDS and LEARNING_RATE |
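A quick pre-flight check for the four required files might look like the sketch below. It only tests that the files exist and does not inspect their contents; `missing_app_files` is a hypothetical helper, and the path in the usage example assumes the standard job type.

```python
from pathlib import Path
from typing import List, Union

REQUIRED_FILES = ("trainer.py", "validator.py", "models.py", "config.json")


def missing_app_files(custom_dir: Union[str, Path]) -> List[str]:
    """Return the required files that are absent from the custom folder."""
    root = Path(custom_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_app_files("src/standard/app/custom")
    print("Missing files:", missing or "none")
```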
Production Testing via GitHub Actions
Pull requests automatically push to a dev S3 bucket for testing:
s3://flipdev/base-application-dev/pull-requests/<PR_NUMBER>/src/
To test on the FLIP platform, update FL_APP_BASE_BUCKET in the flip repo environment variables to point to your PR's bucket.
S3 Bucket Mounting (Optional)
For automatic sync between local development and S3:
- Install s3fs
- Configure credentials:
echo ACCESS_KEY_ID:SECRET_ACCESS_KEY > ~/.passwd-s3fs
chmod 600 ~/.passwd-s3fs
- Mount the bucket:
s3fs flip:/base-application-dev/src/standard/app/ ./app/ -o passwd_file=${HOME}/.passwd-s3fs
For automatic mounting on boot, add to /etc/fstab:
flip <PATH_TO_APP>/app fuse.s3fs _netdev,allow_other 0 0
Test with mount -a before relying on it.
CI/CD
The CI/CD workflows authenticate to AWS through GitHub OIDC, so no long-lived AWS keys are required. They assume an IAM role whose policy allows S3 operations.
- PR to any branch: Pushes to dev S3 bucket for testing on AWS dev account:
  - (dev) s3://flipdev/base-application-dev/pull-requests/<PR_NUMBER>/src/
- Merge to develop: Syncs src/ to S3 buckets on AWS dev and staging accounts:
  - (dev) s3://flipdev/base-application-dev/src/
  - (staging) s3://flipstag/base-application/src/
- Merge to main: Syncs src/ to S3 bucket in AWS prod account:
  - (prod) s3://flipprod/base-application/src/
Warning: Never manually sync to the production bucket.
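The branch-to-bucket rules above can be summarised as a small lookup. This sketch only mirrors the destinations listed in this section; the event labels (`pull_request`, `merge:develop`, `merge:main`) and the `sync_targets` function are hypothetical and do not correspond to the actual workflow files.

```python
from typing import List, Optional


def sync_targets(event: str, pr_number: Optional[int] = None) -> List[str]:
    """Return the S3 destinations that a CI event syncs src/ to."""
    if event == "pull_request":
        if pr_number is None:
            raise ValueError("pr_number is required for pull_request events")
        return [f"s3://flipdev/base-application-dev/pull-requests/{pr_number}/src/"]
    if event == "merge:develop":
        return [
            "s3://flipdev/base-application-dev/src/",
            "s3://flipstag/base-application/src/",
        ]
    if event == "merge:main":
        return ["s3://flipprod/base-application/src/"]
    raise ValueError(f"Unknown event: {event!r}")


if __name__ == "__main__":
    # 123 is a made-up pull request number used purely for illustration.
    print(sync_targets("pull_request", pr_number=123))
```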
Makefile Reference
Network Management
| Command | Description |
|---|---|
| make nvflare-provision NET_NUMBER=X | Provision FL network X |
| make build NET_NUMBER=X | Build Docker images for network X |
| make up NET_NUMBER=X | Start FL network X |
| make down NET_NUMBER=X | Stop FL network X |
| make clean NET_NUMBER=X | Remove containers and images |
Development
| Command | Description |
|---|---|
| make run-container | Run NVFLARE simulator in Docker |
Testing Commands
| Command | Description |
|---|---|
| make unit-test | Run pytest unit tests |
| make test-spleen-standard | Test standard job with spleen data |
| make test-spleen-evaluation | Test evaluation job with spleen data |
| make test-spleen-diffusion | Test diffusion model with spleen data |
| make test | Run all integration tests |
Data Management
| Command | Description |
|---|---|
| make download-spleen-data | Download spleen test images from S3 |
| make download-checkpoints | Download model checkpoints from S3 |
| make copy-spleen-app | Copy test app to dev folder |
| make save-spleen-app | Save dev changes to test folder |
| make pull-spleen-app | Pull latest app from tutorials repo |
File details
Details for the file flip_utils-0.1.0.tar.gz.
File metadata
- Download URL: flip_utils-0.1.0.tar.gz
- Upload date:
- Size: 44.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: uv/0.10.4 {"installer":{"name":"uv","version":"0.10.4","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 5bb6b07410da7f9b18bf56f36d7680714963308e0bd691511425b5a7274190ac |
| MD5 | bb005631876327ed7a72741f21a4079d |
| BLAKE2b-256 | 6ca488a99a9dd0ef92c01664e79a8db090bbc04389147aa1ad35f92c6888ce25 |
File details
Details for the file flip_utils-0.1.0-py3-none-any.whl.
File metadata
- Download URL: flip_utils-0.1.0-py3-none-any.whl
- Upload date:
- Size: 82.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: uv/0.10.4 {"installer":{"name":"uv","version":"0.10.4","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 658bf9ce07700fff241d50a54f6b7d2435c618eba56ca3d268aacff08af8a1cc |
| MD5 | 13b948f73224bc85669cd594d66ac2a9 |
| BLAKE2b-256 | 8a66dfe6f3447fb63e5a392f29f4b21e08a93063d008f105f0c534dc741c986e |