Federated LoRA adapter aggregation framework — exact aggregation with FedEx-LoRA
Chorus
Federated LoRA fine-tuning with mathematically exact aggregation.
Chorus is a framework for federated fine-tuning of large language models using LoRA adapters. Multiple clients train on their private data, submit adapter deltas to a central server, and receive back aggregated improvements — without sharing any raw data.
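The key insight: standard FedAvg is inexact for LoRA because it averages the A and B matrices separately, and avg(B @ A) != avg(B) @ avg(A). Chorus implements FedEx-LoRA (ACL/ICLR 2025), which provides exact federated aggregation by tracking the residual left over after the SVD truncation and folding it into the base weights.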
How It Works
Client 1 (private data)         Aggregation Server          Client 2 (private data)
┌─────────────────────┐       ┌─────────────────────┐       ┌─────────────────────┐
│ 1. Train LoRA       │       │                     │       │ 1. Train LoRA       │
│ 2. Submit delta   ──┼──POST─┼→ Collect deltas     │←─POST─┼── 2. Submit delta   │
│                     │       │  FedEx-LoRA agg     │       │                     │
│                     │       │  Fold residuals     │       │                     │
│ 3. Pull updated   ←─┼──GET──┼─ Serve result       │──GET──┼→ 3. Pull updated    │
│ 4. Repeat           │       │  WS: round_complete │       │ 4. Repeat           │
└─────────────────────┘       └─────────────────────┘       └─────────────────────┘
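The round lifecycle in the diagram can be mimicked with a toy in-memory model. This is only a sketch of the flow, not the Chorus server: the class name `ToyRoundState`, the shape of the returned event dictionary, and the rule that a repeat submission from the same client replaces its earlier one are all assumptions made for illustration.

```python
class ToyRoundState:
    """In-memory stand-in for the server's round bookkeeping (hypothetical)."""

    def __init__(self, min_deltas=2):
        self.min_deltas = min_deltas
        self.round_id = 0
        self.pending = {}    # client_id -> delta submitted for the open round
        self.completed = []  # ids of rounds that have been aggregated

    def submit(self, client_id, delta):
        """Store a delta; close the round once min_deltas clients have submitted."""
        self.pending[client_id] = delta
        if len(self.pending) < self.min_deltas:
            return None  # still collecting
        finished = self.round_id
        self.completed.append(finished)  # real aggregation would happen here
        self.pending = {}
        self.round_id += 1
        # Assumed event shape, loosely modelled on the round_complete notification.
        return {"event": "round_complete", "round_id": finished}


state = ToyRoundState(min_deltas=3)
for client in ("client-1", "client-2", "client-3"):
    event = state.submit(client, delta=[0.0])
print(event)
```

The point of the sketch is that clients never coordinate with each other: the server alone decides when a round closes, and clients learn about it by polling or through the WebSocket notification.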
Installation
pip install chorus-fl
With optional dependencies:
# For local LoRA training (PEFT + Transformers)
pip install "chorus-fl[peft]"
# For differential privacy
pip install "chorus-fl[privacy]"
# Everything
pip install "chorus-fl[all]"
From source:
git clone https://github.com/varmabudharaju/chorus.git
cd chorus
pip install -e ".[dev]"
Quick Start
1. Start the server
chorus server --model meta-llama/Llama-3.2-3B --min-deltas 3
2. Submit adapters from clients
from chorus import ChorusClient
client = ChorusClient(
    server="http://localhost:8080",
    model_id="meta-llama/Llama-3.2-3B",
)
# After your local LoRA training...
client.submit_delta(adapter_path="./my-adapter")
# Pull the aggregated global adapter
client.pull_latest(output_path="./updated-adapter")
client.close()
3. Run a simulation (no server needed)
# Compare FedAvg vs FedEx-LoRA
chorus simulate --clients 10 --rounds 5 --compare
Why FedEx-LoRA?
LoRA represents each weight update as a low-rank product, ΔW = B @ A. When you naively average the B and A factors separately across clients, the result is not the average of the updates:
avg(B_i @ A_i) != avg(B_i) @ avg(A_i)
FedAvg produces mathematically inexact aggregation for LoRA. FedEx-LoRA fixes this:
- Computes the exact weighted average of the full products B_i @ A_i
- Uses SVD to get the optimal rank-r approximation (Eckart-Young theorem)
- Tracks the residual between exact and approximate results
- Folds residuals into base weights, making the combined result exact
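The four steps above can be checked numerically. The following is a minimal NumPy sketch, not Chorus's implementation: the matrix shapes, the random seed, and the uniform client weights are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
num_clients, d_out, d_in, r = 4, 8, 6, 2

# Each client holds its own LoRA factors B_i (d_out x r) and A_i (r x d_in).
Bs = [rng.normal(size=(d_out, r)) for _ in range(num_clients)]
As = [rng.normal(size=(r, d_in)) for _ in range(num_clients)]
weights = np.full(num_clients, 1.0 / num_clients)  # assumed uniform weights

# Step 1: exact weighted average of the products B_i @ A_i.
exact = sum(w * (B @ A) for w, B, A in zip(weights, Bs, As))

# Naive FedAvg averages the factors separately, which gives a different matrix.
avg_B = sum(w * B for w, B in zip(weights, Bs))
avg_A = sum(w * A for w, A in zip(weights, As))
fedavg_error = np.linalg.norm(exact - avg_B @ avg_A)

# Step 2: best rank-r approximation of the exact average via truncated SVD.
U, S, Vt = np.linalg.svd(exact, full_matrices=False)
B_new = U[:, :r] * S[:r]  # scale the leading columns by their singular values
A_new = Vt[:r, :]

# Step 3: residual between the exact average and its rank-r approximation.
residual = exact - B_new @ A_new

# Step 4: folding the residual into the base weights recovers the exact update.
W_base = rng.normal(size=(d_out, d_in))
W_folded = W_base + residual
combined = W_folded + B_new @ A_new

print(f"FedAvg error:  {fedavg_error:.4f}")
print(f"Residual norm: {np.linalg.norm(residual):.4f}")
print("Exact after folding:", np.allclose(combined, W_base + exact))
```

The residual is generally nonzero because the average of several rank-r products can have rank up to `num_clients * r`, so no rank-r adapter alone can represent it. Folding the residual into the base weights is what keeps the adapter at rank r without losing anything.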
CLI Reference
chorus server
Start the aggregation server.
chorus server --model <model-id> [options]
| Option | Default | Description |
|---|---|---|
| --model | required | Model ID (e.g. meta-llama/Llama-3.2-3B) |
| --port | 8080 | Port to listen on |
| --host | 0.0.0.0 | Host to bind to |
| --data-dir | ./chorus_data | Data directory for storage |
| --strategy | fedex-lora | Aggregation strategy (fedavg or fedex-lora) |
| --min-deltas | 2 | Minimum deltas before aggregation triggers |
| --dp-epsilon | disabled | Server-side differential privacy epsilon |
| --api-key | disabled | API key for auth (can specify multiple times) |
| --base-weights | none | Path to base model weights (.safetensors) |
| --norm-bound | disabled | Max L2 norm for Byzantine defense |
| --outlier-threshold | disabled | Z-score threshold for outlier detection |
| --rate-limit | 0 | Max requests per minute per IP (0 = disabled) |
| -v, --verbose | | Verbose logging |
chorus submit
Submit a LoRA adapter delta to the server.
chorus submit --server <url> --adapter <path> [options]
| Option | Default | Description |
|---|---|---|
| --server | required | Server URL |
| --adapter | required | Path to adapter directory or .safetensors file |
| --model-id | auto | Model ID (auto-detected from server) |
| --client-id | auto | Client identifier |
| --round-id | current | Target round |
| --dp-epsilon | disabled | Local DP epsilon |
| --dataset-size | none | Dataset size for weighted aggregation |
| --api-key | none | API key for authentication |
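To illustrate what --dataset-size is for, here is a small sketch of the usual federated-averaging convention, where each client's weight is proportional to its number of training examples. The README only says the value is used "for weighted aggregation", so the proportional rule and the helper name `aggregation_weights` are assumptions, and how Chorus treats clients that omit the value is not covered here.

```python
def aggregation_weights(dataset_sizes):
    """Return per-client weights proportional to dataset size (assumed scheme)."""
    total = sum(dataset_sizes)
    if total <= 0:
        raise ValueError("dataset sizes must sum to a positive number")
    return [n / total for n in dataset_sizes]


sizes = [5000, 1000, 4000]
weights = aggregation_weights(sizes)
print(weights)  # [0.5, 0.1, 0.4]
```

Under this rule a client with 5000 examples contributes five times as much to the aggregate as a client with 1000.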
chorus pull
Pull the latest aggregated adapter from the server.
chorus pull --server <url> --output <path> [options]
chorus train
Run the full federated training loop (train -> submit -> wait -> pull -> repeat).
chorus train --server <url> --model <hf-model-id> --dataset <dataset> [options]
| Option | Default | Description |
|---|---|---|
| --server | required | Server URL |
| --model | required | HuggingFace model ID |
| --dataset | required | HuggingFace dataset or local path |
| --rounds | infinite | Number of training rounds |
| --lora-rank | 16 | LoRA rank |
| --max-steps | -1 | Max training steps per round (-1 = full epoch) |
| --dp-epsilon | disabled | Local DP epsilon |
chorus simulate
Run a simulated federation with synthetic data.
chorus simulate --clients 10 --rounds 5 --compare
chorus status
Show the current status of a Chorus server.
chorus status --server <url>
chorus export
Export a merged model (base + aggregated adapter) ready for deployment.
chorus export --server <url> --model <hf-model-id> --output ./merged/
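Conceptually, merging an adapter into a base layer adds the low-rank product to the base weight matrix. The sketch below shows the standard LoRA merge formula with the `lora_alpha / r` scaling on a single layer in NumPy. It is an illustration only: the function name `merge_lora` is hypothetical, and whether `chorus export` performs the merge this way or delegates it to PEFT is not stated in this README.

```python
import numpy as np


def merge_lora(W_base, B, A, lora_alpha, r):
    """Return W_base + (lora_alpha / r) * B @ A without modifying the inputs."""
    return W_base + (lora_alpha / r) * (B @ A)


rng = np.random.default_rng(0)
W = rng.normal(size=(8, 6))  # base weight of one linear layer
B = rng.normal(size=(8, 2))  # LoRA B factor, rank 2
A = rng.normal(size=(2, 6))  # LoRA A factor, rank 2
merged = merge_lora(W, B, A, lora_alpha=4, r=2)

# The merged layer gives the same output as base plus adapter applied separately.
x = rng.normal(size=6)
print(np.allclose(merged @ x, W @ x + 2.0 * (B @ (A @ x))))
```

After merging, inference needs no adapter-specific code path, which is why a merged checkpoint is convenient for deployment.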
Python SDK
ChorusClient
from chorus import ChorusClient
client = ChorusClient(
    server="http://localhost:8080",
    model_id="my-model",
    client_id="client-1",  # optional, auto-generated if omitted
    api_key="secret",      # optional, for authenticated servers
    dp_epsilon=1.0,        # optional, local differential privacy
    dp_delta=1e-5,         # optional, DP delta parameter
    dp_max_norm=1.0,       # optional, DP clipping norm
    timeout=120.0,         # optional, HTTP timeout in seconds
)
# Check server status
status = client.status()
# Submit a trained LoRA adapter
result = client.submit_delta(
    adapter_path="./my-adapter",  # PEFT adapter dir or .safetensors
    round_id=None,                # None = current round
    dataset_size=5000,            # for weighted aggregation
)
# Submit raw tensors directly
result = client.submit_tensors(tensors={"layer.lora_A.weight": tensor_a, ...})
# Pull the latest aggregated adapter
client.pull_latest(output_path="./updated-adapter")
# Pull a specific round
client.pull_round(round_id=3, output_path="./round-3-adapter")
# Export merged model (requires chorus[peft])
client.export_model(
    base_model="meta-llama/Llama-3.2-3B",
    output_dir="./merged-model",
)
# Full training loop (requires chorus[peft])
client.train_loop(
    trainer=my_trainer,  # LoRATrainer instance
    rounds=5,
)
# Listen for round completion via WebSocket
for event in client.listen():
    print(f"Round {event['round_id']} complete!")
client.close()
# Or use as context manager:
# with ChorusClient(...) as client:
#     ...
API Endpoints
| Method | Path | Description |
|---|---|---|
| GET | /health | Health check (public, includes ws_clients count) |
| GET | /models/{id}/status | Round state, delta count, latest round |
| POST | /rounds/{round_id}/deltas | Submit LoRA delta (dataset_size param for weighting) |
| GET | /models/{id}/latest | Download latest aggregated adapter |
| GET | /models/{id}/rounds/{round_id} | Download round-specific adapter |
| POST | /models/{id}/base-weights | Upload base model weights |
| GET | /models/{id}/base-weights | Download current base weights |
| GET | /models/{id}/checkpoint | Download base + adapter merged checkpoint |
| WS | /ws/{client_id} | WebSocket for live round notifications |
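If you call the HTTP endpoints without the SDK, a small helper for filling in the path templates keeps URLs consistent. The sketch below copies the paths from the table above. The route names, the helper `endpoint_url`, and the choice to percent-encode path parameters are assumptions, and request bodies and response formats are not documented here, so none are shown.

```python
from urllib.parse import quote

# Path templates copied from the endpoint table; the dictionary keys are made up.
ROUTES = {
    "health": "/health",
    "status": "/models/{id}/status",
    "submit_delta": "/rounds/{round_id}/deltas",
    "latest": "/models/{id}/latest",
    "round": "/models/{id}/rounds/{round_id}",
    "base_weights": "/models/{id}/base-weights",
    "checkpoint": "/models/{id}/checkpoint",
}


def endpoint_url(server, route, **params):
    """Fill a path template and join it to the server URL."""
    # Percent-encode every path parameter, including "/" (an assumption about
    # how the server expects model IDs that contain slashes).
    encoded = {key: quote(str(value), safe="") for key, value in params.items()}
    return server.rstrip("/") + ROUTES[route].format(**encoded)


print(endpoint_url("http://localhost:8080", "round", id="my-model", round_id=3))
# http://localhost:8080/models/my-model/rounds/3
```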
Architecture
chorus/
├── patterns.py # Shared LoRA key patterns
├── exceptions.py # Exception hierarchy (ChorusError, etc.)
├── server/
│ ├── app.py # FastAPI endpoints + auth + async aggregation
│ ├── aggregation.py # FedAvg + FedEx-LoRA (SVD) + Byzantine defenses
│ ├── storage.py # Filesystem storage for deltas, base weights, round state
│ ├── weight_manager.py # Residual folding into base weights
│ ├── ws.py # WebSocket connection manager
│ └── privacy.py # Gaussian DP mechanism + L2 clipping
├── client/
│ ├── sdk.py # ChorusClient (submit, pull, listen, train_loop, export)
│ ├── trainer.py # LoRATrainer wrapper for HF PEFT
│ └── delta.py # LoRA matrix extraction from PEFT adapters
├── cli/
│ └── main.py # Click CLI with error handling
└── simulate/
└── runner.py # Synthetic multi-client federation runner
Security Features
Chorus includes several security mechanisms for production deployments:
- Authentication: Bearer token auth via --api-key (supports multiple keys)
- Differential privacy: Gaussian DP with global L2 clipping at both client and server level
- Byzantine defenses: L2 norm bounding (--norm-bound) and z-score outlier detection (--outlier-threshold) reject malicious or corrupted deltas
- Rate limiting: Per-IP request throttling via --rate-limit
- safetensors only: All weight serialization uses safetensors format (no pickle deserialization)
Note: Chorus serves over HTTP. For production, deploy behind a TLS-terminating reverse proxy (nginx, Caddy, etc.).
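The two Byzantine defenses listed above can be sketched in a few lines. This is an illustration under stated assumptions, not the code in aggregation.py: deltas are modelled as flat lists of floats, the z-score is computed on the L2 norms of the deltas that survive the norm bound, and the function names are hypothetical.

```python
import math
import statistics


def l2_norm(delta):
    """L2 norm of a delta given as a flat list of floats."""
    return math.sqrt(sum(x * x for x in delta))


def filter_deltas(deltas, norm_bound=None, outlier_threshold=None):
    """Return the indices of deltas that pass both optional checks."""
    norms = [l2_norm(d) for d in deltas]
    keep = list(range(len(deltas)))
    if norm_bound is not None:
        # Reject any delta whose norm exceeds the bound.
        keep = [i for i in keep if norms[i] <= norm_bound]
    if outlier_threshold is not None and len(keep) >= 2:
        kept_norms = [norms[i] for i in keep]
        mean = statistics.fmean(kept_norms)
        std = statistics.pstdev(kept_norms)
        if std > 0:
            # Reject deltas whose norm is too many standard deviations from the mean.
            keep = [i for i in keep
                    if abs(norms[i] - mean) / std <= outlier_threshold]
    return keep


deltas = [[1.0, 0.0], [0.0, 1.1], [0.9, 0.0], [0.0, 1.0], [50.0, 50.0]]
print(filter_deltas(deltas, norm_bound=10.0, outlier_threshold=3.0))  # [0, 1, 2, 3]
```

One design point worth knowing: with a population standard deviation, a single outlier among n deltas can reach a z-score of at most sqrt(n - 1), so a threshold of 3.0 can only reject anything once at least 11 deltas are in the round. The norm bound is the defense that still works in small rounds.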
Aggregation Strategies
| Strategy | Exact? | Description |
|---|---|---|
| fedex-lora (default) | Yes | SVD-based exact aggregation with residual folding |
| fedavg | No | Naive independent averaging of A and B matrices |
Configuration Examples
Secure production server
chorus server \
--model meta-llama/Llama-3.2-3B \
--min-deltas 5 \
--api-key $SECRET_KEY_1 \
--api-key $SECRET_KEY_2 \
--dp-epsilon 2.0 \
--norm-bound 10.0 \
--outlier-threshold 3.0 \
--rate-limit 60 \
--base-weights ./base-model.safetensors
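For intuition about what --rate-limit 60 means in the command above, here is a sketch of a per-IP sliding-window limiter. It is not Chorus's implementation: the class name, the 60-second sliding window, and the explicit `now` argument (used so the example is deterministic) are assumptions. Only the "0 = disabled" convention comes from the option table.

```python
from collections import defaultdict, deque


class PerIPRateLimiter:
    """Allow at most max_per_minute requests per IP in any 60-second window."""

    def __init__(self, max_per_minute):
        self.max_per_minute = max_per_minute
        self.hits = defaultdict(deque)  # ip -> timestamps of accepted requests

    def allow(self, ip, now):
        """Return True if a request from `ip` at time `now` (seconds) is allowed."""
        if self.max_per_minute == 0:  # 0 = disabled
            return True
        window = self.hits[ip]
        # Drop timestamps that are 60 seconds old or older.
        while window and now - window[0] >= 60.0:
            window.popleft()
        if len(window) >= self.max_per_minute:
            return False
        window.append(now)
        return True


limiter = PerIPRateLimiter(max_per_minute=2)
print([limiter.allow("10.0.0.1", t) for t in (0.0, 1.0, 2.0, 60.0)])
# [True, True, False, True]
```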
Client with local DP
client = ChorusClient(
    server="http://chorus.internal:8080",
    model_id="meta-llama/Llama-3.2-3B",
    api_key="my-secret-key",
    dp_epsilon=1.0,   # strong local DP
    dp_max_norm=1.0,  # clip before noising
)
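The "clip before noising" step can be sketched as follows. This is a generic Gaussian-mechanism illustration, not the code in privacy.py: it treats the delta as a flat list of floats, takes the clipping norm as the sensitivity, and uses the classical noise calibration sigma = max_norm * sqrt(2 ln(1.25 / delta)) / epsilon, which is derived for epsilon below 1. The calibration Chorus actually uses is not stated in this README, and the function names are hypothetical.

```python
import math
import random


def clip_l2(delta, max_norm):
    """Scale the delta down so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(x * x for x in delta))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [x * scale for x in delta]


def gaussian_sigma(epsilon, dp_delta, max_norm):
    """Classical Gaussian-mechanism noise scale (an assumed calibration)."""
    return max_norm * math.sqrt(2.0 * math.log(1.25 / dp_delta)) / epsilon


def privatize(delta, epsilon, dp_delta=1e-5, max_norm=1.0, rng=None):
    """Clip the delta, then add independent Gaussian noise to every coordinate."""
    rng = rng or random.Random()
    sigma = gaussian_sigma(epsilon, dp_delta, max_norm)
    return [x + rng.gauss(0.0, sigma) for x in clip_l2(delta, max_norm)]


clipped = clip_l2([3.0, 4.0], max_norm=1.0)  # norm 5.0 is scaled down to 1.0
noisy = privatize([3.0, 4.0], epsilon=1.0, rng=random.Random(0))
print(clipped, len(noisy))
```

Clipping has to come first because the noise scale is calibrated to the maximum norm a single delta can have. Without the clip, one large delta would not be covered by the guarantee.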
Full training loop
chorus train \
--server http://localhost:8080 \
--model meta-llama/Llama-3.2-3B \
--dataset wikitext \
--rounds 10 \
--lora-rank 16
Examples
See the examples/ directory:
- quickstart.py: Basic 2-client workflow with synthetic adapters
- health_metrics/federated_health.py: Multi-hospital federated training simulation with DP
Development
git clone https://github.com/varmabudharaju/chorus.git
cd chorus
pip install -e ".[dev]"
# Run tests (165 tests)
pytest tests/ -v
# Run benchmarks
python benchmarks/benchmark.py
Contributing
Contributions are welcome! Here's how to get started:
- Fork the repository
- Create a feature branch (git checkout -b feature/my-feature)
- Make your changes and add tests
- Run the test suite (pytest tests/ -v)
- Submit a pull request
Please open an issue first to discuss significant changes.
License
Apache 2.0 — see LICENSE for the full text.