Local-first dataset toolkit for multimodal federated learning artifacts (partition/feature/simulation)

fedops-dataset

fedops-dataset is a local-first dataset toolkit for multimodal federated learning (FL) experiments (FedMS2-v8 style).

It supports:

  • raw data bootstrap (fetch-raw)
  • partition/feature/simulation generation (create-v8)
  • Python API for runtime-driven usage (FedOpsLocalDataset)

Python requirement: >=3.8

Install

pip install fedops-dataset

Local-First Quickstart

1) Fetch or setup raw data

# CREMA-D
fedops-dataset fetch-raw --dataset crema_d --data-root /path/to/fed_multimodal/data

# PTB-XL
fedops-dataset fetch-raw --dataset ptb-xl --data-root /path/to/fed_multimodal/data

# Hateful Memes (auto-download from HF repo)
fedops-dataset fetch-raw \
  --dataset hateful_memes \
  --data-root /path/to/fed_multimodal/data \
  --hateful-memes-repo-id neuralcatcher/hateful_memes \
  --hateful-memes-revision main

# Hateful Memes (manual prepared source folder)
fedops-dataset fetch-raw \
  --dataset hateful_memes \
  --data-root /path/to/fed_multimodal/data \
  --hateful-memes-source-dir /path/to/hateful_memes_source \
  --hateful-memes-mode symlink

2) Validate raw roots

fedops-dataset check-raw-datasets \
  --data-root /path/to/fed_multimodal/data \
  --hateful-memes-root /path/to/fed_multimodal/data/hateful_memes

3) Generate v8 artifacts (alpha, ps = sample missing rate, pm = modality missing rate)

# Dry run first
fedops-dataset create-v8 \
  --dataset hateful_memes \
  --alpha 0.1 \
  --sample-missing-rate 0.2 \
  --modality-missing-rate 0.8 \
  --repo-root /path/to/fed-multimodal \
  --data-root /path/to/fed_multimodal/data \
  --hateful-memes-root /path/to/fed_multimodal/data/hateful_memes \
  --dry-run

# Real run
fedops-dataset create-v8 \
  --dataset hateful_memes \
  --alpha 50 \
  --sample-missing-rate 0.2 \
  --modality-missing-rate 0.8 \
  --repo-root /path/to/fed-multimodal \
  --data-root /path/to/fed_multimodal/data \
  --hateful-memes-root /path/to/fed_multimodal/data/hateful_memes

Note on alpha:

  • Both --alpha 5.0 and --alpha 50 resolve to the artifact token alpha50.
  • --alpha 0.1 resolves to alpha01.
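
The mapping above can be illustrated with a small sketch. The rule shown here (drop the decimal point from the value's string form) is an assumption inferred from the two examples, not the package's actual implementation; `alpha_token` is a hypothetical helper:

```python
def alpha_token(alpha):
    """Hypothetical helper: derive the artifact token by stripping
    the decimal point from the alpha value's string form."""
    return "alpha" + str(alpha).replace(".", "")

print(alpha_token(5.0))  # alpha50
print(alpha_token(50))   # alpha50
print(alpha_token(0.1))  # alpha01
```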

Python API (Runtime-Driven)

Direct local usage

from fedops_dataset import FedOpsLocalDataset

ds = FedOpsLocalDataset(
    dataset="hateful_memes",
    alpha=0.1,
    sample_missing_rate=0.2,   # ps
    modality_missing_rate=0.8, # pm
    repo_root="/path/to/fed-multimodal",
    data_root="/path/to/fed_multimodal/data",
    hateful_memes_root="/path/to/fed_multimodal/data/hateful_memes",
)

ds.prepare(dry_run=False)
partition = ds.load_partition()
simulation = ds.load_simulation()
client0_records = ds.client_records(0, use_simulation=True)

Flower-style runtime config usage

from fedops_dataset import FedOpsLocalDataset

run_config = {
    "repo-root": "/path/to/fed-multimodal",
    "data-root": "/path/to/fed_multimodal/data",
    "hateful-memes-root": "/path/to/fed_multimodal/data/hateful_memes",
}

# Simulation mode example (Flower simulation engine)
node_config = {"partition-id": 0, "num-partitions": 10}

ds = FedOpsLocalDataset.from_runtime_config(
    dataset="crema_d",
    alpha=0.1,
    sample_missing_rate=0.2,
    modality_missing_rate=0.2,
    run_config=run_config,
    node_config=node_config,
)

mode = ds.node_mode(node_config)  # "simulation"
records = ds.client_records_from_node_config(node_config, use_simulation=True)

Path Semantics

  • Simulation mode:
    • Detected when node_config contains both partition-id and num-partitions.
    • Client records are resolved from partition-id.
  • Deployment mode:
    • If node_config contains data-path, it is used as the runtime data root.
    • Each node can point to its own local data path.
  • No hardcoded paths are required:
    • Pass run_config/node_config, CLI args, or environment variables.
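
The detection rule above can be sketched as follows. `detect_node_mode` is a hypothetical stand-in for the library's internal logic, shown only to make the rule concrete:

```python
def detect_node_mode(node_config):
    """Hypothetical sketch of the mode-detection rule described above."""
    # Simulation: Flower-style node_config carrying partition info.
    if "partition-id" in node_config and "num-partitions" in node_config:
        return "simulation"
    # Deployment: no partition info; data-path (if present) supplies
    # the runtime data root for this node.
    return "deployment"

print(detect_node_mode({"partition-id": 0, "num-partitions": 10}))  # simulation
print(detect_node_mode({"data-path": "/mnt/node0/data"}))           # deployment
```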

Environment Variables

export FEDOPS_REPO_ROOT=/path/to/fed-multimodal
export FEDOPS_OUTPUT_DIR=/path/to/fed-multimodal/fed_multimodal/output
export FEDOPS_DATA_ROOT=/path/to/fed_multimodal/data
export HATEFUL_MEMES_ROOT=/path/to/fed_multimodal/data/hateful_memes
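
A minimal sketch of how a launcher script might consume these variables, falling back to explicit defaults when one is unset. `resolve_roots` and its fallback paths are hypothetical illustrations, not part of the package API:

```python
import os

def resolve_roots(defaults=None):
    """Hypothetical helper: collect FedOps paths from the environment,
    falling back to explicitly supplied defaults."""
    defaults = defaults or {}
    env_keys = {
        "repo_root": "FEDOPS_REPO_ROOT",
        "output_dir": "FEDOPS_OUTPUT_DIR",
        "data_root": "FEDOPS_DATA_ROOT",
        "hateful_memes_root": "HATEFUL_MEMES_ROOT",
    }
    return {
        name: os.environ.get(env, defaults.get(name))
        for name, env in env_keys.items()
    }

roots = resolve_roots({"data_root": "/path/to/fed_multimodal/data"})
```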

Optional HF Artifact Client

FedOpsDatasetClient remains available if you also host artifacts in a Hugging Face dataset repo. It is optional for local/original-data mode.

Maintainer Release

cd fedops_dataset
export TWINE_USERNAME=__token__
export TWINE_PASSWORD=<pypi-token>
./scripts/publish_pypi.sh
