Hafnia
Python SDK for communication with the Hafnia Platform.
The hafnia Python SDK and CLI are a collection of tools for creating and running model trainer packages on the Hafnia Platform.
The package includes the following interfaces:
- cli: A Command Line Interface (CLI) to 1) configure/connect to Hafnia's Training-aaS, 2) manage datasets, dataset recipes and trainer packages, and 3) build and launch trainer packages locally.
- hafnia: A Python package including HafniaDataset to manage datasets, DatasetRecipe to compose reproducible dataset transformations, and HafniaLogger for experiment tracking.
The Concept: Training as a Service (Training-aaS)
Training-aaS is the concept of training models on the Hafnia platform on large and hidden datasets. Hidden datasets are datasets that can be used for training, but are not available for download or direct access.
This is key to the Hafnia platform: hidden datasets ensure data privacy and allow models to be trained compliantly and ethically by third parties (you).
The Training-aaS concept involves packaging your custom training project as a trainer package and using the package to train models on the hidden datasets.
To support local development of a trainer package, we have introduced a sample dataset for each dataset available in the Hafnia data library. The sample dataset is a small, anonymized subset of the full dataset and is available for download.
With the sample dataset, you can seamlessly switch between local development and Training-aaS. Locally, you can create, validate and debug your trainer package. The trainer package is then launched with Training-aaS, where the package runs on the full dataset and can be scaled to run on multiple GPUs and instances if needed.
Quick Start: No-Code Model Training
To demonstrate the concept of Training-aaS, we will first show how to launch model training using the Hafnia Training-aaS platform - without writing any code - using a pre-built public trainer package.
Steps:
- Sign in
  Sign in to the Hafnia Platform.
- Access the Dashboard
  Go to the dashboard, select Training-aaS and click "Create Experiment".
- Select Dataset
  Choose your target dataset (e.g., coco-2017 or midwest-vehicle-detection).
- Select Trainer Package
  Use the public trainer package provided by Hafnia. Select the "Public Trainers" tab and choose the "Object Detection Trainer" package. You can also upload your own trainer package; we will describe that later.
- Configure Training
  - Training command: python scripts/train.py
  - Configuration: Select "Free Tier" or "Professional" based on your needs
- Launch & Monitor
  Click "Create Experiment" and monitor progress in the dashboard.
That's it! You have successfully launched an object detection model training experiment using the Hafnia Training-aaS platform.
For default training parameters, the trainer package converges in approximately 4 hours on the midwest-vehicle-detection dataset using the "Free Tier" configuration.
Installation and Configuration
To use the CLI for managing datasets and trainer packages — and to load datasets locally with the
Python SDK — install hafnia and configure it with your API key.
- Install hafnia with your favorite Python package manager:

      # With the uv package manager
      uv add hafnia

- Sign in to the Hafnia Platform.

- Create an API key for Training-aaS. For instructions, follow this guide. Copy the key and save it for later use.

- From a terminal, configure your machine to access Hafnia:

      # Start configuration with
      hafnia configure
      # You are then prompted:
      Alias:  # Press [Enter] to skip (a personal label only, not tied to your Hafnia account)
      Hafnia API Key:  # Paste your Hafnia API key
      Hafnia Platform URL [default https://api.hafnia.milestonesys.com]:  # Press [Enter] to use the default

- Download mnist from a terminal to verify that your configuration is working:

      hafnia dataset download mnist --force
CLI command surface
Once configured, the hafnia CLI exposes the following command groups. Each group has a --help
flag that shows the available subcommands and options:
hafnia configure                                             # Interactive first-time setup (profile name, API key, URL)
hafnia clear                                                 # Remove all stored configuration
hafnia profile ls | active | use | rm | create               # Manage local profiles
hafnia dataset ls | download | delete                        # Manage datasets on the platform
hafnia recipe ls | create | rm                               # Manage dataset recipes on the platform
hafnia trainer ls | create | update | create-zip | view-zip  # Manage trainer packages
hafnia experiment ls | create | environments                 # Launch and inspect experiments
hafnia runc build | build-local | launch-local               # Build and run trainer packages locally
Run hafnia <group> --help (and hafnia <group> <subcommand> --help) to see the full set of
options for any command.
Trainer Packages: Bring Your Own Training Code
When the public trainer packages are not enough — for example because you want a different model architecture, a custom training loop, or your own evaluation metrics — you can package your own training script as a trainer package and launch it on Training-aaS just like the public ones.
We provide two reference trainer packages that serve as templates for creating and structuring your own trainers:
- trainer-classification — image classification.
- trainer-object-detection — object detection.
Each repository contains additional information on how to structure your trainer package, use the
HafniaLogger, load a dataset and launch the trainer on the Hafnia platform.
Managing trainer packages from the CLI
Trainer packages can be created and updated on the platform directly from the CLI:
hafnia trainer ls # list trainer packages on the platform
hafnia trainer create ../trainer-classification # upload a new trainer package
hafnia trainer update <trainer-id> ../trainer-classification # push a new version
hafnia trainer view-zip trainer.zip # inspect the contents of a trainer.zip
For a better understanding, we advise visiting the trainer package repositories above.
Build and run a trainer.zip locally
To test trainer-package compatibility with Hafnia cloud before uploading, build and run the job locally:
# Create 'trainer.zip' from the root folder of your trainer project (e.g. '../trainer-classification')
hafnia trainer create-zip ../trainer-classification
# Build the docker image locally from a 'trainer.zip' file
hafnia runc build-local trainer.zip
# Execute the docker image locally with a desired dataset
hafnia runc launch-local --dataset mnist "python scripts/train.py"
The hafnia Python Package
The hafnia Python package provides everything needed to build trainer packages: dataset loading
and manipulation, custom-dataset construction, recipe-based dataset composition, torch integration,
experiment tracking and inference benchmarking. Each feature has a runnable script under
examples/.
Loading datasets with HafniaDataset
HafniaDataset is the main entry point for loading and manipulating datasets. The same
HafniaDataset.from_name call returns the sample dataset locally and the full dataset when
running under Training-aaS — so a training script does not need to change between the two
environments.
from hafnia.dataset.hafnia_dataset import HafniaDataset, Sample

# Load by name (downloads to .data/datasets/ on first call)
dataset = HafniaDataset.from_name("midwest-vehicle-detection", version="1.0.0")
# Or load from a local path
# dataset = HafniaDataset.from_path(path_dataset)

dataset.print_stats()
dataset_train = dataset.create_split_dataset("train")

# Iterate samples
for sample_dict in dataset:
    sample = Sample(**sample_dict)
    image = sample.read_image()
    print(sample.sample_index, sample.bboxes)
    break
dataset.info carries the dataset metadata (DatasetInfo with TaskInfo per task) and
dataset.samples is a Polars DataFrame whose primitive columns (classifications, bboxes,
bitmasks, polygons) mirror the corresponding Sample fields.
Available datasets (and their sample variants) are listed in the data library, including metadata and a description for each dataset.
For a deeper walkthrough of HafniaDataset, see
examples/example_hafnia_dataset.py.
Composing datasets with DatasetRecipe
A DatasetRecipe is a serializable specification of a dataset and the operations applied to it
(shuffle, select, merge, split, filter, remove classes, ...). Recipes are not executed when defined
— call .build() to materialize a HafniaDataset. This makes recipes ideal for sharing a dataset
configuration across local development and Training-aaS, and for combining multiple sources into a
single training dataset.
from hafnia.dataset.dataset_recipe.dataset_recipe import DatasetRecipe
recipe = DatasetRecipe.from_name(name="mnist", version="1.0.0").shuffle().select_samples(n_samples=20)
dataset = recipe.build()
# Use this function to upload the recipe and make it available for Training-aaS experiments
recipe.as_platform_recipe(recipe_name="example-mnist-recipe", overwrite=True)
Recipes can also be managed on the platform via the CLI: hafnia recipe ls | create | rm.
See examples/example_dataset_recipe.py for a full walkthrough including merging datasets and class-level operations.
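For intuition about the lazy design, here is an illustrative stand-alone sketch (not hafnia's implementation; ToyRecipe is a made-up class) of a recipe as a recorded chain of operations that only executes on .build():

```python
import random


class ToyRecipe:
    """Records operations without executing them; runs them on .build()."""

    def __init__(self, source):
        self.source = source  # callable that produces the raw data
        self.operations = []  # functions applied in order at build time

    def shuffle(self, seed=0):
        rng = random.Random(seed)
        self.operations.append(lambda data: rng.sample(data, len(data)))
        return self  # chainable, mirroring DatasetRecipe's fluent style

    def select_samples(self, n_samples):
        self.operations.append(lambda data: data[:n_samples])
        return self

    def build(self):
        # Only here does any work happen
        data = self.source()
        for operation in self.operations:
            data = operation(data)
        return data


# Defining the recipe runs nothing; build() materializes the result
recipe = ToyRecipe(lambda: list(range(100))).shuffle(seed=42).select_samples(n_samples=5)
samples = recipe.build()
```

Because the chain is just recorded data until .build() is called, such a specification can be serialized and shipped to another environment, which is what makes recipes shareable between local runs and Training-aaS.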
Bringing your own data: custom HafniaDataset
If you have data that is not yet on the Hafnia platform, you can construct a HafniaDataset
directly from images and annotations. See
examples/example_custom_dataset.py for an end-to-end example
that builds a HafniaDataset from a YOLO-formatted directory using Sample, Bbox and
DatasetInfo. Built-in importers such as HafniaDataset.from_yolo_format are also available for
common formats.
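For reference, YOLO-format labels store one object per line as class_id x_center y_center width height with coordinates normalized to [0, 1]. A minimal stand-alone parser (independent of hafnia's built-in importer; parse_yolo_labels is a hypothetical helper shown only for illustration) could look like:

```python
def parse_yolo_labels(label_text, image_width, image_height):
    """Parse YOLO-format label lines into absolute-pixel bounding boxes."""
    boxes = []
    for line in label_text.strip().splitlines():
        class_id, cx, cy, w, h = line.split()
        cx, cy, w, h = (float(v) for v in (cx, cy, w, h))
        boxes.append({
            "class_id": int(class_id),
            # Convert normalized center/size to absolute top-left x/y plus size
            "x": (cx - w / 2) * image_width,
            "y": (cy - h / 2) * image_height,
            "width": w * image_width,
            "height": h * image_height,
        })
    return boxes


# Two objects on a 100x200 image
labels = "0 0.5 0.5 0.2 0.4\n1 0.25 0.25 0.1 0.1"
boxes = parse_yolo_labels(labels, image_width=100, image_height=200)
```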
Torch dataloader
For torch-based training scripts, a dataset is typically used together with a dataloader that
performs augmentation and batching.
examples/example_torchvision_dataloader.py shows how
to load a dataset, apply augmentations using torchvision.transforms.v2, visualize samples with
torch_helpers.draw_image_and_targets and combine TorchvisionDataset with TorchVisionCollateFn
in a torch.utils.data.DataLoader:
# Load HafniaDataset (sample dataset locally, full dataset under Training-aaS)
dataset = HafniaDataset.from_name("midwest-vehicle-detection", version="1.0.0")
dataset_train = dataset.create_split_dataset("train")
dataset_test = dataset.create_split_dataset("test")
train_transforms = v2.Compose([
    v2.RandomResizedCrop(size=(224, 224), antialias=True),
    v2.RandomHorizontalFlip(p=0.5),
    v2.ToDtype(torch.float32, scale=True),
    v2.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
train_dataset = torch_helpers.TorchvisionDataset(dataset_train, transforms=train_transforms)
collate_fn = torch_helpers.TorchVisionCollateFn()
train_loader = DataLoader(train_dataset, batch_size=8, shuffle=True, collate_fn=collate_fn)
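A collate function's job is to turn a list of per-sample (image, target) pairs into one batch. The generic pattern can be sketched as follows (this is an illustration of the idea, not hafnia's TorchVisionCollateFn, which additionally handles tensor stacking):

```python
def simple_collate(batch):
    """Turn a list of (image, target) pairs into (images, targets) lists.

    Detection targets are variable-length (different numbers of boxes per
    image), so they are kept as a list rather than stacked into one array.
    """
    images = [image for image, _ in batch]
    targets = [target for _, target in batch]
    return images, targets


# Example: three samples with differing numbers of boxes
batch = [
    ("img0", {"boxes": [[0, 0, 10, 10]]}),
    ("img1", {"boxes": []}),
    ("img2", {"boxes": [[1, 1, 5, 5], [2, 2, 6, 6]]}),
]
images, targets = simple_collate(batch)
```

This is why detection dataloaders need a custom collate_fn at all: the default collate assumes every sample has the same shape, which does not hold for per-image box lists.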
Experiment tracking with HafniaLogger
The HafniaLogger is an important part of the trainer and enables you to track, log and
reproduce your experiments.
When integrated into your training script, the HafniaLogger is responsible for collecting:
- Trained Model: The model trained during the experiment
- Model Checkpoints: Intermediate model states saved during training
- Experiment Configurations: Hyperparameters and other settings used in your experiment
- Training/Evaluation Metrics: Performance data such as loss values, accuracy, and custom metrics
from hafnia.experiment import HafniaLogger
logger = HafniaLogger(project_name="my_classification_project")
logger.log_configuration({"batch_size": 128, "learning_rate": 0.001})
ckpt_dir = logger.path_model_checkpoints() # store checkpoints here
model_dir = logger.path_model() # store the trained model here
logger.log_scalar("train/loss", value=0.1, step=100)
logger.log_metric("train/accuracy", value=0.98, step=100)
logger.log_scalar("validation/loss", value=0.1, step=100)
logger.log_metric("validation/accuracy", value=0.95, step=100)
The tracker behaves differently depending on whether it runs locally or in the cloud.
Locally, experiment data is stored in the folder .data/experiments/{DATE_TIME}.
In the cloud, the experiment data is available in the Hafnia platform under experiments.
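The local layout follows a common pattern: one timestamped folder per run. The exact structure and timestamp format are implementation details of hafnia; the pattern itself can be sketched like this (create_run_dir is a hypothetical helper, not part of the SDK):

```python
import tempfile
from datetime import datetime
from pathlib import Path


def create_run_dir(root=".data/experiments"):
    """Create a timestamped run directory, e.g. .data/experiments/{DATE_TIME}.

    The timestamp format here is an assumption for illustration only.
    """
    run_dir = Path(root) / datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
    run_dir.mkdir(parents=True, exist_ok=True)
    return run_dir


# Use a temporary directory so the example leaves no files behind
run_dir = create_run_dir(root=tempfile.mkdtemp())
```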
See examples/example_logger.py for the runnable version.
See also trainer-classification on how
to use it in a real training script.
Benchmarking models on a HafniaDataset
The benchmark utilities run an InferenceModel over a dataset, store the predictions as a new
dataset and compute metrics (e.g. object-detection mAP) against the ground truth. See
examples/example_benchmark.py for a runnable example wrapping a
torchvision SSDLite detector.
Detailed Documentation
For more information, see our documentation page or the markdown pages below.
- CLI - Detailed guide for the Hafnia command-line interface
- Release lifecycle - Details about package release lifecycle.
Development
For development, we use a uv-based virtual Python environment.
Install uv (linux and macOS)
curl -LsSf https://astral.sh/uv/install.sh | sh
Create virtual environment and install python dependencies
uv sync --dev
Run tests:
uv run pytest tests