
A library for interacting with the Datamint API, designed for efficient data management, processing and Deep Learning workflows.


Datamint Python API

Requires Python 3.10+

A comprehensive Python SDK for interacting with the Datamint platform, providing seamless integration for medical imaging workflows, dataset management, and machine learning experiments.


🚀 Features

  • Dataset Management: Download, upload, and manage medical imaging datasets
  • Annotation Tools: Create, upload, and manage annotations (segmentations, labels, measurements)
  • Experiment Tracking: Integrated MLflow support for experiment management
  • PyTorch Lightning Integration: Streamlined ML workflows with Lightning DataModules and callbacks
  • DICOM Support: Native handling of DICOM files with anonymization capabilities
  • Multi-format Support: PNG, JPEG, NIfTI, and other medical imaging formats

See the full documentation at https://sonanceai.github.io/datamint-python-api/

📦 Installation

> [!NOTE]
> We recommend using a virtual environment to avoid package conflicts.

From PyPI

pip install datamint

From Source

pip install git+https://github.com/SonanceAI/datamint-python-api

Virtual Environment Setup

Click to expand virtual environment setup instructions

We recommend installing Datamint in a dedicated virtual environment to avoid conflicts with your system packages. Create the environment once, then activate it whenever you need it:

  1. Create the environment (one-time setup):

    python3 -m venv datamint-env
    
  2. Activate the environment (run whenever you need it):

     | Platform           | Command                              |
     | ------------------ | ------------------------------------ |
     | Linux/macOS        | `source datamint-env/bin/activate`   |
     | Windows CMD        | `datamint-env\Scripts\activate.bat`  |
     | Windows PowerShell | `datamint-env\Scripts\Activate.ps1`  |
  3. Install the package:

    pip install git+https://github.com/SonanceAI/datamint-python-api
    

Setup API key

To use the Datamint API, you need to set up your API key (ask your administrator if you don't have one). Use one of the following methods:

Method 1: Command-line tool (recommended)

Run `datamint-config` in the terminal and follow the instructions. See the command-line tools documentation for more details.

Method 2: Environment variable

Specify the API key as an environment variable.

Bash:

export DATAMINT_API_KEY="my_api_key"
# run your commands (e.g., `datamint-upload`, `python script.py`)

Python:

import os
os.environ["DATAMINT_API_KEY"] = "my_api_key"

📚 Documentation

| Resource | Description |
| --- | --- |
| 🚀 Getting Started | Step-by-step setup and basic usage |
| 📖 API Reference | Complete API documentation |
| 🔥 PyTorch Integration | ML workflow integration |
| 💡 Examples | Practical usage examples |

🔗 Key Components

Dataset Management

from datamint import Dataset

# Load dataset with annotations
dataset = Dataset(
    project_name="medical-segmentation",
)

# Access data
for sample in dataset:
    image = sample['image']       # torch.Tensor
    mask = sample['segmentation'] # torch.Tensor (if available)
    metadata = sample['metainfo'] # dict
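Since the segmentation key is only present when a mask exists, a defensive access pattern avoids `KeyError`s in the loop above. This mock mirrors the per-sample dict keys shown; the placeholder values and the `metainfo` contents are assumptions for illustration only.

```python
# Mock sample mirroring the dict keys from the loop above; tensor values
# are replaced with placeholders so this sketch runs without torch.
sample = {
    "image": "placeholder for torch.Tensor",
    "metainfo": {"resource_id": "resource-123"},  # hypothetical contents
}

def get_mask(sample: dict):
    """Return the segmentation mask if present, else None."""
    return sample.get("segmentation")

print(get_mask(sample) is None)  # True
```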

PyTorch Lightning Integration

import lightning as L
from datamint.lightning import DatamintDataModule
from datamint.mlflow.lightning.callbacks import MLFlowModelCheckpoint

# Data module
datamodule = DatamintDataModule(
    project_name="your-project",
    batch_size=16,
    train_split=0.8
)

# ML tracking callback
checkpoint_callback = MLFlowModelCheckpoint(
    monitor="val_loss",
    save_top_k=1,
    register_model_name="best-model"
)

# Trainer with MLflow logging
trainer = L.Trainer(
    max_epochs=100,
    callbacks=[checkpoint_callback],
    logger=L.pytorch.loggers.MLFlowLogger(
        experiment_name="medical-segmentation"
    )
)
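As a rough illustration of what a `train_split=0.8` fraction means in practice, here is a plain-Python index split. The DataModule's actual splitting strategy, shuffling, and seeding are assumptions here, not the library's documented behavior.

```python
import random

def split_indices(n: int, train_split: float = 0.8, seed: int = 42):
    """Shuffle indices reproducibly, then cut at the train fraction."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    cut = int(n * train_split)
    return idx[:cut], idx[cut:]

train_idx, val_idx = split_indices(100, train_split=0.8)
print(len(train_idx), len(val_idx))  # 80 20
```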

Annotation Management

# Upload segmentation masks
api.upload_segmentations(
    resource_id="resource-123",
    file_path="segmentation.nii.gz",
    name="liver_segmentation",
    frame_index=0
)

# Add categorical annotations
api.add_image_category_annotation(
    resource_id="resource-123",
    identifier="diagnosis",
    value="positive"
)

# Add geometric annotations
api.add_line_annotation(
    point1=(10, 20),
    point2=(50, 80),
    resource_id="resource-123",
    identifier="measurement",
    frame_index=5
)
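The two endpoints of a line annotation are plain pixel coordinates, so the measured length is their Euclidean distance. A small sketch follows; scaling by a pixel-spacing factor to obtain physical units is a hypothetical addition, not part of the API call above.

```python
import math

def line_length(point1, point2, pixel_spacing: float = 1.0) -> float:
    """Euclidean length of a line annotation, optionally scaled by
    pixel spacing (e.g. mm per pixel) to get a physical measurement."""
    dx = point2[0] - point1[0]
    dy = point2[1] - point1[1]
    return math.hypot(dx, dy) * pixel_spacing

# Endpoints from the example above: (10, 20) -> (50, 80)
print(round(line_length((10, 20), (50, 80)), 2))  # 72.11
```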

🛠️ Command Line Tools

Upload Resources

Upload DICOM files with anonymization:

datamint-upload \
    --path /path/to/dicoms \
    --recursive \
    --channel "training-data" \
    --anonymize \
    --publish

Upload with segmentation masks:

datamint-upload \
    --path /path/to/images \
    --segmentation_path /path/to/masks \
    --segmentation_names segmentation_config.yaml

Configuration Management

# Interactive setup
datamint-config

# Set API key
datamint-config --api-key "your-key"

🔍 Examples

Medical Image Segmentation Pipeline

import torch
import lightning as L
from datamint.lightning import DatamintDataModule
from datamint.mlflow.lightning.callbacks import MLFlowModelCheckpoint

class SegmentationModel(L.LightningModule):
    def __init__(self):
        super().__init__()
        # Model definition...
    
    def training_step(self, batch, batch_idx):
        # Training logic...
        pass

# Setup data
datamodule = DatamintDataModule(
    project_name="liver-segmentation",
    batch_size=8,
    train_split=0.8
)

# Setup model with MLflow tracking
model = SegmentationModel()
checkpoint_cb = MLFlowModelCheckpoint(
    monitor="val_dice",
    mode="max",
    register_model_name="liver-segmentation-model"
)

# Train
trainer = L.Trainer(
    max_epochs=50,
    callbacks=[checkpoint_cb],
    logger=L.pytorch.loggers.MLFlowLogger()
)
trainer.fit(model, datamodule)

🆘 Support

  • Full documentation: https://sonanceai.github.io/datamint-python-api/
  • GitHub issues: https://github.com/SonanceAI/datamint-python-api/issues

Project details



Download files


Source Distribution

datamint-2.4.0.tar.gz (131.4 kB)

Built Distribution


datamint-2.4.0-py3-none-any.whl (156.6 kB)

File details

Details for the file datamint-2.4.0.tar.gz.

File metadata

  • Download URL: datamint-2.4.0.tar.gz
  • Upload date:
  • Size: 131.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for datamint-2.4.0.tar.gz

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | 17a7bdae0787ca7419ca9285bdb2067360bd9b7935f39b12a777f018cf73f792 |
| MD5 | 34550b6da5f530088f0dfd81d41cc1dc |
| BLAKE2b-256 | 2b4eb101f2b8c0c6547b232ad6b7711ec6ae8e7462e83cda5339a20f812c8b50 |


Provenance

The following attestation bundles were made for datamint-2.4.0.tar.gz:

Publisher: release_pypi.yaml on SonanceAI/datamint-python-api

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file datamint-2.4.0-py3-none-any.whl.

File metadata

  • Download URL: datamint-2.4.0-py3-none-any.whl
  • Upload date:
  • Size: 156.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for datamint-2.4.0-py3-none-any.whl

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | c714e2c940899166363133177c54dee0ff960598555b26f25c6b92ca49787410 |
| MD5 | 4ea90103cff5af307df8ee12f56f7aa4 |
| BLAKE2b-256 | 220dd593508016c431c9f4f314e4b5d5bed9cd7bf22dde7a75b7e1c524811100 |


Provenance

The following attestation bundles were made for datamint-2.4.0-py3-none-any.whl:

Publisher: release_pypi.yaml on SonanceAI/datamint-python-api

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
