A library for interacting with the Datamint API, designed for efficient data management, processing and Deep Learning workflows.


Datamint Python API

Requires Python 3.10+

A comprehensive Python SDK for interacting with the Datamint platform, providing seamless integration for medical imaging workflows, dataset management, and machine learning experiments.


🚀 Features

  • Dataset Management: Download, upload, and manage medical imaging datasets
  • Annotation Tools: Create, upload, and manage annotations (segmentations, labels, measurements)
  • Experiment Tracking: Integrated MLflow support for experiment management
  • PyTorch Lightning Integration: Streamlined ML workflows with Lightning DataModules and callbacks
  • DICOM Support: Native handling of DICOM files with anonymization capabilities
  • Multi-format Support: PNG, JPEG, NIfTI, and other medical imaging formats

See the full documentation at https://sonanceai.github.io/datamint-python-api/

📦 Installation

Note: We recommend using a virtual environment to avoid package conflicts.

From PyPI

pip install datamint

From Source

pip install git+https://github.com/SonanceAI/datamint-python-api

Virtual Environment Setup


We recommend installing Datamint in a dedicated virtual environment to avoid conflicts with your system packages:

  1. Create the environment (one-time setup):

    python3 -m venv datamint-env
    
  2. Activate the environment (run whenever you need it):

    | Platform | Command |
    |----------|---------|
    | Linux/macOS | `source datamint-env/bin/activate` |
    | Windows CMD | `datamint-env\Scripts\activate.bat` |
    | Windows PowerShell | `datamint-env\Scripts\Activate.ps1` |
  3. Install the package:

    pip install git+https://github.com/SonanceAI/datamint-python-api
    

Setup API key

To use the Datamint API, you need to set up your API key (ask your administrator if you don't have one). Use one of the following methods:

Method 1: Command-line tool (recommended)

Run `datamint-config` in the terminal and follow the instructions. See the command-line tools documentation for more details.

Method 2: Environment variable

Specify the API key as an environment variable.

Bash:

export DATAMINT_API_KEY="my_api_key"
# run your commands (e.g., `datamint-upload`, `python script.py`)

Python:

import os
os.environ["DATAMINT_API_KEY"] = "my_api_key"
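
For scripts, it can help to fail fast when the key is missing. A minimal helper sketch (`get_api_key` is illustrative, not part of the SDK):

```python
import os

def get_api_key() -> str:
    """Return the Datamint API key from the environment, failing fast if unset."""
    key = os.environ.get("DATAMINT_API_KEY")
    if not key:
        raise RuntimeError(
            "DATAMINT_API_KEY is not set; run datamint-config or export it first."
        )
    return key

os.environ["DATAMINT_API_KEY"] = "my_api_key"  # demonstration only
print(get_api_key())  # → my_api_key
```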

📚 Documentation

| Resource | Description |
|----------|-------------|
| 🚀 Getting Started | Step-by-step setup and basic usage |
| 📖 API Reference | Complete API documentation |
| 🔥 PyTorch Integration | ML workflow integration |
| 💡 Examples | Practical usage examples |

🔗 Key Components

Dataset Management

from datamint import Dataset

# Load dataset with annotations
dataset = Dataset(
    project_name="medical-segmentation",
)

# Access data
for sample in dataset:
    image = sample['image']       # torch.Tensor
    mask = sample['segmentation'] # torch.Tensor (if available)
    metadata = sample['metainfo'] # dict
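
Since each sample is a dict, batching for a DataLoader amounts to grouping values per key. A minimal sketch of that pattern (`collate_dicts` is illustrative, not an SDK function):

```python
def collate_dicts(samples):
    """Group a list of per-sample dicts into one dict of per-key lists (a batch)."""
    batch = {}
    for sample in samples:
        for key, value in sample.items():
            batch.setdefault(key, []).append(value)
    return batch

batch = collate_dicts([
    {"image": "img0", "metainfo": {"id": 0}},
    {"image": "img1", "metainfo": {"id": 1}},
])
print(batch["image"])  # → ['img0', 'img1']
```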

PyTorch Lightning Integration

import lightning as L
from datamint.lightning import DatamintDataModule
from datamint.mlflow.lightning.callbacks import MLFlowModelCheckpoint

# Data module
datamodule = DatamintDataModule(
    project_name="your-project",
    batch_size=16,
    train_split=0.8
)

# ML tracking callback
checkpoint_callback = MLFlowModelCheckpoint(
    monitor="val_loss",
    save_top_k=1,
    register_model_name="best-model"
)

# Trainer with MLflow logging
trainer = L.Trainer(
    max_epochs=100,
    callbacks=[checkpoint_callback],
    logger=L.pytorch.loggers.MLFlowLogger(
        experiment_name="medical-segmentation"
    )
)
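
The `train_split` fraction above determines how many samples land in each split; the arithmetic is simply (hypothetical helper, not SDK code):

```python
def split_sizes(n_samples: int, train_split: float = 0.8):
    """Number of training vs. validation samples for a given split fraction."""
    n_train = int(n_samples * train_split)
    return n_train, n_samples - n_train

print(split_sizes(100, 0.8))  # → (80, 20)
```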

Annotation Management

from datamint import APIHandler

api = APIHandler()  # uses the API key configured via datamint-config or DATAMINT_API_KEY

# Upload segmentation masks
api.upload_segmentations(
    resource_id="resource-123",
    file_path="segmentation.nii.gz",
    name="liver_segmentation",
    frame_index=0
)

# Add categorical annotations
api.add_image_category_annotation(
    resource_id="resource-123",
    identifier="diagnosis",
    value="positive"
)

# Add geometric annotations
api.add_line_annotation(
    point1=(10, 20),
    point2=(50, 80),
    resource_id="resource-123",
    identifier="measurement",
    frame_index=5
)
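
For a measurement line like the one above, the measured distance is the Euclidean length between its endpoints. A sketch of the math (not SDK behavior):

```python
import math

def line_length(point1, point2):
    """Euclidean length of a measurement line in pixel coordinates."""
    return math.dist(point1, point2)

print(round(line_length((10, 20), (50, 80)), 2))  # → 72.11
```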

🛠️ Command Line Tools

Upload Resources

Upload DICOM files with anonymization:

datamint-upload \
    --path /path/to/dicoms \
    --recursive \
    --channel "training-data" \
    --anonymize \
    --publish

Upload with segmentation masks:

datamint-upload \
    --path /path/to/images \
    --segmentation_path /path/to/masks \
    --segmentation_names segmentation_config.yaml

Configuration Management

# Interactive setup
datamint-config

# Set API key
datamint-config --api-key "your-key"

🔍 Examples

Medical Image Segmentation Pipeline

import torch
import lightning as L
from datamint.lightning import DatamintDataModule
from datamint.mlflow.lightning.callbacks import MLFlowModelCheckpoint

class SegmentationModel(L.LightningModule):
    def __init__(self):
        super().__init__()
        # Model definition...
    
    def training_step(self, batch, batch_idx):
        # Training logic...
        pass

# Setup data
datamodule = DatamintDataModule(
    project_name="liver-segmentation",
    batch_size=8,
    train_split=0.8
)

# Setup model with MLflow tracking
model = SegmentationModel()
checkpoint_cb = MLFlowModelCheckpoint(
    monitor="val_dice",
    mode="max",
    register_model_name="liver-segmentation-model"
)

# Train
trainer = L.Trainer(
    max_epochs=50,
    callbacks=[checkpoint_cb],
    logger=L.pytorch.loggers.MLFlowLogger()
)
trainer.fit(model, datamodule)
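
The `val_dice` metric monitored above is typically the Dice coefficient. A minimal reference implementation on flat binary masks (illustrative; in practice the metric would run on tensors inside `validation_step`):

```python
def dice_coefficient(pred, target, eps=1e-7):
    """Dice = 2|A∩B| / (|A| + |B|) for binary masks given as flat 0/1 sequences."""
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    return (2.0 * intersection + eps) / (total + eps)

print(round(dice_coefficient([1, 1, 0, 0], [1, 0, 0, 0]), 3))  # → 0.667
```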

🆘 Support

  • Full documentation: https://sonanceai.github.io/datamint-python-api/
  • Bug reports and questions: https://github.com/SonanceAI/datamint-python-api/issues
