The Modular Autonomous Discovery for Science (MADSci) Data Manager.

Maintainers

luckierdodge

MADSci Data Manager

The Data Manager handles capturing, storing, and querying data generated during experiments, covering both JSON values and files.

Figure: MADSci Data Manager Diagram

Features

DataPoint storage: JSON values and files with metadata
Flexible storage: Local filesystem or S3-compatible object storage (MinIO, AWS S3, GCS)
Rich metadata: Ownership info, timestamps, custom labels
Queryable: Search by value and metadata
Cloud integration: Multi-provider cloud storage support

Installation

See the main README for installation options. This package is available as:

PyPI: pip install madsci.data_manager
Docker: Included in ghcr.io/ad-sdl/madsci
Example configuration: See example_lab/managers/example_data.manager.yaml

Dependencies: MongoDB database, optional MinIO/S3 storage (see example_lab)

Usage

Quick Start

Use the example_lab as a starting point:

# Start with working example
docker compose up  # From repo root
# Data Manager available at http://localhost:8004/docs

# Or run standalone
python src/madsci_data_manager/madsci/data_manager/data_server.py
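Once the stack is up, it can be handy to confirm from a script that the docs page answers before submitting data. The following is a minimal sketch using only the standard library; the helper names `docs_url` and `is_reachable` are hypothetical and not part of MADSci, and the port is the one quoted above.

```python
import urllib.error
import urllib.request


def docs_url(base_url: str) -> str:
    """Return the interactive docs URL for a server base URL."""
    return base_url.rstrip("/") + "/docs"


def is_reachable(url: str, timeout: float = 2.0) -> bool:
    """Return True if an HTTP GET to `url` succeeds with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Covers refused connections, timeouts, and HTTP error statuses.
        return False


if __name__ == "__main__":
    url = docs_url("http://localhost:8004")
    print(url, "reachable" if is_reachable(url) else "not reachable")
```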

Manager Setup

For custom deployments, see example_lab/managers/example_data.manager.yaml for configuration options.

Data Client

Use DataClient to store and retrieve experimental data:

from madsci.client.data_client import DataClient
from madsci.common.types.datapoint_types import DataPoint, DataPointTypeEnum

client = DataClient(data_server_url="http://localhost:8004")

# Store JSON data
value_dp = DataPoint(
    label="Temperature Reading",
    data_type=DataPointTypeEnum.JSON,
    value={"temperature": 23.5, "unit": "Celsius"}
)
submitted = client.submit_datapoint(value_dp)

# Store files
file_dp = DataPoint(
    label="Experiment Log",
    data_type=DataPointTypeEnum.FILE,
    path="/path/to/data.txt"
)
submitted_file = client.submit_datapoint(file_dp)

# Retrieve data
retrieved = client.get_datapoint(submitted.datapoint_id)

# Save file locally
client.save_datapoint_value(submitted_file.datapoint_id, "/local/save/path.txt")

Examples: See example_lab/notebooks/experiment_notebook.ipynb for data management workflows.

Storage Configuration

Local Storage (Default)

Files stored on filesystem with date-based hierarchy
Simple setup, no additional dependencies
File paths stored in MongoDB database
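The exact directory layout is not spelled out here, so the sketch below assumes a year/month/day hierarchy purely for illustration. The function name `dated_path` and the `/data` root are hypothetical and not taken from MADSci.

```python
from datetime import datetime
from pathlib import PurePosixPath


def dated_path(root: PurePosixPath, filename: str, when: datetime) -> PurePosixPath:
    """Build root/YYYY/MM/DD/filename (assumed layout, for illustration only)."""
    return root / f"{when:%Y}" / f"{when:%m}" / f"{when:%d}" / filename


example = dated_path(PurePosixPath("/data"), "log.txt", datetime(2025, 11, 11))
print(example)  # /data/2025/11/11/log.txt
```

Grouping files by date keeps any single directory from growing without bound, while the database record holds the full path for lookup.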

Object Storage (S3-Compatible)

Supports cloud and self-hosted storage providers:

AWS S3
Google Cloud Storage (with HMAC keys)
MinIO (self-hosted or cloud)
Any S3-compatible service

Benefits:

Automatic upload with fallback to local storage
Better for large files and distributed setups
Built-in metadata and versioning support
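The "upload with fallback" behaviour can be pictured with a small, generic sketch. This is not the Data Manager's own code: `store_with_fallback` and `failing_upload` are hypothetical names, and the uploader is passed in as a plain callable so the pattern can be shown without any storage library.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable


def store_with_fallback(
    source: Path,
    upload: Callable[[Path], str],
    local_dir: Path,
) -> str:
    """Return the upload location, or a local copy's path if the upload fails."""
    try:
        return upload(source)
    except Exception:  # broad on purpose for the sketch; narrow this in real code
        local_dir.mkdir(parents=True, exist_ok=True)
        target = local_dir / source.name
        shutil.copy2(source, target)
        return str(target)


def failing_upload(path: Path) -> str:
    """Hypothetical uploader that simulates an unreachable object store."""
    raise ConnectionError("object storage unavailable")


with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "data.txt"
    source.write_text("23.5")
    stored_at = store_with_fallback(source, failing_upload, Path(tmp) / "fallback")
    print(stored_at)  # prints the local fallback path
```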

Quick Setup

# Use example_lab with pre-configured MinIO
docker compose up  # From repo root
# MinIO Console: http://localhost:9001 (minioadmin/minioadmin)

Configuration Examples

AWS S3:

from madsci.client.data_client import DataClient
from madsci.common.types.datapoint_types import ObjectStorageSettings

aws_config = ObjectStorageSettings(
    endpoint="s3.amazonaws.com",
    access_key="YOUR_ACCESS_KEY",
    secret_key="YOUR_SECRET_KEY",
    secure=True,
    default_bucket="my-bucket",
    region="us-east-1"
)
client = DataClient(object_storage_settings=aws_config)

Google Cloud Storage:

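from madsci.common.types.datapoint_types import ObjectStorageSettings

# GCS is reached through its S3-compatible endpoint using HMAC keys
gcs_config = ObjectStorageSettings(
    endpoint="storage.googleapis.com",
    access_key="YOUR_HMAC_ACCESS_KEY",
    secret_key="YOUR_HMAC_SECRET",
    secure=True,
    default_bucket="my-gcs-bucket"
)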
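Hardcoding keys as in the examples above is fine for a demo, but credentials are usually better read from the environment. The sketch below collects the same fields shown above into a plain dict. The environment variable names and the helper `storage_kwargs_from_env` are assumptions made for illustration, not something MADSci defines.

```python
import os
from typing import Any, Dict, Mapping, Optional


def storage_kwargs_from_env(env: Optional[Mapping[str, str]] = None) -> Dict[str, Any]:
    """Collect object storage settings from (hypothetical) environment variables."""
    env = os.environ if env is None else env
    kwargs: Dict[str, Any] = {
        "endpoint": env["OBJECT_STORAGE_ENDPOINT"],
        "access_key": env["OBJECT_STORAGE_ACCESS_KEY"],
        "secret_key": env["OBJECT_STORAGE_SECRET_KEY"],
        # Anything other than "true" (case-insensitive) disables TLS.
        "secure": env.get("OBJECT_STORAGE_SECURE", "true").lower() == "true",
        "default_bucket": env["OBJECT_STORAGE_BUCKET"],
    }
    region = env.get("OBJECT_STORAGE_REGION")
    if region:  # region is only set when provided, as in the AWS example
        kwargs["region"] = region
    return kwargs
```

Assuming the field names match those in the examples above, the result could then be passed along as `ObjectStorageSettings(**storage_kwargs_from_env())`. A missing required variable raises `KeyError` early instead of failing later during upload.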

Direct Object Storage DataPoints

from madsci.common.types.datapoint_types import DataPoint, DataPointTypeEnum

storage_dp = DataPoint(
    label="Large Dataset",
    data_type=DataPointTypeEnum.OBJECT_STORAGE,
    path="/path/to/data.parquet",
    bucket_name="my-bucket",
    object_name="datasets/data.parquet",
    custom_metadata={"version": "v2.1"}
)
# `client` is a DataClient configured with object storage settings, as in the examples above
uploaded = client.submit_datapoint(storage_dp)
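Object names conventionally use forward slashes as key separators regardless of the local platform. The helper below is hypothetical and not part of MADSci. It sketches one way to derive an object_name like the one above from a local path and a key prefix.

```python
from pathlib import PurePath, PurePosixPath


def object_name_for(local_path: str, prefix: str) -> str:
    """Join a key prefix and the file's base name using forward slashes."""
    return str(PurePosixPath(prefix.strip("/")) / PurePath(local_path).name)


print(object_name_for("/path/to/data.parquet", "datasets"))  # datasets/data.parquet
```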

Authentication: Use IAM users/service accounts with appropriate storage permissions. See cloud provider documentation for detailed setup.

Download files

Source Distribution

madsci_data_manager-0.5.2.tar.gz (15.1 kB)

Uploaded Nov 11, 2025 Source

Built Distribution

madsci_data_manager-0.5.2-py3-none-any.whl (6.0 kB)

Uploaded Nov 11, 2025 Python 3

File details

Details for the file madsci_data_manager-0.5.2.tar.gz.

File metadata

Download URL: madsci_data_manager-0.5.2.tar.gz
Upload date: Nov 11, 2025
Size: 15.1 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: pdm/2.26.1 CPython/3.9.24 Linux/6.11.0-1018-azure

File hashes

Hashes for madsci_data_manager-0.5.2.tar.gz
Algorithm	Hash digest
SHA256	`c4c261c609bc7e300fc394be2a8fbeb172633c9063c5413fd0ceda2b918020f6`
MD5	`8db1aeb1a130b0c280ac8c9e13ec06b8`
BLAKE2b-256	`5c8916a32afb85e09939067394f059fd3c7b10dfcc1b785c21fb9354d4d70d85`
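A downloaded archive can be checked against the SHA256 digest listed above with the standard library alone. This is a minimal sketch; the helper names `sha256_of` and `verify` are illustrative, and the file is assumed to sit in the current working directory.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed above for madsci_data_manager-0.5.2.tar.gz
EXPECTED_SHA256 = "c4c261c609bc7e300fc394be2a8fbeb172633c9063c5413fd0ceda2b918020f6"


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so large downloads need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: Path, expected_hex: str) -> bool:
    """Return True if the file's SHA256 matches the expected hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    download = Path("madsci_data_manager-0.5.2.tar.gz")  # assumed download location
    if download.exists():
        print("OK" if verify(download, EXPECTED_SHA256) else "MISMATCH")
    else:
        print(f"{download} not found")
```

pip can enforce the same check at install time through its hash-checking mode (`--require-hashes` with `--hash` entries in a requirements file).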

File details

Details for the file madsci_data_manager-0.5.2-py3-none-any.whl.

File metadata

Download URL: madsci_data_manager-0.5.2-py3-none-any.whl
Upload date: Nov 11, 2025
Size: 6.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: pdm/2.26.1 CPython/3.9.24 Linux/6.11.0-1018-azure

File hashes

Hashes for madsci_data_manager-0.5.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`db3d7e844381cd2f0c78f828e66fb6dc9e76425cf148310a56ba6d516603af95`
MD5	`2ab744a96d7a84613330bb80d8bf70f1`
BLAKE2b-256	`015b0e3e765e38fd8f89271a0f193a4698c59869a3ee5e42df20298e196ecad1`
