A data organization and compilation system.

Datamate

Datamate is a data and configuration management framework in Python for machine-learning research. It uses the filesystem as memory through Directory objects, providing a programming interface to store and retrieve files in hierarchical structures using HDF5.
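The "filesystem as memory" idea can be illustrated with a small standard-library sketch. This is not Datamate's implementation: `TinyDirectory` is a hypothetical class invented for illustration, and it writes JSON files instead of HDF5 so that it runs without extra dependencies. It only shows the general pattern of mapping attribute assignment to a file write and attribute access to a file read.

```python
import json
import tempfile
from pathlib import Path


class TinyDirectory:
    """Hypothetical stand-in: each data attribute is one JSON file on disk."""

    def __init__(self, path):
        # Bypass our own __setattr__ so the path itself is not written to disk.
        object.__setattr__(self, "_path", Path(path))
        self._path.mkdir(parents=True, exist_ok=True)

    def __setattr__(self, name, value):
        # Assignment writes the value to <path>/<name>.json.
        (self._path / f"{name}.json").write_text(json.dumps(value))

    def __getattr__(self, name):
        # Only called when normal lookup fails, i.e. for data attributes.
        if name.startswith("_"):
            raise AttributeError(name)
        file = self._path / f"{name}.json"
        if not file.exists():
            raise AttributeError(name)
        return json.loads(file.read_text())


# Use a temporary root so the sketch leaves no files in the working directory.
root = Path(tempfile.mkdtemp())
exp = TinyDirectory(root / "vision_study" / "model_01")
exp.responses = [0.0, 1.0, 2.0]  # written to responses.json

# A second object pointing at the same path reads the same data from disk.
same = TinyDirectory(root / "vision_study" / "model_01")
mean_response = sum(same.responses) / len(same.responses)
```

Because the state lives in files rather than in the Python object, any process that constructs an object for the same path sees the same data.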

Main Features

  • Filesystem as memory through Directory objects
  • Hierarchical data organization
  • Automatic path handling and resolution with pathlib
  • Array storage in HDF5 format
  • Parallel read/write operations
  • Configuration-based compilation and access of data
  • Configuration management in YAML files
  • Configuration comparison and diffing
  • Pandas DataFrame integration
  • Directory structure visualization (tree view)
  • Experiment status tracking
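As a rough illustration of what configuration comparison involves, the sketch below diffs two flat configuration dictionaries using only the standard library. The function `diff_configs` and the `MISSING` placeholder are hypothetical names for this example and are not part of Datamate's API. Nested configurations would need a recursive version.

```python
MISSING = "<missing>"  # placeholder for a key absent from one side


def diff_configs(a, b):
    """Return {key: (value_in_a, value_in_b)} for every key that differs."""
    diff = {}
    for key in sorted(set(a) | set(b)):
        left = a.get(key, MISSING)
        right = b.get(key, MISSING)
        if left != right:
            diff[key] = (left, right)
    return diff


config_a = {
    "model": "01",
    "optimizer": "Adam",
    "learning_rate": 0.001,
    "n_epochs": 100,
}
# Same experiment with a changed learning rate and one extra key.
config_b = {**config_a, "learning_rate": 0.01, "batch_size": 32}

changes = diff_configs(config_a, config_b)
```

Here `changes` contains only the `batch_size` and `learning_rate` entries, since all other keys are identical in both configurations.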

Example

import datamate
import numpy as np

# Set the root directory under which all experiment directories are created
datamate.set_root_dir("./experiments")

# Set up experiment configuration
config = {
    "model": "01",
    "optimizer": "Adam",
    "learning_rate": 0.001,
    "n_epochs": 100
}

# Create the experiment directory (relative to the root) and store its configuration
exp = datamate.Directory("vision_study/model_01", config)

# Store arrays as HDF5 files
exp.images = np.random.rand(100, 64, 64)  # stored as images.h5
exp.responses = np.zeros((100, 1000))     # stored as responses.h5

# Access data
mean_response = exp.responses[:].mean()

More detailed examples are available in the documentation.

Installation

Using pip:

pip install datamate

Documentation

Full documentation is available at flyvis.github.io/datamate.

Related Projects

  • flyvis - Usage example of datamate
  • artisan - The framework that inspired datamate

Contributing

Contributions welcome! Please check our Contributing Guide for guidelines.

