dtool command line client for managing data

Project description

Make your data more resilient, portable, and easy to work with by packaging files and metadata into self-contained datasets.

Overview

dtool is a suite of software for managing scientific data and making it accessible programmatically. It consists of a command line interface, dtool, and a Python API, dtoolcore.

The dtool command line interface lets you organise files into datasets and move datasets between different storage solutions, for example from local disk to remote object storage. Importantly, it also provides methods to verify that a transfer has been successful.
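The idea behind verifying a transfer can be illustrated with a minimal sketch that uses only the Python standard library. This is not dtool's implementation: the function names `build_manifest` and `verify_copy`, the manifest layout, and the choice of SHA-256 are all assumptions made for illustration. The sketch records the size and hash of every file, then reports anything that is missing, changed, or unexpected in the copy.

```python
import hashlib
from pathlib import Path


def build_manifest(data_dir):
    """Map each file's relative path to its size and SHA-256 digest.

    Hypothetical helper for illustration; not part of dtool.
    """
    root = Path(data_dir)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            content = path.read_bytes()
            manifest[path.relative_to(root).as_posix()] = {
                "size_in_bytes": len(content),
                "hash": hashlib.sha256(content).hexdigest(),
            }
    return manifest


def verify_copy(source_dir, copy_dir):
    """Return the relative paths that are missing, changed, or extra in the copy."""
    expected = build_manifest(source_dir)
    actual = build_manifest(copy_dir)
    # Items that are absent from the copy or whose size/hash differ.
    problems = [rel for rel, props in expected.items() if actual.get(rel) != props]
    # Items present in the copy that the source does not know about.
    extras = [rel for rel in actual if rel not in expected]
    return sorted(problems + extras)
```

An empty list from `verify_copy` means the copy matches the source file for file.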

The Python API gives complete access to the data and metadata in a dataset. It makes it easy to create scripts for processing the items, or a subset of items, in a dataset. The Python API also allows datasets to be constructed programmatically.
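As a rough sketch of processing a subset of items, the function below loads a dataset from a URI and yields the items whose relative path ends with a given suffix. It assumes that dtoolcore is installed and that `DataSet.from_uri`, `identifiers`, `item_properties` and `item_content_abspath` behave as described in the dtoolcore documentation; the function name `iter_items` and the `suffix` parameter are hypothetical and exist only for this example.

```python
def iter_items(uri, suffix=""):
    """Yield (relpath, absolute path) for items whose relpath ends with `suffix`.

    Hypothetical helper for illustration; an empty suffix selects every item.
    """
    # Imported here so the module still loads where dtoolcore is not installed.
    from dtoolcore import DataSet

    dataset = DataSet.from_uri(uri)
    for identifier in dataset.identifiers:
        props = dataset.item_properties(identifier)
        if props["relpath"].endswith(suffix):
            # Path to a local copy of the item's content.
            yield props["relpath"], dataset.item_content_abspath(identifier)
```

For example, `for relpath, fpath in iter_items(uri, suffix=".csv"): ...` would visit only the CSV files in the dataset, where `uri` points at a frozen dataset.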

dtool is extensible, meaning that it is possible to create plugins both for adding functionality to the command line interface and for creating interfaces to custom storage backends.
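Python plugin systems of this kind are commonly built on package entry points. The sketch below lists whatever is registered under a given entry point group using only the standard library. The group names `dtool.cli` and `dtool.storage_brokers` mentioned afterwards are assumptions about how dtool registers its plugins, and `installed_plugins` is a hypothetical helper, not part of dtool.

```python
from importlib.metadata import entry_points


def installed_plugins(group):
    """Return the sorted names of entry points registered under `group`."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+: entry points are filtered with select().
        selected = eps.select(group=group)
    else:
        # Python 3.8/3.9: a mapping of group name to entry points.
        selected = eps.get(group, [])
    return sorted(ep.name for ep in selected)
```

Under those assumptions, `installed_plugins("dtool.storage_brokers")` would show which storage backends are available in the current environment, and `installed_plugins("dtool.cli")` which command line plugins.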

The dtool Python package is a meta package that installs the packages:

  • dtoolcore - core API

  • dtool-cli - CLI plugin scaffold

  • dtool-annotation - CLI commands for working with dataset annotations

  • dtool-config - CLI commands for configuring dtool

  • dtool-create - CLI commands for creating datasets

  • dtool-info - CLI commands for getting information about datasets

  • dtool-overlay - CLI commands for working with per-item metadata stored as overlays

  • dtool-symlink - storage broker interface allowing symlinking to data

  • dtool-http - storage broker interface allowing read-only access to datasets over HTTP

Installation:

$ pip install dtool

There are support packages for several other storage solutions:

  • dtool-s3 - storage broker interface to S3 object storage

  • dtool-smb - storage broker interface to SMB network shares

  • dtool-azure - storage broker interface to Azure Storage

  • dtool-ecs - storage broker interface to ECS S3 object storage

  • dtool-irods - storage broker interface to iRODS

If you have access to Amazon S3, Microsoft Azure, ECS S3, or iRODS storage, you may also want to install support for these:

$ pip install dtool-s3 dtool-azure dtool-ecs dtool-irods

Usage:

$ dtool create my-awesome-dataset
Created proto dataset file:///Users/olssont/my-awesome-dataset
Next steps:
1. Add raw data, eg:
   dtool add item my_file.txt file:///Users/olssont/my-awesome-dataset
   Or use your system commands, e.g:
   mv my_data_directory /Users/olssont/my-awesome-dataset/data/
2. Add descriptive metadata, e.g:
   dtool readme interactive file:///Users/olssont/my-awesome-dataset
3. Convert the proto dataset into a dataset:
   dtool freeze file:///Users/olssont/my-awesome-dataset

Download files

Source Distribution

dtool-3.27.0.tar.gz (322.5 kB)

Uploaded Source

Built Distribution

dtool-3.27.0-py3-none-any.whl (4.5 kB)

Uploaded Python 3

File details

Details for the file dtool-3.27.0.tar.gz.

File metadata

  • Download URL: dtool-3.27.0.tar.gz
  • Upload date:
  • Size: 322.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.0 CPython/3.12.4

File hashes

Hashes for dtool-3.27.0.tar.gz

Algorithm    Hash digest
SHA256       6a96f57dd39cff370926d61a0da8f5fea698e3504156297689321ec09cfa950d
MD5          85bc67f0580824f94151d2fa5fb5232d
BLAKE2b-256  a57a455f0e5a223950e3d9094e42b67303907c9cca42de59b84031e11a3eb6a2

File details

Details for the file dtool-3.27.0-py3-none-any.whl.

File metadata

  • Download URL: dtool-3.27.0-py3-none-any.whl
  • Upload date:
  • Size: 4.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.0 CPython/3.12.4

File hashes

Hashes for dtool-3.27.0-py3-none-any.whl

Algorithm    Hash digest
SHA256       fd4f1a641c3ae6c0e17c8c152e9ef812e1eebaf8fb6aab7a6dcc7928114231b4
MD5          4df6a08d09351ea6d2d7a27e1c40cc3e
BLAKE2b-256  8bf060fd87714ee33090803179e770882eb7ee4809b43c9f706ad81796ad8bf9
