Datalad Metadata Model

This software implements the metadata model that datalad and datalad-metalad (from version 0.3.0) use to store metadata.

Model Elements (the model layer)

The metadata model is defined by the API of the top-level classes. Those are:

  • MetadataRootRecord -- holds top-level metadata information for a single version of a datalad dataset

  • UUIDSet -- holds metadata root records for a set of datasets that are identified by their UUIDs and their version.

  • TreeVersionList -- holds metadata root records and a sub-dataset tree for a dataset version and its sub-datasets

  • Metadata -- represents metadata for a single item, i.e. dataset or file. Metadata is associated with extractor names and extraction parameters.

  • DatasetTree -- a representation of the sub-dataset hierarchy of a dataset

  • FileTree -- a representation of the file-tree of a dataset

  • ...
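To make the relation between an item's metadata, extractor names, and extraction parameters more concrete, here is a minimal sketch. The names `ExtractorRun` and `ItemMetadata` and all of their methods are hypothetical and chosen for illustration only; they are not the classes or the API of this package.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ExtractorRun:
    """One extraction result together with the parameters that produced it (hypothetical)."""
    parameters: Dict[str, str]
    content: dict


@dataclass
class ItemMetadata:
    """Metadata of a single item (dataset or file), grouped by extractor name (hypothetical)."""
    runs: Dict[str, List[ExtractorRun]] = field(default_factory=dict)

    def add(self, extractor_name: str, parameters: Dict[str, str], content: dict) -> None:
        # Several runs of the same extractor may coexist if their parameters differ.
        self.runs.setdefault(extractor_name, []).append(ExtractorRun(parameters, content))

    def extractors(self) -> List[str]:
        # Names of all extractors that contributed metadata for this item.
        return sorted(self.runs)

    def get(self, extractor_name: str, parameters: Dict[str, str]) -> Optional[dict]:
        # Return the content recorded for this extractor and parameter set, if any.
        for run in self.runs.get(extractor_name, []):
            if run.parameters == parameters:
                return run.content
        return None


# Example data; extractor names and payloads are made up.
item = ItemMetadata()
item.add("example_core", {}, {"name": "demo dataset"})
item.add("example_stats", {"depth": "1"}, {"files": 3})
item.add("example_stats", {"depth": "2"}, {"files": 12})
```

In this sketch, the same extractor can contribute several records for one item as long as the extraction parameters differ, which is one way to read "associated with extractor names and extraction parameters".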

Because some datalad datasets are very large, e.g. tens of thousands of sub-datasets and hundreds of millions of files, the implementation supports focused operations on individual parts of the potentially very large metadata model. It uses the proxy pattern, i.e. it loads, modifies, and saves only the model elements that are needed to operate on the metadata that the user is interested in.
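The proxy pattern can be sketched as follows. This is a generic illustration under stated assumptions, not the package's implementation: `LazyElement`, `make_loader`, and the dictionary used as a stand-in backend are all hypothetical names.

```python
from typing import Callable, Dict, List, Optional


class LazyElement:
    """Proxy that loads its model element on first access and saves only if modified."""

    def __init__(self, loader: Callable[[], dict]):
        self._loader = loader
        self._value: Optional[dict] = None
        self._dirty = False

    @property
    def loaded(self) -> bool:
        return self._value is not None

    def get(self) -> dict:
        # Load the real element only when somebody actually needs it.
        if self._value is None:
            self._value = self._loader()
        return self._value

    def update(self, key: str, value) -> None:
        self.get()[key] = value
        self._dirty = True

    def save(self, store: Dict[str, dict], name: str) -> bool:
        # Unmodified (or never loaded) elements are not written back.
        if not self._dirty:
            return False
        store[name] = dict(self._value)
        self._dirty = False
        return True


load_log: List[str] = []  # records which elements were really loaded


def make_loader(store: Dict[str, dict], name: str) -> Callable[[], dict]:
    def loader() -> dict:
        load_log.append(name)
        return dict(store[name])
    return loader


# A dictionary stands in for the storage backend (assumption for this sketch).
backend = {"ds-a": {"title": "A"}, "ds-b": {"title": "B"}, "ds-c": {"title": "C"}}
proxies = {name: LazyElement(make_loader(backend, name)) for name in backend}

# Touch a single element; the other two are never loaded or saved.
proxies["ds-b"].update("title", "B (revised)")
written = [name for name, proxy in proxies.items() if proxy.save(backend, name)]
```

Only the element that was touched is loaded and written back, which is the property that keeps operations cheap when the full model is very large.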

Storage layer

The model elements have to be persisted on a storage backend. How the model is mapped onto storage backends is defined by the storage layer, which is largely independent of the model layer. The intention is to support multiple storage backends in the future.

Currently, only one storage backend is supported:

  • git-mapping -- a storage backend that stores a metadata model in a git repository. The model objects are stored outside of existing branches. They are referenced by datalad-specific git references under refs/datalad/*.
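The general idea of keeping objects outside of branches and reaching them through a custom reference can be sketched with standard git plumbing commands (`git hash-object`, `git update-ref`, `git cat-file`). This is an illustration only and makes assumptions: the reference name `refs/datalad/example-metadata`, the JSON payload, and the helper functions are hypothetical and do not reflect the object layout or reference names that the git-mapping backend actually uses.

```python
import json
import subprocess
import tempfile


def git(repo_dir, *args, stdin=None):
    """Run a git command in repo_dir and return its stripped standard output."""
    result = subprocess.run(
        ["git", *args],
        cwd=repo_dir,
        input=stdin,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def store_object(repo_dir, ref_name, payload):
    """Write payload as a blob and point ref_name at it; no branch is touched."""
    text = json.dumps(payload, sort_keys=True)
    sha = git(repo_dir, "hash-object", "-w", "--stdin", stdin=text)
    git(repo_dir, "update-ref", ref_name, sha)
    return sha


def load_object(repo_dir, ref_name):
    """Read back the blob that ref_name points to."""
    return json.loads(git(repo_dir, "cat-file", "blob", ref_name))


if __name__ == "__main__":
    # Requires a git executable on PATH.
    with tempfile.TemporaryDirectory() as repo:
        git(repo, "init", "-q")
        ref = "refs/datalad/example-metadata"  # hypothetical reference name
        store_object(repo, ref, {"extractor": "demo"})
        print(load_object(repo, ref))
        # No branch was created by storing the object:
        print(repr(git(repo, "for-each-ref", "refs/heads")))
```

Because the reference lives outside refs/heads, the stored object does not appear in any branch history, yet git still treats it as reachable.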

Acknowledgements

This DataLad extension was developed with support from the German Federal Ministry of Education and Research (BMBF 01GQ1905), and the US National Science Foundation (NSF 1912266).
