PRovenance-tracking ENtity Attribute batch Computation and database Storage system

Prenacs

Prenacs is a system for running batch computations of attribute values for sets of entities and for storing the results in a database, together with metadata that tracks the provenance of the data.

The name is an acronym for "PRovenance-tracking ENtity Attribute Computation and Storage system".

Key concepts

Entities are objects of any kind, each characterized by an identifier that distinguishes it from other objects of the same kind. For each entity, Prenacs can compute the values of attributes.

Attributes are properties of an entity. Anything whose value can be determined by a computation is an attribute; "computation" is meant in a general sense and includes, for example, obtaining the value from an external data source. Each attribute consists of either a single value (scalar attribute) or multiple values (composite attribute). The attribute system is open and flexible: new attributes can be added at any time.
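The distinction between scalar and composite attributes can be illustrated with a minimal sketch in plain Python. The `Entity` class, the `is_composite` helper and the attribute names are hypothetical, chosen only for illustration; they are not part of the Prenacs API.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple, Union

# Hypothetical types for illustration only; not part of Prenacs.
Scalar = Union[int, float, str]
AttributeValue = Union[Scalar, Tuple[Scalar, ...]]


@dataclass
class Entity:
    """An object with an identifier that is unique among objects of its kind."""
    entity_id: str
    attributes: Dict[str, AttributeValue] = field(default_factory=dict)


def is_composite(value: AttributeValue) -> bool:
    """Return True if the attribute value consists of multiple values."""
    return isinstance(value, tuple)


genome = Entity("genome_001")
genome.attributes["gc_content"] = 0.42                # scalar attribute
genome.attributes["base_counts"] = (10, 20, 30, 40)   # composite attribute
# The set of attributes is open: a new one can be added at any time.
genome.attributes["source"] = "external_db"
```

In this sketch a composite attribute is modelled as a tuple; how composite values are actually laid out in the database is decided by the storage layer, not by this example.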

Prenacs stores a single instance of each attribute for an entity. It is not meant to store multiple measurements of a value or to keep a journal of previous attribute computation results.
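The single-instance rule can be sketched with the standard `sqlite3` module. The table layout and the `store` function below are assumptions made for this example; they do not reflect the schema used by Prenacs or attrtables.

```python
import sqlite3

# Hypothetical schema: one row per (entity, attribute) pair.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE attr_values ("
    " entity_id TEXT, attribute TEXT, value TEXT,"
    " PRIMARY KEY (entity_id, attribute))"
)


def store(entity_id, attribute, value):
    """Store a value, replacing any earlier value of the same attribute."""
    conn.execute(
        "INSERT OR REPLACE INTO attr_values VALUES (?, ?, ?)",
        (entity_id, attribute, str(value)),
    )


store("genome_001", "gc_content", 0.42)
store("genome_001", "gc_content", 0.43)  # overwrites, does not append

rows = conn.execute(
    "SELECT value FROM attr_values WHERE entity_id = ? AND attribute = ?",
    ("genome_001", "gc_content"),
).fetchall()
```

After the second call, `rows` holds only the most recent value: the earlier result is not kept as history.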

Alongside the computation results, Prenacs tracks the provenance of attribute values. To achieve this, the code that computes attribute values must be implemented as a computation plugin. Multiple plugins may be available for computing the same attribute, and a plugin may compute one or more attributes. A plugin contains the code for the attribute computation as well as metadata describing the input, the output, the supported parameters, the methods, and implementation notes. Whenever the code of a plugin changes, a new plugin version number is assigned.
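As a rough illustration of what such a plugin bundles together, the following sketch pairs a computation function with descriptive metadata. All names here (`ID`, `VERSION`, `INPUT`, `OUTPUT`, `PARAMETERS`, `METHOD`, `IMPLEMENTATION`, `compute`) are assumptions made for this example; the actual plugin interface is defined in the user manual and by multiplug.

```python
# Hypothetical plugin layout; the constant and function names are
# assumptions for illustration, not the documented Prenacs interface.
ID = "gc_content_basic"
VERSION = "0.1.0"  # to be increased whenever the code below changes
INPUT = "nucleotide sequence as a string"
OUTPUT = ["gc_content", "seq_length"]  # this plugin computes two attributes
PARAMETERS = [("ignore_case", "bool", "also count lowercase bases")]
METHOD = "count G and C, divide by the sequence length"
IMPLEMENTATION = "pure Python, single pass over the string"


def compute(sequence, ignore_case=True):
    """Return the values of the attributes listed in OUTPUT, in order."""
    if ignore_case:
        sequence = sequence.upper()
    length = len(sequence)
    gc_count = sum(1 for base in sequence if base in "GC")
    gc_content = gc_count / length if length else 0.0
    return [gc_content, length]
```

Keeping the metadata next to the code means that a plugin identifier and version number are enough to determine how a stored value was computed.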

Each batch computation is described by a metadata record, identified by a universally unique identifier (UUID). The record includes a reference to the plugin used (plugin identifier and version number), along with information such as the computation parameters, timestamps, the identifier of the user who started the computation, and key system data. By storing the batch computation UUID alongside the computation results, Prenacs keeps track of the data provenance.
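The link between a stored value and its provenance can be sketched as follows. The `BatchComputation` class, its field names, and the in-memory dictionaries are hypothetical stand-ins for the database tables; they only show how a UUID stored with each result leads back to the batch metadata.

```python
import platform
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class BatchComputation:
    """Hypothetical metadata record for one batch computation."""
    plugin_id: str
    plugin_version: str
    parameters: dict
    user: str
    system: str = field(default_factory=platform.platform)
    started_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )
    batch_uuid: uuid.UUID = field(default_factory=uuid.uuid4)


batch = BatchComputation(
    plugin_id="gc_content_basic",
    plugin_version="0.1.0",
    parameters={"ignore_case": True},
    user="alice",
)

# Stand-ins for the metadata table and the attribute value table.
batches = {batch.batch_uuid: batch}
results = {"genome_001": {"gc_content": (0.42, batch.batch_uuid)}}

# Provenance lookup: from a stored value back to the plugin and parameters.
value, origin = results["genome_001"]["gc_content"]
provenance = batches[origin]
```

Here `provenance.plugin_id`, `provenance.plugin_version` and `provenance.parameters` describe how `value` was obtained.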

Related libraries

Prenacs has been developed as part of an ecosystem of Python libraries, including:

  • multiplug, which implements the infrastructure for the plugin system
  • attrtables, which implements the infrastructure for the storage of attribute values and metadata
  • snacli, which is used for implementing dual-purpose scripts that can be called interactively from the command line as well as from within a Snakemake pipeline

Usage

The usage of the library is explained in the user manual.

