ETL system for interpreting laboratory instrument data files and loading them into a standardized format while enforcing schema and retaining all metadata.

lab-etl

This repository contains the ETL scripts for loading laboratory instrument data files into our database. Data files in a variety of formats are converted to Apache Parquet files, which provide a standardized interface for access and enforce a schema. Metadata is a central part of these files: it is extracted from the original test files and stored as JSON-like metadata within the Parquet files, either file-wide or column-specific, as appropriate. For common fields, the keys are standardized according to the type of file (that is, the type of instrument that produced it). Additional instrument-specific metadata is also stored, but it is not guaranteed to be standardized in any meaningful way; the names of these fields may be slightly altered to make clearer what they represent.
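The split between standardized keys and instrument-specific extras can be sketched as follows. This is only an illustration of the idea, not labetl's actual API: the key mapping, the function names, the metadata key names ("standard", "additional") and the sample header values are all hypothetical. Values are JSON-encoded because Parquet key-value metadata holds strings.

```python
import json

# Hypothetical mapping from instrument header names to standardized keys.
# The real package may use different names; these are assumptions.
STANDARD_KEYS = {
    "Operator": "operator",
    "Sample Name": "sample_name",
    "Run Date": "date_performed",
}


def split_metadata(raw_header):
    """Separate standardized fields from instrument-specific extras."""
    standardized = {}
    extras = {}
    for name, value in raw_header.items():
        if name in STANDARD_KEYS:
            standardized[STANDARD_KEYS[name]] = value
        else:
            # Extras are only lightly renamed for clarity; nothing
            # guarantees they are consistent across instruments.
            extras[name.strip().lower().replace(" ", "_")] = value
    return standardized, extras


def build_file_metadata(raw_header):
    """Build file-wide metadata as a flat dict of JSON strings."""
    standardized, extras = split_metadata(raw_header)
    return {
        "standard": json.dumps(standardized, sort_keys=True),
        "additional": json.dumps(extras, sort_keys=True),
    }


def build_column_metadata(column_units):
    """Build column-specific metadata, here just the units per column."""
    return {column: {"units": unit} for column, unit in column_units.items()}


# Made-up header values, for illustration only.
raw_header = {"Operator": "A. Tech", "Sample Name": "PMMA-01", "Purge Gas": "N2"}
file_meta = build_file_metadata(raw_header)
column_meta = build_column_metadata({"time": "s", "mass": "mg"})

print(file_meta["standard"])
print(file_meta["additional"])
print(column_meta["mass"])
```

Dictionaries of this shape can be attached when writing with a Parquet library. With pyarrow, for instance, `pa.schema(..., metadata=...)` carries the file-wide part and `pa.field(..., metadata=...)` the per-column part.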

Development currently focuses on files and instruments of interest to FSRI's Materials Properties Laboratory, but additional instruments and file types will be added as we integrate with external stakeholders or as time allows. If you need a particular capability, feel free to reach out or submit a PR.


Installation

pip install labetl

License

labetl is distributed under the terms of the MIT license.
