datalab is a research data management platform for materials science and chemistry.

Project description

datalab

datalab is a user-friendly, open-source platform that can capture all the experimental data and metadata produced in a scientific lab, targeted (broadly) at materials chemistry but with customisability and extensibility in mind. datalab records data and metadata securely and makes them accessible and reusable by both humans and machines via the web UI and API, respectively. datalab can be self-hosted, and managed deployments are also available.

You can try the demo deployment at demo.datalab-org.io and read the online documentation at docs.datalab-org.io; release notes and a changelog are available on GitHub and online.

Features:

  • Capture and store sample and device metadata
  • Attach raw data directly or sync it from laboratory instruments
  • Built-in support for multiple characterisation techniques (XRD, NMR, echem, TEM, TGA, Mass Spec, Raman and more)
  • Capture scientific context: store the graph of relationships between research objects.
  • Python API for programmatic access to your lab's data enabling custom analysis and automation.
  • Join the datalab federation: you can add your datalab to the federation for additional shared features.
  • Plugin ecosystem allowing for custom data blocks, AI integration and other instance-specific code.
  • Deployment and infrastructure automation via Ansible playbooks.
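
As a rough illustration of the programmatic access mentioned in the feature list, the sketch below builds (but does not send) an authenticated request for a single item using only the Python standard library. The endpoint path `/get-item-data/<item_id>`, the `DATALAB-API-KEY` header name, and the example URL are assumptions made for illustration and should be checked against the online documentation; in practice the separate datalab Python API package is likely the more convenient route.

```python
import json
import urllib.parse
import urllib.request


def build_item_request(api_url: str, item_id: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for one item's data.

    Assumptions (not confirmed by this page): the endpoint path
    '/get-item-data/<item_id>' and the 'DATALAB-API-KEY' header name.
    """
    safe_id = urllib.parse.quote(item_id, safe="")
    url = f"{api_url.rstrip('/')}/get-item-data/{safe_id}"
    return urllib.request.Request(
        url,
        headers={"DATALAB-API-KEY": api_key, "Accept": "application/json"},
        method="GET",
    )


def fetch_item(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON body (requires network access)."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


# Hypothetical deployment URL, item ID and key, for illustration only.
request = build_item_request("https://api.datalab.example.org/", "sample-1", "my-api-key")
```

Keeping request construction separate from sending makes the sketch easy to inspect and test without a running datalab instance.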

Note: You may be looking for the identically named project DataLab for signal processing, which also has plugins, clients and other similar concepts!

Getting started

To set up your own datalab instance or to get started with development, you can follow the installation and deployment instructions in the online documentation.

We can also provide paid managed deployments via datalab industries ltd.: contact us at hello@datalab.industries.

Design philosophy and architecture

The datalab architecture is shown below:

graph TD
classDef actor fill:#0066CC,fill-opacity:0.3,stroke:#333,stroke-width:2px,color:#000;
classDef clientInterface fill:#00AA44,fill-opacity:0.3,stroke:#333,stroke-width:2px,color:#000;
classDef coreComponent fill:#FF6600,fill-opacity:0.3,stroke:#333,stroke-width:2px,color:#000;
classDef umbrellaLabel fill:#666666,fill-opacity:0.3,stroke:#666,stroke-width:1px,color:#000,rx:5,ry:5,text-align:center;
classDef subgraphStyle fill:#f9f9f9,fill-opacity:0.1,stroke:#ccc,stroke-width:1px;

    subgraph ExternalActors [External actors]
        direction TB
        User[User]
        Machine[Machine]
    end
    class User,Machine actor;
    class ExternalActors subgraphStyle;

    UmbrellaDesc["Raw instrument data,<br>annotations, connections"]
    class UmbrellaDesc umbrellaLabel;

    subgraph ClientInterfaces [Client interfaces]
        direction TB
        BrowserApp[_datalab_<br>Browser app]
        PythonAPI[_datalab_<br>Python API]
    end
    class BrowserApp,PythonAPI clientInterface;
    class ClientInterfaces subgraphStyle;

    subgraph Backend
        direction TB
        RESTAPI[_datalab_<br>REST API]
        MongoDB[MongoDB Database]
        DataLake[Data Lake]
    end
    class RESTAPI,MongoDB,DataLake coreComponent;
    class Backend subgraphStyle;

    User      <-- "User data I/O" --> UmbrellaDesc;
    Machine   <-- "Machine data I/O" --> UmbrellaDesc;

    UmbrellaDesc <-- "_via_ GUI" --> BrowserApp;
    UmbrellaDesc <-- "_via_ scripts" --> PythonAPI;

    BrowserApp  <-- "HTTP (Data exchange)" --> RESTAPI;
    PythonAPI   <-- "API calls (Data exchange)" --> RESTAPI;

    RESTAPI <-- "Annotations, connections" --> MongoDB;
    RESTAPI <-- "Raw and structured characterisation data" --> DataLake;

    linkStyle 0 stroke:#666,stroke-width:3px
    linkStyle 1 stroke:#666,stroke-width:3px
    linkStyle 2 stroke:#666,stroke-width:3px
    linkStyle 3 stroke:#666,stroke-width:3px
    linkStyle 4 stroke:#666,stroke-width:3px
    linkStyle 5 stroke:#666,stroke-width:3px
    linkStyle 6 stroke:#666,stroke-width:3px
    linkStyle 7 stroke:#666,stroke-width:3px

    click PythonAPI "https://github.com/datalab-org/datalab-api" "datalab Python API on GitHub" _blank
    click BrowserApp "https://github.com/datalab-org/datalab/tree/main/webapp" "datalab Browser App on GitHub" _blank
    click RESTAPI "https://github.com/datalab-org/datalab/tree/main/pydatalab" "pydatalab REST API on GitHub" _blank

The main aim of datalab is to provide a platform for capturing the significant amounts of long-tail experimental data and metadata produced in a typical lab, and to enable storage, filtering and future data re-use by humans and machines. datalab is targeted (broadly) at materials chemistry labs but with customisability and extensibility in mind.

The platform provides researchers with a way to record sample- and cell-specific metadata, attach and sync raw data from instruments, and perform analysis and visualisation of many characterisation techniques in the browser (XRD, NMR, electrochemical cycling, TEM, TGA, Mass Spec, Raman).

Importantly, datalab stores a network of interconnected research objects in the lab, such that individual pieces of data are stored with the context needed to make them scientifically useful.
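
To make the idea of a network of research objects concrete, here is a minimal sketch in plain Python. It is not datalab's actual data model: the item IDs, the dictionary layout and the `provenance` helper are all hypothetical. It only shows how following relationship edges recovers the context behind a single item.

```python
from collections import deque

# Hypothetical item IDs mapped to the items they were made from.
# This is an illustrative structure, not datalab's schema.
relationships = {
    "cell-001": ["sample-A"],
    "sample-A": ["precursor-1", "precursor-2"],
    "precursor-1": [],
    "precursor-2": ["chemical-X"],
    "chemical-X": [],
}


def provenance(item_id: str, graph: dict[str, list[str]]) -> list[str]:
    """Return every item reachable from item_id in breadth-first order,
    excluding item_id itself and visiting each item at most once."""
    seen = {item_id}
    order = []
    queue = deque([item_id])
    while queue:
        current = queue.popleft()
        for parent in graph.get(current, []):
            if parent not in seen:
                seen.add(parent)
                order.append(parent)
                queue.append(parent)
    return order


ancestors = provenance("cell-001", relationships)
```

Here `ancestors` lists the sample, both precursors and the starting chemical behind the hypothetical cell, which is the kind of context that makes an isolated measurement scientifically interpretable.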

License

This software is released under the conditions of the MIT license. Please see LICENSE for the full text of the license.

Contact

We are available for consultations on setting up and managing datalab deployments, as well as collaborating on or sponsoring additions of new features and techniques. Please contact Josh or Matthew on their academic emails, or join the public datalab Slack workspace.

Contributions

This software was conceived and developed by:

with support from the group of Professor Clare Grey (University of Cambridge), and major contributions from:

plus many contributions, feedback and testing performed by other members of the community, in particular the groups of Prof Matt Cliffe (University of Nottingham) and Dr Peter Kraus (TU Berlin), and the company Matgenix SRL.

A full list of code contributions can be found on GitHub.

Funding

Contributions to datalab have been supported by a mixture of academic funding and consultancy work through datalab industries ltd.

In particular, the developers thank:

Download files

Source Distribution

datalab_server-0.6.7.tar.gz (4.6 MB)

Uploaded Source

Built Distribution

datalab_server-0.6.7-py3-none-any.whl (145.3 kB)

Uploaded Python 3

File details

Details for the file datalab_server-0.6.7.tar.gz.

File metadata

  • Download URL: datalab_server-0.6.7.tar.gz
  • Upload date:
  • Size: 4.6 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for datalab_server-0.6.7.tar.gz
Algorithm Hash digest
SHA256 76de7af7538bb97e8736585d331875f1f3e66964b33cd835d36931cc69ad4a9f
MD5 93b87958d3647f4e2f0135f20a703ff4
BLAKE2b-256 902fc2eab0dfeed15fe84f20556b7a8e9e8f9ac47c3a8c9746c1b40c7e800e10

See more details on using hashes here.
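
One way to use the published digests is to recompute the SHA256 of the downloaded file and compare it with the value listed above. The sketch below uses only the Python standard library; the function names and the local file path are illustrative assumptions.

```python
import hashlib
import hmac


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path: str, expected_hex: str) -> bool:
    """Return True if the file's SHA256 equals the published hex digest."""
    return hmac.compare_digest(sha256_of_file(path), expected_hex.strip().lower())
```

For example, assuming the source distribution was saved in the current directory, `matches_published_digest("datalab_server-0.6.7.tar.gz", "76de7af7538bb97e8736585d331875f1f3e66964b33cd835d36931cc69ad4a9f")` should return `True` for an intact download.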

Provenance

The following attestation bundles were made for datalab_server-0.6.7.tar.gz:

Publisher: release.yml on datalab-org/datalab

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file datalab_server-0.6.7-py3-none-any.whl.

File metadata

  • Download URL: datalab_server-0.6.7-py3-none-any.whl
  • Upload date:
  • Size: 145.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for datalab_server-0.6.7-py3-none-any.whl
Algorithm Hash digest
SHA256 8b0b40539b4790a069044358fce0308dfea7651a328a7eca2cab790021be45a3
MD5 9f818717cc324cb8a682c82127a0ed5c
BLAKE2b-256 4f3e12385a5091cf1ef8c3ba19178d2c5b7f0daa43da20fe50c4faa1aa462264

See more details on using hashes here.

Provenance

The following attestation bundles were made for datalab_server-0.6.7-py3-none-any.whl:

Publisher: release.yml on datalab-org/datalab

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
