
Open-source data platform for biology.


LaminDB - Open-source data platform for biology

Public beta: The API is close to stable, but some breaking changes might still occur.

LaminDB is a Python library to manage data & analyses related to biology:

  • Track & query data lineage across pipelines, notebooks & app uploads.
  • Query, validate & link data batches using biological registries & ontologies.
  • Manage features & labels schema-less or schema-full.
  • Collaborate across a mesh of LaminDB instances.

If you want a UI: LaminApp is built on LaminDB. If LaminDB ~ git, LaminApp ~ GitHub.

(LaminApp, support, integration tests & schemas for an enterprise platform are available on a paid plan - on-prem or hosted by us.)

Quickstart

Installation and sign-up take two commands: run pip install lamindb and then lamin signup <email> on the command line.

Init a LaminDB instance with local or cloud default storage like you'd init a git repository:

$ lamin init --storage ./mydata   # or s3://my-bucket, gs://my-bucket

Validate & register a DataFrame:

import lamindb as ln
import pandas as pd

ln.track()  # track run context in a notebook

df = pd.DataFrame({"feat1": [1, 2], "feat2": [3, 4], "perturbation": ["pert1", "pert2"]})

ln.File.from_df(df, description="Data batch 1").save()  # create a File object and save/upload it
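
Before registering a batch, it can help to confirm that it carries the columns you expect. The snippet below is a minimal plain-Python sketch of such a pre-check. It is not part of LaminDB and does not use its registries or ontologies; the helper name check_columns and the expected column list are assumptions made for illustration.

```python
def check_columns(columns, expected):
    """Compare actual column names against the expected ones.

    Returns a tuple (missing, unexpected), each a sorted list of names.
    Hypothetical helper for illustration; not a LaminDB function.
    """
    present = set(columns)
    wanted = set(expected)
    return sorted(wanted - present), sorted(present - wanted)


# Column names of the quickstart DataFrame; with pandas, pass list(df.columns).
columns = ["feat1", "feat2", "perturbation"]

missing, unexpected = check_columns(
    columns, expected=["feat1", "feat2", "perturbation"]
)

if missing or unexpected:
    raise ValueError(f"missing: {missing}, unexpected: {unexpected}")
```

If the check raises, fix the batch first and only then call ln.File.from_df(...).save() as shown above.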

Query & use a DataFrame:

ln.File.search("batch 1")  # run a search

file = ln.File.filter(labels="pert1").one()  # or run a query (under the hood, you have the full power of SQL)

df = file.load()

Documentation

Read the docs.


