bedrock-bio

Open-Access Computational Biology Datasets

Description

Efficiently access a curated library of open-access computational biology datasets. Datasets support predicate and projection pushdown to the cloud storage backend: row filters and column selections are applied at the storage layer, enabling quick, iterative access to otherwise massive, unwieldy datasets.

The bedrock_bio package exposes three user-facing functions:

  • list_datasets(): returns a list of available dataset identifiers
  • describe_dataset('<name>'): returns metadata, citation, and column definitions for a dataset
  • load_dataset('<name>', **filters): takes a dataset name and required partition filters, and returns a lazy DuckDB relation

DuckDB methods (filter, select, limit) can be used on the relation returned by load_dataset to push down additional row filters and column selections to the storage backend.
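As a minimal sketch of that chaining, the helper below applies filter, select, and limit to a relation such as the one load_dataset returns. The helper name significant_hits and the default threshold are illustrative assumptions, not part of bedrock_bio; the column names are taken from the example further down this page.

```python
def significant_hits(rel, min_neg_log_10_p=7.3, max_rows=100):
    """Chain DuckDB relation methods on `rel` and return the new lazy relation.

    `rel` is expected to be a DuckDB relation (for example, the object
    returned by load_dataset). The threshold default is an illustrative
    assumption (roughly -log10(5e-8)), not a library setting.
    """
    return (
        # Row filter, expressed as a SQL condition string.
        rel.filter(f"neg_log_10_p_value >= {float(min_neg_log_10_p)}")
        # Column selection, expressed as a comma-separated string.
        .select("chromosome, position, beta, neg_log_10_p_value")
        # Cap the number of rows returned.
        .limit(int(max_rows))
    )


# Hypothetical usage, assuming bedrock_bio is installed:
# import bedrock_bio as bb
# rel = bb.load_dataset('ukb_ppp.pqtls', ancestry='EUR',
#                       protein_id='A0FGR8', panel='Inflammation')
# df = significant_hits(rel, max_rows=20).fetchdf()
```

Because the relation is lazy, no rows are collected until a method such as fetchdf() is called on the result.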

Installation

To install the latest release from PyPI:

pip install bedrock-bio

Or install the current development version from GitHub:

pip install git+https://github.com/bedrock-bio/bedrock-bio-client.git@main#subdirectory=python
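To confirm which release ended up in the active environment, one option is to query the installed distribution metadata with the standard library. This is a small sketch; the helper name installed_version is an assumption made for illustration and is not provided by bedrock_bio.

```python
from importlib import metadata


def installed_version(dist_name="bedrock-bio"):
    """Return the installed version string for `dist_name`, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Prints the installed version, or None if the package is not installed.
print(installed_version())
```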

Examples

import bedrock_bio as bb

List available datasets:

bb.list_datasets()

Describe a dataset to see its metadata, citation, and columns:

bb.describe_dataset('ukb_ppp.pqtls')

Lazily load a dataset with required partition filters, select columns, and collect into an in-memory data frame:

df = bb.load_dataset('ukb_ppp.pqtls', ancestry='EUR', protein_id='A0FGR8', panel='Inflammation') \
  .select('chromosome, position, effect_allele, other_allele, beta, neg_log_10_p_value') \
  .fetchdf()
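Since the partition filters are required, pulling several proteins means one load_dataset call per protein. The sketch below shows one way to loop over them. The helper load_per_protein is hypothetical, and it takes the loader as an argument so it does not assume anything about bedrock_bio beyond the call shape shown above.

```python
def load_per_protein(load_dataset, name, protein_ids, columns, **partition_filters):
    """Load one data frame per protein ID and return them keyed by ID.

    `load_dataset` is the loader to call (for example, bb.load_dataset);
    `columns` is a comma-separated column string passed to select();
    `partition_filters` holds the remaining required filters.
    """
    frames = {}
    for protein_id in protein_ids:
        # Each call supplies the full set of required partition filters.
        rel = load_dataset(name, protein_id=protein_id, **partition_filters)
        # Select columns, then collect into an in-memory data frame.
        frames[protein_id] = rel.select(columns).fetchdf()
    return frames


# Hypothetical usage, assuming bedrock_bio is installed:
# import bedrock_bio as bb
# frames = load_per_protein(
#     bb.load_dataset, 'ukb_ppp.pqtls', ['A0FGR8'],
#     'chromosome, position, beta, neg_log_10_p_value',
#     ancestry='EUR', panel='Inflammation',
# )
```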

Dataset Requests

To request the addition of a new dataset to the library, open an issue.
