Open-Access Computational Biology Datasets

bedrock-bio

Description

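Efficiently access a curated library of open-access computational biology datasets. Datasets support predicate pushdown and column projection to the cloud storage backend, so row filters and column selections are applied at the storage layer rather than after download. This enables quick, iterative access to datasets that would otherwise be too large to work with conveniently.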

bedrock_bio consists of three user-facing functions:

  • list_datasets(): returns a list of available dataset identifiers
  • describe_dataset('<name>'): returns metadata, citation, and column definitions for a dataset
  • load_dataset('<name>', **filters): takes a dataset name and required partition filters, and returns a lazy DuckDB relation

DuckDB methods (filter, select, limit) can be used on the relation returned by load_dataset to push down additional row filters and column selections to the storage backend.
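
As a rough illustration of chaining these methods, the sketch below wraps the three calls in a small helper. The helper name narrow and its parameters are hypothetical and not part of bedrock_bio; it assumes only that the relation exposes filter, select, and limit as described above, with filter taking a SQL expression string.

```python
def narrow(relation, where=None, columns=None, limit=None):
    """Chain optional filter/select/limit calls on a lazy relation.

    Hypothetical helper, not part of bedrock_bio. Each call returns a new
    lazy relation; nothing is read until the result is collected.
    """
    if where is not None:
        # SQL boolean expression as a string, e.g. "beta > 0"
        relation = relation.filter(where)
    if columns is not None:
        # Comma-separated column list, matching the string form used with select
        relation = relation.select(", ".join(columns))
    if limit is not None:
        relation = relation.limit(limit)
    return relation


# Usage sketch (the threshold 8 is an arbitrary illustrative value):
# rel = bb.load_dataset('ukb_ppp.pqtls', ancestry='EUR', protein_id='A0FGR8', panel='Inflammation')
# top = narrow(rel, where="neg_log_10_p_value > 8",
#              columns=["chromosome", "position", "beta"], limit=10)
# df = top.fetchdf()
```

Keeping every argument optional means the same helper can be reused for a quick preview (limit only) or a narrower query (filter plus columns).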

Installation

To install the latest release from PyPI:

pip install bedrock-bio

Or install the current development version from GitHub:

pip install git+https://github.com/bedrock-bio/bedrock-bio-client.git@main#subdirectory=python

Examples

import bedrock_bio as bb

List available datasets:

bb.list_datasets()
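
Since list_datasets() returns the available identifiers, a dataset name can be validated before it is loaded. The following is a minimal sketch, assuming the identifiers are plain strings; check_dataset_name is a hypothetical helper, not a bedrock_bio function.

```python
import difflib


def check_dataset_name(name, available):
    """Return name if it is in available, else raise with close-match hints.

    Hypothetical helper; `available` is assumed to be a list of strings
    such as the one returned by bb.list_datasets().
    """
    if name in available:
        return name
    suggestions = difflib.get_close_matches(name, available, n=3)
    hint = f" Did you mean: {', '.join(suggestions)}?" if suggestions else ""
    raise KeyError(f"Unknown dataset {name!r}.{hint}")


# Usage sketch:
# check_dataset_name('ukb_ppp.pqtls', bb.list_datasets())
```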

Describe a dataset to see its metadata, citation, and columns:

bb.describe_dataset('ukb_ppp.pqtls')

Lazily load a dataset with required partition filters, select columns, and collect into an in-memory data frame:

df = bb.load_dataset('ukb_ppp.pqtls', ancestry='EUR', protein_id='A0FGR8', panel='Inflammation') \
  .select('chromosome, position, effect_allele, other_allele, beta, neg_log_10_p_value') \
  .fetchdf()
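
Row filters can be pushed down in the same way as column selections. The sketch below builds a SQL expression for a genomic window that could be passed to the relation's filter method. It assumes the chromosome column holds text values and position holds integers, which describe_dataset should confirm; region_filter and the coordinates in the usage comment are hypothetical.

```python
def region_filter(chromosome, start, end):
    """Build a SQL expression selecting rows in [start, end] on one chromosome.

    Hypothetical helper; assumes columns named `chromosome` (text) and
    `position` (integer), as in the select call above.
    """
    start, end = int(start), int(end)
    if start > end:
        raise ValueError("start must not exceed end")
    # Double any single quotes so the value is a valid SQL string literal.
    chrom = str(chromosome).replace("'", "''")
    return f"chromosome = '{chrom}' AND position BETWEEN {start} AND {end}"


# Usage sketch (coordinates are arbitrary illustrative values):
# df = bb.load_dataset('ukb_ppp.pqtls', ancestry='EUR', protein_id='A0FGR8', panel='Inflammation') \
#   .filter(region_filter('1', 1_000_000, 2_000_000)) \
#   .fetchdf()
```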

Dataset Requests

To request the addition of a new dataset to the library, open an issue.

Download files

Source Distribution

bedrock_bio-1.2.0.tar.gz (3.9 kB)

Uploaded Source

Built Distribution

bedrock_bio-1.2.0-py3-none-any.whl (5.9 kB)

Uploaded Python 3

File details

Details for the file bedrock_bio-1.2.0.tar.gz.

File metadata

  • Download URL: bedrock_bio-1.2.0.tar.gz
  • Upload date:
  • Size: 3.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: uv/0.11.1

File hashes

Hashes for bedrock_bio-1.2.0.tar.gz
Algorithm    Hash digest
SHA256       b1edaf4494f075a73b47959dff357c826bf7fcb52803eb38b7ac7620df19bf60
MD5          9fb26aa4bb82fc1e2fcaedb4d4e1ec88
BLAKE2b-256  81245a1feb681abb4273633d9b19bf8b8133f98370266cf4821810d358d51dbe
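
A downloaded file can be checked against the SHA256 digest listed above. This is a minimal sketch using Python's standard hashlib module; the function names and the local file path in the usage comment are assumptions for illustration.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_sha256(path, expected_hex):
    """Compare a file's SHA256 digest with an expected hex string."""
    return sha256_of(path) == expected_hex.strip().lower()


# Usage sketch (assumes the archive was saved in the current directory):
# matches_sha256("bedrock_bio-1.2.0.tar.gz",
#                "b1edaf4494f075a73b47959dff357c826bf7fcb52803eb38b7ac7620df19bf60")
```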

File details

Details for the file bedrock_bio-1.2.0-py3-none-any.whl.

File metadata

  • Download URL: bedrock_bio-1.2.0-py3-none-any.whl
  • Upload date:
  • Size: 5.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: uv/0.11.1

File hashes

Hashes for bedrock_bio-1.2.0-py3-none-any.whl
Algorithm    Hash digest
SHA256       924d0e692670584372c4e0a0f77ff1ebc551898f8878d94fb5bbeb3fc01b8a25
MD5          1dffe34fb02b1cc9f814ecf9233c908e
BLAKE2b-256  f6c0b89e4ecf08d3c32c4146b0a266f464adfbc839aeb3b0390baa7930c34eb5
