Biobricks automates bioinformatics data.
BioBricks
BioBricks makes loading data from biological datasets easy.
pip install biobricks
Initialize
To initialize BioBricks, you must set the BBLIB environment variable and get a user token.
1. TOKEN: register at biobricks.ai, then go to biobricks.ai/token
2. BBLIB: set this to a path on your local file system with plenty of space for large bricks
import os
import biobricks as bb
os.environ['BBLIB'] = '/some/path' # typically set this up to persist between python sessions
bb.initialize('<TOKEN>') # replace <TOKEN> with your token; see step 1 above
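Since BBLIB has to point at a usable directory before initialization, it can help to check it first. The following is a minimal sketch using only the standard library; `resolve_bblib` and its `default` parameter are hypothetical names for illustration and are not part of biobricks.

```python
import os
from pathlib import Path


def resolve_bblib(default=None):
    """Return the BBLIB directory as a Path, creating it if needed.

    Hypothetical helper, not part of biobricks: it reads the BBLIB
    environment variable and falls back to `default` when it is unset.
    """
    raw = os.environ.get("BBLIB") or default
    if not raw:
        raise RuntimeError(
            "BBLIB is not set; point it at a directory with plenty of free space"
        )
    path = Path(raw).expanduser()
    path.mkdir(parents=True, exist_ok=True)
    # Make the chosen path visible to later code in this process.
    os.environ["BBLIB"] = str(path)
    return path
```

To make the setting persist between sessions, one common option is to export BBLIB from your shell profile rather than setting it in Python.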
Pull Bricks
To download a brick and save it locally in your library, use bb.pull. An example using the Tox21 dataset:
bb.pull('tox21') # save the brick to BBLIB and download its resources
tox21 = bb.load('tox21') # load a SimpleNamespace with all the brick tables
# List the resources in the brick
for tablename in sorted(vars(tox21)):
    print(tablename)
tox21.tox21_ache_p4.to_pandas() # get a pyarrow Table and convert to pandas dataframe
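Because a loaded brick is a SimpleNamespace, tables can also be looked up by a name held in a string. The sketch below shows one way to do that with a clearer error when the name is wrong; `get_table` is a hypothetical helper, and the `demo` namespace is a stand-in for what bb.load returns (`another_table` is a placeholder name, not a real Tox21 table).

```python
from types import SimpleNamespace


def get_table(brick, name):
    """Look up a table on a loaded brick by name, with a helpful error."""
    tables = vars(brick)
    if name not in tables:
        raise KeyError(f"no table {name!r}; available: {sorted(tables)}")
    return tables[name]


# Stand-in for a loaded brick; real values would be pyarrow Tables.
demo = SimpleNamespace(tox21_ache_p4="table-a", another_table="table-b")
table = get_table(demo, "tox21_ache_p4")
```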
To list the bricks currently available, visit github.com/biobricks-ai.
How does this all work?
Installing biobricks creates a BBLIB directory containing a .cache subdirectory and many directories named by commit hash, each representing a brick. The .cache directory is managed by dvc and stores the brick assets, which are symlinked into the bricks. Each commit-hash directory is a git repo referenced by its SHA. The structure looks like this:
BBLIB/
    .cache/ # managed by dvc, stores brick assets symlinked in bricks
    74aed53360e5a278931b2f8eac0702f28fd444e4/ # a git repo
    0aeb15ffa06be6c43ec5b654f6a8ff6ea4fa2bef/ # a git repo
    ...
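Given the layout above, the bricks present in a library can be found by scanning BBLIB for directories whose names look like full git SHA-1 hashes. This is a rough sketch built only on that description; `list_brick_commits` is a hypothetical helper, not a biobricks function, and it does not verify that each directory is really a git repo.

```python
import re
from pathlib import Path

# A full git SHA-1 is 40 lowercase hexadecimal characters.
SHA1_RE = re.compile(r"[0-9a-f]{40}")


def list_brick_commits(bblib):
    """Return sorted names of subdirectories of `bblib` that look like SHA-1 hashes."""
    root = Path(bblib)
    return sorted(
        entry.name
        for entry in root.iterdir()
        if entry.is_dir() and SHA1_RE.fullmatch(entry.name)
    )
```

The .cache directory is skipped automatically because its name does not match the hash pattern.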
When writing code, you can load brick assets by repository name, with org/repo syntax, or by commit hash. For example, to load the biobricks tox21 brick you could use:
import biobricks as bb
# Load the 'latest' version of the brick
tox21 = bb.load("tox21")
tox21 = bb.load("biobricks-ai/tox21")
# Load a specific version
tox21 = bb.load("biobricks-ai/tox21/74aed53360e5a278931b2f8eac0702f28fd444e4")
tox21 = bb.load("https://github.com/biobricks-ai/tox21#74aed53360e5a278931b2f8eac0702f28fd444e4")
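To make the four reference forms above concrete, here is an illustrative parser that splits a reference into organization, repository, and optional commit. It is a sketch of the syntax only, not how biobricks resolves references internally; `BrickRef`, `parse_brick_ref`, and `DEFAULT_ORG` are hypothetical names, and the default organization is an assumption drawn from the first two examples being equivalent.

```python
import re
from typing import NamedTuple, Optional

GITHUB_PREFIX = "https://github.com/"
DEFAULT_ORG = "biobricks-ai"  # assumption: bare names resolve to this organization


class BrickRef(NamedTuple):
    org: str
    repo: str
    commit: Optional[str]  # None stands for the latest version


def parse_brick_ref(ref):
    """Split a brick reference into (org, repo, commit). Illustrative only."""
    commit = None
    if ref.startswith(GITHUB_PREFIX):
        # URL form: https://github.com/<org>/<repo>#<commit>
        path, _, fragment = ref[len(GITHUB_PREFIX):].partition("#")
        parts = path.strip("/").split("/")
        if len(parts) != 2:
            raise ValueError(f"unrecognized brick URL: {ref!r}")
        org, repo = parts
        commit = fragment or None
    else:
        # Short forms: <repo>, <org>/<repo>, or <org>/<repo>/<commit>
        parts = ref.strip("/").split("/")
        if len(parts) == 1:
            org, repo = DEFAULT_ORG, parts[0]
        elif len(parts) == 2:
            org, repo = parts
        elif len(parts) == 3:
            org, repo, commit = parts
        else:
            raise ValueError(f"unrecognized brick reference: {ref!r}")
    if commit is not None and not re.fullmatch(r"[0-9a-f]{40}", commit):
        raise ValueError(f"commit is not a full SHA-1 hash: {commit!r}")
    return BrickRef(org, repo, commit)
```

Pinning a commit hash in analysis code keeps results tied to one version of the brick, while the short forms follow whatever is latest.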
Hashes for biobricks-0.1.42-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 8bd0bf09aa7cdcdccf3f97bc9f8f435da713811b10e9e0ec0cff5da5ae3c818b
MD5 | 721af46a87d3d3503b57bb2045b8f148
BLAKE2b-256 | 6abd85290c037d8a53696ae2d5729917760350ab189b0253ff98f765bb6dd634