Flexible dataframe representation to support nested structures.
BiocFrame
This package provides the BiocFrame class, an alternative to the Pandas DataFrame. BiocFrame makes no assumptions about the types of the columns; the minimum requirement is that each column implements the length (__len__) and slice (__getitem__) dunder methods. This allows BiocFrame to accept nested representations or any supported class as columns.
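To make the column requirement concrete, here is a minimal sketch of a custom class that implements only the two dunder methods named above. The class RunLengthColumn and the helper looks_like_column are hypothetical names invented for this illustration; they are not part of biocframe, and the check shown is an assumption about what "implements length and slice" means in practice, not the library's own validation.

```python
class RunLengthColumn:
    """Hypothetical column type that stores values as (value, run length) pairs."""

    def __init__(self, values, lengths):
        if len(values) != len(lengths):
            raise ValueError("values and lengths must have the same size")
        self._values = list(values)
        self._lengths = list(lengths)

    def __len__(self):
        # Total number of rows once the runs are expanded.
        return sum(self._lengths)

    def _expand(self):
        # Expand the runs into a plain list, e.g. ["chr1", "chr1", "chr2"].
        return [v for v, n in zip(self._values, self._lengths) for _ in range(n)]

    def __getitem__(self, index):
        # Delegate integer and slice indexing to the expanded list.
        return self._expand()[index]


def looks_like_column(obj):
    """Illustrative check (not a biocframe function) for the two dunder methods."""
    return hasattr(obj, "__len__") and hasattr(obj, "__getitem__")


col = RunLengthColumn(["chr1", "chr2"], [2, 1])
print(len(col), col[1:3], looks_like_column(col))
```

Lists, ranges and tuples pass the same check, while a generator does not, since it has neither a length nor item access.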
To get started, install the package from PyPI:
pip install biocframe
Usage
To construct a BiocFrame object, provide the data as a dictionary of columns.
from biocframe import BiocFrame
obj = {
"ensembl": ["ENS00001", "ENS00002", "ENS00003"],
"symbol": ["MAP1A", "BIN1", "ESR1"],
}
bframe = BiocFrame(obj)
print(bframe)
## output
BiocFrame with 3 rows and 2 columns
ensembl symbol
<list> <list>
[0] ENS00001 MAP1A
[1] ENS00002 BIN1
[2] ENS00003 ESR1
Columns can also hold complex representations, for example a nested BiocFrame:
obj = {
"ensembl": ["ENS00001", "ENS00002", "ENS00002"],
"symbol": ["MAP1A", "BIN1", "ESR1"],
"ranges": BiocFrame({
"chr": ["chr1", "chr2", "chr3"],
"start": [1000, 1100, 5000],
"end": [1100, 4000, 5500]
}),
}
bframe2 = BiocFrame(obj, row_names=["row1", "row2", "row3"])
print(bframe2)
## output
BiocFrame with 3 rows and 3 columns
ensembl symbol ranges
<list> <list> <BiocFrame>
row1 ENS00001 MAP1A chr1:1000:1100
row2 ENS00002 BIN1 chr2:1100:4000
row3 ENS00002 ESR1 chr3:5000:5500
Properties
Properties such as the column names, row names and dimensions can be accessed directly from the BiocFrame object.
# Dimensionality or shape
print(bframe.dims)
## output
## (3, 2)
# get the column names
print(bframe.column_names)
## output
## ['ensembl', 'symbol']
Setters
Properties can also be set directly on the object:
# set new column names
bframe.column_names = ["column1", "column2"]
print(bframe)
## output
BiocFrame with 3 rows and 2 columns
column1 column2
<list> <list>
[0] ENS00001 MAP1A
[1] ENS00002 BIN1
[2] ENS00003 ESR1
To add a new column, assign it by name:
bframe["score"] = range(2, 5)
print(bframe)
## output
BiocFrame with 3 rows and 3 columns
column1 column2 score
<list> <list> <range>
[0] ENS00001 MAP1A 2
[1] ENS00002 BIN1 3
[2] ENS00003 ESR1 4
Subset BiocFrame
Use the subset ([]) operator to slice the object:
sliced = bframe[1:2, [True, False, False]]
print(sliced)
## output
BiocFrame with 1 row and 1 column
column1
<list>
[0] ENS00002
This operation accepts several input types for each axis: a boolean vector, a slice object, a list of indices, or a list of row/column names.
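As a rough illustration of how these four input types can all be reduced to integer positions, consider the sketch below. The function normalize_index is a hypothetical helper written for this example; it is not part of biocframe and makes no claim about how the library resolves indices internally.

```python
def normalize_index(index, length, names=None):
    """Convert a slice, boolean list, integer list or name list to integer positions.

    Hypothetical helper for illustration only; not a biocframe function.
    """
    if isinstance(index, slice):
        # slice.indices clips the slice to the axis length.
        return list(range(*index.indices(length)))

    index = list(index)
    if not index:
        return []

    # bool is a subclass of int, so booleans must be checked first.
    if all(isinstance(i, bool) for i in index):
        if len(index) != length:
            raise ValueError("boolean vector must match the axis length")
        return [pos for pos, keep in enumerate(index) if keep]

    if all(isinstance(i, str) for i in index):
        if names is None:
            raise ValueError("names are required to subset by name")
        return [names.index(i) for i in index]

    if all(isinstance(i, int) and not isinstance(i, bool) for i in index):
        return index

    raise TypeError("unsupported index type")


column_names = ["column1", "column2", "score"]
rows = normalize_index(slice(1, 2), 3)
cols = normalize_index([True, False, False], 3)
by_name = normalize_index(["column1"], 3, names=column_names)
print(rows, cols, by_name)
```

With the three-row, three-column frame from the example above, the slice 1:2 selects row position 1, and both the boolean vector and the name "column1" select column position 0.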
Combine
BiocFrame implements the combine generic from biocgenerics. To combine multiple objects:
bframe1 = BiocFrame(
{
"odd": [1, 3, 5, 7, 9],
"even": [0, 2, 4, 6, 8],
}
)
bframe2 = BiocFrame(
{
"odd": [11, 33, 55, 77, 99],
"even": [0, 22, 44, 66, 88],
}
)
from biocgenerics.combine import combine
combined = combine(bframe1, bframe2)
# or, using the object-oriented approach
combined = bframe1.combine(bframe2)
print(combined)
## output
BiocFrame with 10 rows and 2 columns
odd even
<list> <list>
[0] 1 0
[1] 3 2
[2] 5 4
[3] 7 6
[4] 9 8
[5] 11 0
[6] 33 22
[7] 55 44
[8] 77 66
[9] 99 88
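The row-wise behaviour shown in this output can be mimicked on plain dictionaries of lists. The sketch below is only a stand-in to show the idea: combine_rows is a hypothetical function, not the biocgenerics implementation, and it assumes that all inputs share the same column names, as in the example above.

```python
def combine_rows(*frames):
    """Row-wise combine of dict-of-list frames with identical column names.

    Hypothetical stand-in for illustration; not the biocgenerics combine.
    """
    if not frames:
        raise ValueError("at least one frame is required")

    columns = list(frames[0].keys())
    for frame in frames[1:]:
        if set(frame.keys()) != set(columns):
            raise ValueError("all frames must have the same column names")

    # Concatenate each column across frames, keeping the first frame's column order.
    return {name: [v for frame in frames for v in frame[name]] for name in columns}


first = {"odd": [1, 3, 5, 7, 9], "even": [0, 2, 4, 6, 8]}
second = {"odd": [11, 33, 55, 77, 99], "even": [0, 22, 44, 66, 88]}
combined_dict = combine_rows(first, second)
print(combined_dict["odd"])
print(combined_dict["even"])
```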
For more details, check out the BiocFrame class reference.
Note
This project has been set up using PyScaffold 4.5. For details and usage information on PyScaffold see https://pyscaffold.org/.