Interface code to interact with data from the Ovara.net biobank.

marburg_biobank

Introduction

The marburg_biobank Python module offers a high-level interface to the data sets stored in the [Ovarian Cancer Effusion Biobank and Database](https://www.ovara.net/biobank).

The basic usage is as follows:

import marburg_biobank

# You need to download this file from the biobank first.
db = marburg_biobank.OvcaBiobank("marburg_ovca_revision_15.zip")
print(db.list_datasets())

# One sample per column, one row per measured variable.
df_wide = db.get_wide('transcriptomics/rnaseq')
# One row per data point.
df_tall = db.get_dataset('transcriptomics/rnaseq')

Data formats available

wide

Using db.get_wide(dataset):

A pandas DataFrame that looks like this:

Index	Patient12, TAM	Patient12, TU	PatientX, Compartment
VariableA, unitA	23.23	112.2	nan
VariableB, unitB	3.23	12.2	12.7

Caveats: If a dataset has only one compartment, the compartment information is omitted by get_wide(), unless .get_wide(standardized=True) is used. The same applies to the unit in the index. If there is a 'name' column in the dataset, it gets added to the index, regardless of the value of standardized.
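To make the wide layout concrete, here is a minimal sketch that rebuilds the table above as a toy pandas DataFrame and selects a single compartment. The MultiIndex layout (patient, compartment on the columns; variable, unit on the index) is an assumption made for illustration; the frame actually returned by db.get_wide() may label its axes differently, so inspect df_wide.columns and df_wide.index first.

```python
import numpy as np
import pandas as pd

# Toy stand-in for a wide frame. The MultiIndex layout is an assumption,
# not necessarily what db.get_wide() returns.
columns = pd.MultiIndex.from_tuples(
    [("Patient12", "TAM"), ("Patient12", "TU"), ("PatientX", "TAM")],
    names=["patient", "compartment"],
)
index = pd.MultiIndex.from_tuples(
    [("VariableA", "unitA"), ("VariableB", "unitB")],
    names=["variable", "unit"],
)
df_wide = pd.DataFrame(
    [[23.23, 112.2, np.nan], [3.23, 12.2, 12.7]],
    index=index,
    columns=columns,
)

# Keep only the TAM samples; the compartment level is dropped from the columns.
tam_only = df_wide.xs("TAM", axis=1, level="compartment")

# Keep only variables that were measured in every remaining sample.
complete = tam_only.dropna(axis=0, how="any")
print(complete)
```

Missing measurements show up as nan, so some form of dropna or fillna is usually needed before numeric analysis.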

tall

Using db.get_dataset(dataset):

A pandas DataFrame that looks like this:

variable	unit	patient	compartment	value
variableA	unitA	Patient12	TAM	23.23
variableA	unitA	Patient12	TU	112.2
variableB	unitB	Patient13	TAM	3.23
variableB	unitB	Patient13	TU	12.2

This is the internal storage format.
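The relationship between the tall and wide formats can be sketched with plain pandas. This is not the library's own implementation of get_wide(); it only illustrates the reshaping, using the example rows from the table above.

```python
import pandas as pd

# The tall example table from above, typed in by hand.
df_tall = pd.DataFrame(
    {
        "variable": ["variableA", "variableA", "variableB", "variableB"],
        "unit": ["unitA", "unitA", "unitB", "unitB"],
        "patient": ["Patient12", "Patient12", "Patient13", "Patient13"],
        "compartment": ["TAM", "TU", "TAM", "TU"],
        "value": [23.23, 112.2, 3.23, 12.2],
    }
)

# One row per (variable, unit), one column per (patient, compartment).
# Combinations that were never measured become NaN.
df_wide = df_tall.pivot_table(
    index=["variable", "unit"],
    columns=["patient", "compartment"],
    values="value",
    aggfunc="first",
)
print(df_wide)
```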

compartments

Compartments are an abstraction on top of 'cells' and 'bio-liquid'. Examples are tumor-associated macrophages (TAMs), tumor cells (TU), ascites, and blood. db.get_compartments() provides a list.

Datasets

Datasets are organized three levels deep. The first level defines whether you're looking at ex-vivo (=primary) data, in-vitro experiments (=secondary), or literature data (=tertiary). The second level defines the *omics being measured (transcriptomics, proteomics, ... or 'clinical'), while the third level defines the actual method (RNaseq, FACS, ...).
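As an illustration of the three-level naming, the following sketch splits a fully qualified dataset name into its parts. The helper split_dataset_name and the DatasetName tuple are hypothetical and not part of marburg_biobank; the sketch assumes names are joined by "/" as in primary/clinical/survival, and it rejects names with any other number of parts.

```python
from typing import NamedTuple


class DatasetName(NamedTuple):
    level: str   # primary, secondary, or tertiary
    omics: str   # e.g. transcriptomics, proteomics, clinical
    method: str  # e.g. survival


def split_dataset_name(name: str) -> DatasetName:
    """Split 'level/omics/method' into its three components."""
    parts = name.split("/")
    if len(parts) != 3:
        raise ValueError(f"expected 'level/omics/method', got {name!r}")
    return DatasetName(*parts)


print(split_dataset_name("primary/clinical/survival"))
```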

Survival data is in primary/clinical/survival.

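Please remember: if you use lifelines (https://pypi.python.org/pypi/lifelines), censored and event are negations of each other. lifelines expects the event indicator, so a censored flag has to be inverted before fitting.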
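A minimal sketch of that inversion follows. The table and its column names (patient, months, censored) are made up for illustration and are not the actual layout of primary/clinical/survival; check the real columns before adapting it.

```python
import pandas as pd

# Hypothetical survival table; the column names are assumptions.
survival = pd.DataFrame(
    {
        "patient": ["Patient12", "Patient13", "Patient14"],
        "months": [14.0, 32.5, 7.0],
        "censored": [False, True, False],
    }
)

# lifelines wants True where the event was observed, i.e. NOT censored.
survival["event"] = ~survival["censored"]

# With lifelines installed, the fit would then read:
#   from lifelines import KaplanMeierFitter
#   KaplanMeierFitter().fit(survival["months"], event_observed=survival["event"])
print(survival)
```

Passing the censored column directly as event_observed would silently swap events and censorings, which is why the explicit negation is worth keeping visible in the code.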

Excluded patients:

Exclusion can be at either the patient level or the patient+compartment level. In addition, there is per-dataset exclusion and global exclusion.

Exclusion is by default applied to db.get_wide(), but not to db.get_dataset(). You can change the default by passing apply_exclusion=True or apply_exclusion=False.

Exclusion information can be retrieved by db.get_excluded_patients(dataset), which returns a set of patients (or patient+compartment tuples), or db.get_exclusion_reasons(), which lists why the exclusion happened.
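To show what applying such an exclusion set to tall data could look like, here is a sketch in plain pandas. The helper drop_excluded is hypothetical and is not how marburg_biobank implements apply_exclusion; it assumes the set contains patient names as strings and patient+compartment pairs as 2-tuples, as described above.

```python
import pandas as pd


def drop_excluded(df_tall, excluded):
    """Remove rows for excluded patients or (patient, compartment) pairs."""
    patients = {e for e in excluded if isinstance(e, str)}
    pairs = {e for e in excluded if isinstance(e, tuple)}
    by_patient = df_tall["patient"].isin(patients)
    by_pair = pd.Series(
        [(p, c) in pairs for p, c in zip(df_tall["patient"], df_tall["compartment"])],
        index=df_tall.index,
        dtype=bool,
    )
    return df_tall[~(by_patient | by_pair)]


df_tall = pd.DataFrame(
    {
        "variable": ["variableA"] * 4,
        "unit": ["unitA"] * 4,
        "patient": ["Patient12", "Patient12", "Patient13", "Patient13"],
        "compartment": ["TAM", "TU", "TAM", "TU"],
        "value": [23.23, 112.2, 3.23, 12.2],
    }
)

# Toy exclusion set: one whole patient and one patient+compartment pair.
excluded = {"Patient13", ("Patient12", "TU")}
kept = drop_excluded(df_tall, excluded)
print(kept)
```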

Release history

Version	Date
0.156 (this version)	Feb 23, 2022
0.155	Sep 14, 2021
0.154	May 7, 2021
0.153	May 7, 2021
0.152	May 7, 2021
0.151	Apr 21, 2021
0.150	Apr 21, 2021
0.149	Apr 14, 2021
0.148	Apr 14, 2021
0.147	Jan 29, 2021
0.146	Jan 29, 2021
0.145	Jan 29, 2021
0.144	Jan 29, 2021
0.143	Jan 29, 2021
0.142	Oct 29, 2020
0.141	Oct 28, 2020
0.140	Sep 1, 2020
0.139	Jun 9, 2020
0.138	Jun 9, 2020
0.137	Apr 28, 2020
0.135	Apr 28, 2020
0.134	Apr 22, 2020
0.133	Apr 22, 2020
0.132	Apr 22, 2020
0.131	Mar 19, 2020
0.130	Dec 9, 2019
0.129	Nov 20, 2019
0.128	Nov 20, 2019
0.127	Nov 15, 2019
0.124	Aug 27, 2019
0.122	Aug 26, 2019
0.121	May 29, 2019
0.120	May 29, 2019
0.117	May 3, 2019
0.116	May 3, 2019
0.115	Apr 11, 2018
0.114	Apr 11, 2018
0.113	Jan 9, 2018
0.112	Jan 2, 2018
0.111	Jan 2, 2018
0.109	Jan 2, 2018
0.108	Jan 2, 2018
0.107	Jan 2, 2018
0.106	Jan 2, 2018
0.105	Jan 2, 2018
0.104	Oct 9, 2017
0.103	Sep 12, 2017
0.102	Sep 12, 2017
0.101	Sep 12, 2017
0.11	Jan 2, 2018
0.1	Sep 12, 2017

Download files

Source Distributions

No source distribution files available for this release.

Built Distribution

marburg_biobank-0.156-py2.py3-none-any.whl (52.8 kB)

Uploaded Feb 23, 2022. Python 2, Python 3.

File details

Details for the file marburg_biobank-0.156-py2.py3-none-any.whl.

File metadata

Download URL: marburg_biobank-0.156-py2.py3-none-any.whl
Upload date: Feb 23, 2022
Size: 52.8 kB
Tags: Python 2, Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/44.0.0 requests-toolbelt/0.9.1 tqdm/4.50.1 CPython/3.8.10

File hashes

Hashes for marburg_biobank-0.156-py2.py3-none-any.whl
Algorithm	Hash digest
SHA256	`b664df7170aff0955e7584b6fec4c2d18a9308044918fb87e60faf210d9584d2`
MD5	`f3cc9865e35c1b59184fb1d1827ae451`
BLAKE2b-256	`d1a4031415d44b48973922402e454d5f46e099e7f3116cd1379e2ca891b452e7`
