Skip to main content

Making Gene Expression Omnibus data cleansing easy.

Project description

* GEOpurify
/Atlas of tools making Gene Expression Omnibus data amicable to machine learning./

** Installation

pip install GEOpurify

** Example Usage

#+BEGIN_SRC python :results output org drawer
from GEOpurify import GEOpurifier
g = GEOpurifier()
gds_df = g.gdspurify("GDS4376")

** Methods

*** ~filepurify(filepath, separation="\t")~

Given a path to a standard table with GEO data, returns a dataframe
with gene expression and GSM ids in separate columns.

*** ~dirpurify(dirname)~

Given a path to a directory with standard tables of data from GEO,
applies ~filepurify~ and return a combined dataframe.

*** ~gdspurify(gds_id, load_extra_features=False)~

Given a GDS id, extracts data on a platform, platform organism,
platform techonolgy type and sample organism used. If
~load_extra_features~ is set to ~True~, extra features are fetched
from the GDS columns.

Saves already processed tables corresponding to the GDS in the
directory ~data/tmp~, while storing the raw GEO data in the directory

*** ~gdspolypurify(self, gds_list_path, load_extra_features=False)~

Given a path to a file listing GDS ids, each on a new line, applies
~gdspurify~ to each and combines all the data into one dataframe.

Project details

Release history Release notifications

This version
History Node


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Filename, size & hash SHA256 hash help File type Python version Upload date
GEOpurify-0.1-py2.py3-none-any.whl (5.4 kB) Copy SHA256 hash SHA256 Wheel py2.py3
GEOpurify-0.1.tar.gz (4.3 kB) Copy SHA256 hash SHA256 Source None

Supported by

Elastic Elastic Search Pingdom Pingdom Monitoring Google Google BigQuery Sentry Sentry Error logging AWS AWS Cloud computing DataDog DataDog Monitoring Fastly Fastly CDN SignalFx SignalFx Supporter DigiCert DigiCert EV certificate StatusPage StatusPage Status page