Making Gene Expression Omnibus data cleansing easy.

* GEOpurify
/A collection of tools that make Gene Expression Omnibus (GEO) data amenable to machine learning./

** Installation

#+BEGIN_SRC sh
pip install GEOpurify
#+END_SRC

** Example Usage

#+BEGIN_SRC python :results output org drawer
from GEOpurify import GEOpurifier
g = GEOpurifier()
gds_df = g.gdspurify("GDS4376")
#+END_SRC

** Methods

*** ~filepurify(filepath, separation="\t")~

Given a path to a standard table with GEO data, returns a dataframe
with gene expression and GSM ids in separate columns.
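
As a rough illustration of the kind of reshaping described above, the
sketch below turns a tab-separated expression table into one row per
(probe, GSM id, value). It is not GEOpurify's implementation: the
~ID_REF~ column name, the ~GSM~ prefix check, and the helper
~table_to_long~ are assumptions made for illustration, and it uses only
the standard library rather than a dataframe.

#+BEGIN_SRC python
import csv
import io


def table_to_long(text, separation="\t"):
    """Reshape a wide expression table into (probe, gsm_id, value) rows."""
    reader = csv.DictReader(io.StringIO(text), delimiter=separation)
    # Assumption: sample columns are the ones whose header starts with "GSM".
    gsm_columns = [name for name in reader.fieldnames if name.startswith("GSM")]
    rows = []
    for record in reader:
        for gsm_id in gsm_columns:
            # Assumption: the probe identifier lives in a column named ID_REF.
            rows.append((record["ID_REF"], gsm_id, float(record[gsm_id])))
    return rows


# Made-up sample data, only for demonstration.
sample = "ID_REF\tGSM100\tGSM101\n1007_s_at\t5.5\t6.25\n1053_at\t3.0\t2.75\n"
long_rows = table_to_long(sample)
#+END_SRC

Keeping the GSM id in its own column means every row carries its sample
label, which is usually convenient when tables from several files are
later stacked together.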

*** ~dirpurify(dirname)~

Given a path to a directory with standard tables of data from GEO,
applies ~filepurify~ to each table and returns a combined dataframe.
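
The "apply to each file, then combine" pattern can be sketched as
follows. This is a minimal stand-in under stated assumptions, not the
library's code: ~read_table~ and ~combine_directory~ are hypothetical
helpers, and rows are plain dictionaries instead of dataframe rows so
that the example runs with the standard library alone.

#+BEGIN_SRC python
import csv
import tempfile
from pathlib import Path


def read_table(filepath, separation="\t"):
    """Hypothetical per-file step: read one table into a list of dict rows."""
    with open(filepath, newline="") as handle:
        return list(csv.DictReader(handle, delimiter=separation))


def combine_directory(dirname):
    """Apply read_table to every regular file in dirname and concatenate the rows."""
    combined = []
    # sorted() keeps the file order deterministic across platforms.
    for path in sorted(Path(dirname).iterdir()):
        if path.is_file():
            combined.extend(read_table(path))
    return combined


# Small demonstration with two throwaway files holding made-up values.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "a.tsv").write_text("ID_REF\tGSM100\n1007_s_at\t5.5\n")
    Path(tmp, "b.tsv").write_text("ID_REF\tGSM200\n1053_at\t3.0\n")
    combined_rows = combine_directory(tmp)
#+END_SRC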

*** ~gdspurify(gds_id, load_extra_features=False)~

Given a GDS id, extracts data on the platform, platform organism,
platform technology type, and sample organism used. If
~load_extra_features~ is set to ~True~, extra features are fetched
from the GDS columns.

Saves the processed tables corresponding to the GDS in the directory
~data/tmp~, and stores the raw GEO data in the directory ~data/raw~.
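
To make the two-directory layout concrete, here is a small sketch of
how raw and processed files might be kept apart. Only the directory
names ~data/raw~ and ~data/tmp~ come from the description above; the
file names, the extensions, and the helpers ~storage_paths~ and
~save_processed~ are hypothetical.

#+BEGIN_SRC python
import tempfile
from pathlib import Path


def storage_paths(gds_id, root="data"):
    """Return (raw_path, processed_path) for a GDS id under the given root."""
    # Assumption: one file per GDS id; names and extensions are illustrative only.
    raw = Path(root) / "raw" / f"{gds_id}.soft"
    processed = Path(root) / "tmp" / f"{gds_id}.tsv"
    return raw, processed


def save_processed(gds_id, table_text, root="data"):
    """Write a processed table under <root>/tmp, creating the directory if needed."""
    _, processed = storage_paths(gds_id, root)
    processed.parent.mkdir(parents=True, exist_ok=True)
    processed.write_text(table_text)
    return processed


# Demonstration inside a temporary root so nothing is written to ./data.
with tempfile.TemporaryDirectory() as tmp:
    saved = save_processed("GDS4376", "ID_REF\tGSM100\n", root=Path(tmp, "data"))
    saved_exists = saved.exists()
    saved_parts = saved.parts[-3:]
#+END_SRC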

*** ~gdspolypurify(gds_list_path, load_extra_features=False)~

Given a path to a file listing GDS ids, each on a new line, applies
~gdspurify~ to each and combines all the data into one dataframe.
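
The list file is plain text with one GDS id per line. The sketch below
shows one way such a file could be read and processed in a loop. It is
an assumption-laden illustration, not the package's code:
~read_gds_ids~ and ~purify_many~ are hypothetical helpers, ~GDS1234~ is
a placeholder id, and a stand-in function replaces the real
per-dataset step so the example runs offline.

#+BEGIN_SRC python
import tempfile
from pathlib import Path


def read_gds_ids(gds_list_path):
    """Return the non-empty, stripped lines of the file as GDS ids."""
    with open(gds_list_path) as handle:
        return [line.strip() for line in handle if line.strip()]


def purify_many(gds_list_path, purify_one):
    """Call purify_one on each id and concatenate the row lists it returns."""
    combined = []
    for gds_id in read_gds_ids(gds_list_path):
        combined.extend(purify_one(gds_id))
    return combined


with tempfile.TemporaryDirectory() as tmp:
    list_path = Path(tmp, "gds_list.txt")
    # The trailing blank line is skipped by read_gds_ids.
    list_path.write_text("GDS4376\nGDS1234\n\n")
    ids = read_gds_ids(list_path)
    # Stand-in for the real per-dataset step: returns one dict row per id.
    combined = purify_many(list_path, lambda gds_id: [{"gds": gds_id}])
#+END_SRC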


This version

0.1

Source Distribution: GEOpurify-0.1.tar.gz (4.3 kB)

Built Distribution: GEOpurify-0.1-py2.py3-none-any.whl (5.4 kB, Python 2 and Python 3)
