Making Gene Expression Omnibus data cleansing easy.
* GEOpurify
/A collection of tools that make Gene Expression Omnibus data amenable to machine learning./
** Installation
#+BEGIN_SRC sh
pip install GEOpurify
#+END_SRC
** Example Usage
#+BEGIN_SRC python :results output org drawer
from GEOpurify import GEOpurifier
g = GEOpurifier()
gds_df = g.gdspurify("GDS4376")
#+END_SRC
** Methods
*** ~filepurify(filepath, separation="\t")~
Given a path to a standard table with GEO data, returns a dataframe
with gene expression and GSM ids in separate columns.
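As an illustration of that output shape, the sketch below reshapes a
small wide table into one row per probe and sample using pandas. The
helper ~to_long_format~, the column names (~ID_REF~, ~IDENTIFIER~,
~gsm_id~, ~expression~) and the sample values are assumptions made for
this example; they are not taken from ~filepurify~ itself, whose
actual column names may differ.

#+BEGIN_SRC python :results output
import io

import pandas as pd

# A tiny tab-separated table in a wide layout: one row per probe,
# one column per sample. Ids and values are made up for illustration.
RAW = (
    "ID_REF\tIDENTIFIER\tGSM1\tGSM2\n"
    "p1\tGENE_A\t1.5\t2.5\n"
    "p2\tGENE_B\t3.0\t4.0\n"
)


def to_long_format(table, id_columns=("ID_REF", "IDENTIFIER")):
    """Hypothetical helper: return one row per (probe, sample) pair."""
    return table.melt(
        id_vars=list(id_columns),
        var_name="gsm_id",          # assumed name for the sample id column
        value_name="expression",    # assumed name for the value column
    )


wide = pd.read_csv(io.StringIO(RAW), sep="\t")
long_df = to_long_format(wide)
print(long_df)
#+END_SRC

Each sample column becomes a value in ~gsm_id~, so the expression
values and the GSM ids end up in separate columns.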
*** ~dirpurify(dirname)~
Given a path to a directory with standard tables of data from GEO,
applies ~filepurify~ and returns a combined dataframe.
*** ~gdspurify(gds_id, load_extra_features=False)~
Given a GDS id, extracts data on the platform, platform organism,
platform technology type and sample organism used. If
~load_extra_features~ is set to ~True~, extra features are fetched
from the GDS columns.
Saves the processed tables corresponding to the GDS in the
directory ~data/tmp~, and stores the raw GEO data in the directory
~data/raw~.
*** ~gdspolypurify(gds_list_path, load_extra_features=False)~
Given a path to a file listing GDS ids, each on a new line, applies
~gdspurify~ to each and combines all the data into one dataframe.
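The expected list file layout can be sketched with the standard
library alone. ~read_gds_ids~ is a hypothetical helper written for
this example, ~GDS0000~ is a placeholder id, and skipping blank lines
is an assumption rather than documented behaviour of
~gdspolypurify~.

#+BEGIN_SRC python :results output
import tempfile
from pathlib import Path


def read_gds_ids(path):
    """Hypothetical helper: return the non-empty lines of an id list file."""
    lines = Path(path).read_text().splitlines()
    # Assumption: surrounding whitespace and blank lines carry no meaning.
    return [line.strip() for line in lines if line.strip()]


with tempfile.TemporaryDirectory() as tmp:
    # One GDS id per line; GDS0000 is a placeholder, not a real dataset.
    list_path = Path(tmp) / "gds_list.txt"
    list_path.write_text("GDS4376\n\nGDS0000\n")
    ids = read_gds_ids(list_path)

print(ids)
#+END_SRC

A file laid out this way is what ~gds_list_path~ is meant to point to.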