A Python library and command-line tool for turning GEO data into gold.
geo-alchemy
- why geo-alchemy
- installation
- usage
  - 3.1 parse metadata from GEO
    - 3.1.1 platform
    - 3.1.2 sample
    - 3.1.3 series
  - 3.2 serialization and deserialization
why geo-alchemy
GEO (Gene Expression Omnibus) is like a gold mine that holds a huge amount of gold ore. But processing that ore (GEO series) into gold (expression matrices, clinical data) is not easy:
- how to map microarray probes to genes?
- what to do when multiple probes map to the same gene?
- how to get clinical data?
- ...
geo-alchemy was created to deal with these problems.
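To make the multiple-probes question concrete, here is a minimal sketch of one common approach: averaging the values of all probes that map to the same gene. It is not taken from geo-alchemy; the function name `collapse_probes`, the probe IDs, the gene names, and the values are all hypothetical toy data.

```python
from collections import defaultdict
from statistics import mean


def collapse_probes(probe_values, probe_to_gene):
    """Average the values of all probes that map to the same gene.

    Both arguments are plain dicts; probes without a gene mapping are skipped.
    """
    grouped = defaultdict(list)
    for probe, value in probe_values.items():
        gene = probe_to_gene.get(probe)
        if gene is not None:
            grouped[gene].append(value)
    return {gene: mean(values) for gene, values in grouped.items()}


# Toy data (hypothetical probe IDs, gene names and expression values).
probe_values = {'probe_a': 2.0, 'probe_b': 4.0, 'probe_c': 5.0, 'probe_d': 1.0}
probe_to_gene = {'probe_a': 'GENE1', 'probe_b': 'GENE1', 'probe_c': 'GENE2'}

gene_values = collapse_probes(probe_values, probe_to_gene)
print(gene_values)
```

Averaging is only one option; taking the maximum or the median per gene are equally plausible choices, and which one fits depends on the analysis.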
installation
pip install geo-alchemy
usage
parse metadata from GEO
platform
from geo_alchemy import PlatformParser
parser = PlatformParser.from_accession('GPL570')
platform1 = parser.parse()
# or
platform2 = PlatformParser.from_accession('GPL570').parse()
print(platform1 == platform2)
sample
from geo_alchemy import SampleParser
parser = SampleParser.from_accession('GSM1885279')
sample1 = parser.parse()
# or
sample2 = SampleParser.from_accession('GSM1885279').parse()
print(sample1 == sample2)
series
from geo_alchemy import SeriesParser
parser = SeriesParser.from_accession('GSE73091')
series1 = parser.parse()
# skip parsing samples; the samples attribute will be an empty list
series2 = parser.parse(parse_samples=False)
print(series2.samples == [])
# or
series3 = SeriesParser.from_accession('GSE73091').parse()
print(series1 == series3)
Additional computed attributes can be accessed on a parsed series:
print(series1.sample_count) # number of samples
print(series1.platforms) # platforms, with duplicates removed
print(series1.organisms) # organisms, with duplicates removed
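As an illustration of what duplicate removal means for these attributes, the sketch below derives the same kind of values from a plain list of sample records. This is an assumption about the behaviour, not geo-alchemy's implementation; the record layout and the helper `unique_in_order` are hypothetical.

```python
def unique_in_order(items):
    """Drop duplicates while keeping first-seen order."""
    return list(dict.fromkeys(items))


# Hypothetical sample records; real geo-alchemy objects are richer than this.
samples = [
    {'accession': 'sample_1', 'platform': 'GPL570', 'organism': 'Homo sapiens'},
    {'accession': 'sample_2', 'platform': 'GPL570', 'organism': 'Homo sapiens'},
    {'accession': 'sample_3', 'platform': 'GPL96', 'organism': 'Homo sapiens'},
]

sample_count = len(samples)
platforms = unique_in_order(s['platform'] for s in samples)
organisms = unique_in_order(s['organism'] for s in samples)

print(sample_count)
print(platforms)
print(organisms)
```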
serialization and deserialization
To make saving convenient, every geo-alchemy object can be converted to a dict, and that dict can be written directly to a file as JSON.
geo-alchemy also provides methods to convert such dicts back into objects.
from geo_alchemy import SeriesParser
series1 = SeriesParser.from_accession('GSE73091').parse()
data = series1.to_dict()
series2 = SeriesParser.parse_dict(data)
print(series1 == series2)
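The step of saving the dict to a JSON file is not shown above, so here is a minimal sketch using only the standard library. The helpers `save_dict` and `load_dict` are hypothetical names, and the `data` dict is a stand-in with made-up keys so the snippet runs without network access or geo-alchemy installed.

```python
import json
import os
import tempfile


def save_dict(data, path):
    """Write a JSON-serializable dict to a file."""
    with open(path, 'w', encoding='utf-8') as f:
        json.dump(data, f, ensure_ascii=False, indent=2)


def load_dict(path):
    """Read a dict back from a JSON file."""
    with open(path, encoding='utf-8') as f:
        return json.load(f)


# Stand-in for the result of to_dict(); the keys here are hypothetical.
data = {'accession': 'GSE73091', 'samples': []}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'series.json')
    save_dict(data, path)
    restored = load_dict(path)

print(restored == data)
```

With geo-alchemy installed, `data` would come from `series1.to_dict()`, and `restored` could be passed to `SeriesParser.parse_dict` as in the example above.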