geo-alchemy

a Python library and command line tool to make GEO data into gold.

  1. why geo-alchemy
  2. installation
  3. usage

why geo-alchemy

GEO (Gene Expression Omnibus) is like a gold mine that holds a huge amount of gold ore. But turning that ore (GEO series) into gold (expression matrices, clinical data) is not easy:

  1. how do you map microarray probes to genes?
  2. what if multiple probes map to the same gene?
  3. how do you get the clinical data?
  4. ...

geo-alchemy was created to handle these problems.
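
To make the multiple-probes question concrete, here is a minimal sketch of one common strategy: averaging the values of all probes that map to the same gene. It only illustrates the problem and is not necessarily how geo-alchemy handles it. The function name `collapse_probes`, the probe IDs, the gene names and the values are all made up for the example.

```python
from collections import defaultdict
from statistics import mean


def collapse_probes(probe_values, probe_to_gene):
    """Average the values of all probes that map to the same gene."""
    grouped = defaultdict(list)
    for probe, value in probe_values.items():
        gene = probe_to_gene.get(probe)
        if gene is None:
            # probe has no gene annotation, so it is skipped
            continue
        grouped[gene].append(value)
    return {gene: mean(values) for gene, values in grouped.items()}


# hypothetical annotation and expression values for a single sample
probe_to_gene = {"probe_a": "GENE1", "probe_b": "GENE1", "probe_c": "GENE2"}
probe_values = {"probe_a": 2.0, "probe_b": 4.0, "probe_c": 5.0, "probe_d": 9.0}

gene_values = collapse_probes(probe_values, probe_to_gene)
print(gene_values)
```

Other strategies (keeping the probe with the highest value, or the median) fit the same shape; only the aggregation in the last line of the function changes.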

installation

pip install geo-alchemy

usage

parse metadata from GEO

parse platform

from geo_alchemy import PlatformParser


parser = PlatformParser.from_accession('GPL570')
platform1 = parser.parse()


# or
platform2 = PlatformParser.from_accession('GPL570').parse()


print(platform1 == platform2)

parse sample

from geo_alchemy import SampleParser


parser = SampleParser.from_accession('GSM1885279')
sample1 = parser.parse()

# or
sample2 = SampleParser.from_accession('GSM1885279').parse()

print(sample1 == sample2)

parse series

from geo_alchemy import SeriesParser


parser = SeriesParser.from_accession('GSE73091')
series1 = parser.parse()


# skip sample parsing; the samples attribute will be an empty list
series2 = parser.parse(parse_samples=False)
print(series2.samples == [])


# or
series3 = SeriesParser.from_accession('GSE73091').parse()


print(series1 == series3)
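
The three parsers above each expect a different kind of accession: `GPL` for platforms, `GSM` for samples and `GSE` for series. As a small sketch, a helper like the one below could decide which parser an accession belongs to. `parser_name_for` is a hypothetical helper, not part of geo-alchemy; it returns the parser's name as a string so the example runs without the library or network access.

```python
import re

# GEO accession prefixes mapped to the parser classes named in this README
PARSER_NAME_BY_PREFIX = {
    "GPL": "PlatformParser",
    "GSM": "SampleParser",
    "GSE": "SeriesParser",
}

ACCESSION_PATTERN = re.compile(r"^(GPL|GSM|GSE)\d+$")


def parser_name_for(accession):
    """Return the name of the parser class that matches a GEO accession."""
    match = ACCESSION_PATTERN.match(accession.strip().upper())
    if match is None:
        raise ValueError(f"unsupported GEO accession: {accession!r}")
    return PARSER_NAME_BY_PREFIX[match.group(1)]


print(parser_name_for("GPL570"))
print(parser_name_for("GSM1885279"))
print(parser_name_for("GSE73091"))
```

With geo-alchemy installed, the returned name could be resolved with `getattr(geo_alchemy, name)` and followed by `.from_accession(accession).parse()`, as in the examples above.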

additional computed attributes can be accessed with:

print(series1.sample_count)  # number of samples
print(series1.platforms)  # platforms, duplicates removed
print(series1.organisms)  # organisms, duplicates removed
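
To illustrate what "computed" means here, the sketch below derives the same three attributes from a list of samples. It is an assumption about the general idea, not geo-alchemy's actual implementation: `SketchSample` and `SketchSeries` are hypothetical stand-in classes, and the accessions in the demo data are placeholders.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SketchSample:
    accession: str
    platform: str
    organism: str


@dataclass
class SketchSeries:
    accession: str
    samples: List[SketchSample] = field(default_factory=list)

    @property
    def sample_count(self):
        return len(self.samples)

    @property
    def platforms(self):
        # dict.fromkeys drops duplicates while keeping first-seen order
        return list(dict.fromkeys(s.platform for s in self.samples))

    @property
    def organisms(self):
        return list(dict.fromkeys(s.organism for s in self.samples))


# placeholder data, not taken from GEO
demo = SketchSeries("GSE0", [
    SketchSample("GSM1", "GPL570", "Homo sapiens"),
    SketchSample("GSM2", "GPL570", "Homo sapiens"),
    SketchSample("GSM3", "GPL96", "Homo sapiens"),
])

print(demo.sample_count)
print(demo.platforms)
print(demo.organisms)
```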

serialization and deserialization

For convenient storage, every object in geo-alchemy can be converted to a dict, and that dict can be saved directly to a file as JSON.

geo-alchemy also provides methods to convert these dicts back into objects.

from geo_alchemy import SeriesParser


series1 = SeriesParser.from_accession('GSE73091').parse()
data = series1.to_dict()
series2 = SeriesParser.parse_dict(data)


print(series1 == series2)
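
Saving such a dict to disk needs only the standard `json` module. The sketch below round-trips a dict through a JSON file. The dict is a stand-in for the result of `to_dict()`: its keys and values are hypothetical, since the real structure is not shown here, and `save_dict` and `load_dict` are illustrative helper names rather than geo-alchemy functions.

```python
import json
import os
import tempfile


def save_dict(data, path):
    """Write a JSON-serializable dict to a file."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(data, handle, indent=2)


def load_dict(path):
    """Read a dict back from a JSON file."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


# stand-in for series1.to_dict(); the real keys may differ
data = {"accession": "GSE73091", "title": "placeholder title", "samples": []}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "series.json")
    save_dict(data, path)
    restored = load_dict(path)

print(restored == data)
```

The loaded dict could then be passed to `SeriesParser.parse_dict` as shown above to rebuild the series object.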
