a Python library and command line tool to make GEO data into gold.

geo-alchemy

  1. why geo-alchemy
  2. installation
  3. use as Python library
  4. use as command line software

why geo-alchemy

GEO is like a gold mine that holds a huge amount of gold ore. But refining this ore (GEO series) into gold (expression matrices, clinical data) is not easy:

  1. how to map microarray probes to genes?
  2. what to do when multiple probes map to the same gene?
  3. how to get clinical data?
  4. ...

geo-alchemy was born to deal with these problems.

installation

If you only want to use it as a Python library:

pip install geo-alchemy

If you also want to use it as command line software:

pip install 'geo-alchemy[cmd]'

use as Python library

parse metadata from GEO

parse platform

from geo_alchemy import PlatformParser


parser = PlatformParser.from_accession('GPL570')
platform1 = parser.parse()


# or
platform2 = PlatformParser.from_accession('GPL570').parse()


print(platform1 == platform2)

# get platform annotation data
platform = PlatformParser.from_accession('GPL570', view='full').parse()
print(platform.internal_data)

parse sample

from geo_alchemy import SampleParser


parser = SampleParser.from_accession('GSM1885279')
sample1 = parser.parse()

# or
sample2 = SampleParser.from_accession('GSM1885279').parse()

print(sample1 == sample2)

parse series

from geo_alchemy import SeriesParser


parser = SeriesParser.from_accession('GSE73091')
series1 = parser.parse()

# or
series2 = SeriesParser.from_accession('GSE73091').parse()


print(series1 == series2)
print(series1.platforms)
print(series1.samples)
print(series1.organisms)

serialization and deserialization

To make saving convenient, every object in geo-alchemy can be converted to a dict, and this dict can be saved directly to a file as JSON.

geo-alchemy also provides methods to convert these dicts back into objects.

from geo_alchemy import SeriesParser


series1 = SeriesParser.from_accession('GSE73091').parse()
data = series1.to_dict()
series2 = SeriesParser.parse_dict(data)


print(series1 == series2)
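
The round trip through a JSON file can be sketched with the standard library alone. This is a minimal sketch, not part of geo-alchemy: `save_dict` and `load_dict` are hypothetical helper names, and the `example` dict is a made-up stand-in for the result of `to_dict()`, whose real keys may differ.

```python
import json
import tempfile
from pathlib import Path


def save_dict(data: dict, path: Path) -> None:
    """Write a plain dict to `path` as UTF-8 encoded JSON."""
    path.write_text(json.dumps(data, indent=2, ensure_ascii=False), encoding="utf-8")


def load_dict(path: Path) -> dict:
    """Read a JSON file back into a plain dict."""
    return json.loads(path.read_text(encoding="utf-8"))


# Made-up stand-in for `series.to_dict()`; the real structure may differ.
example = {"accession": "GSE73091", "platforms": [{"accession": "GPL570"}]}

with tempfile.TemporaryDirectory() as tmp:
    json_path = Path(tmp) / "GSE73091.json"
    save_dict(example, json_path)
    restored = load_dict(json_path)

print(restored == example)
```

With geo-alchemy installed, the same helpers would presumably be used as `save_dict(series1.to_dict(), path)` followed by `SeriesParser.parse_dict(load_dict(path))`, assuming the dict contains only JSON-compatible values as described above.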

use as command line software

preprocessing (microarray series only)

geo-alchemy pp -s GSE174772 -p GPL570 -g 11

  1. -s GSE174772 selects the series to preprocess, here GSE174772
  2. -p GPL570 restricts preprocessing to the samples of GSE174772 that use platform GPL570
  3. -g 11 means that column No. 11 of the GPL570 annotation file holds the gene

This command generates 2 files in the current directory:

  1. clinical file GSE174772_clinical.txt
  2. gene expression file GSE174772_expression.txt
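
To illustrate what a gene column index like -g 11 refers to, and one common way to handle several probes that map to the same gene, here is a minimal sketch. It is not geo-alchemy's implementation: the function names, the toy annotation table, the probe IDs, and the choice of averaging are all assumptions made for illustration, and the column index is assumed to be 1-based.

```python
import csv
import io
from collections import defaultdict
from statistics import mean


def probe_to_gene(annotation_tsv: str, gene_column: int) -> dict:
    """Map probe ID (first column) to gene, using a 1-based gene column index."""
    reader = csv.reader(io.StringIO(annotation_tsv), delimiter="\t")
    next(reader)  # skip the header row
    mapping = {}
    for row in reader:
        if len(row) < gene_column:
            continue  # row too short to contain the gene column
        gene = row[gene_column - 1].strip()
        if gene:  # probes without a gene annotation are dropped
            mapping[row[0]] = gene
    return mapping


def collapse_by_gene(expression: dict, mapping: dict) -> dict:
    """Average the values of all probes that map to the same gene."""
    grouped = defaultdict(list)
    for probe, value in expression.items():
        gene = mapping.get(probe)
        if gene is not None:
            grouped[gene].append(value)
    return {gene: mean(values) for gene, values in grouped.items()}


# Toy annotation table: the gene sits in column 3 here (column 11 in the command above).
annotation = (
    "ID\tGB_ACC\tGene Symbol\n"
    "p1\tX1\tTP53\n"
    "p2\tX2\tTP53\n"
    "p3\tX3\t\n"
    "p4\tX4\tEGFR\n"
)
expression = {"p1": 2.0, "p2": 4.0, "p3": 9.0, "p4": 5.0}

mapping = probe_to_gene(annotation, gene_column=3)
collapsed = collapse_by_gene(expression, mapping)
print(collapsed)
```

Averaging is only one option; taking the maximum or the median per gene are equally common choices, and the source does not state which strategy the command uses.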
