Downloads data and metadata from GEO and SRA and creates standard PEPs.


geofetch is a command-line tool that downloads sequencing data and metadata from GEO and SRA and creates metadata tables in standard PEP format. geofetch is hosted on PyPI. You can convert the output of geofetch into unmapped BAM or FASTQ files with the included sraconvert command.

Key geofetch features:

  • Works with GEO and SRA metadata
  • Combines samples from different projects
  • Standardizes output metadata
  • Filters processed files (from GEO) by type and size before downloading them
  • Easy to use
  • Fast execution time
  • Can search GEO to find relevant data
  • Can be used either as a command-line tool or from within Python using an API
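As a rough illustration of the command-line side, the sketch below assembles a geofetch invocation from Python and runs it only if the executable is on the PATH. The helper names (build_geofetch_command, run_geofetch) are hypothetical, and the flags shown (-i for the accession, --just-metadata, -m for the metadata folder) reflect our reading of the geofetch documentation; confirm them with geofetch --help for your installed version. GSE95654 is used purely as an example accession.

```python
import shutil
import subprocess
from typing import List, Optional


def build_geofetch_command(
    accession: str,
    metadata_dir: Optional[str] = None,
    just_metadata: bool = True,
) -> List[str]:
    """Assemble a geofetch command line as an argument list (hypothetical helper).

    The flags are assumptions based on the geofetch docs; verify them with
    `geofetch --help` before relying on this.
    """
    cmd = ["geofetch", "-i", accession]
    if just_metadata:
        # Assumed flag: fetch metadata only and skip the raw data download.
        cmd.append("--just-metadata")
    if metadata_dir:
        # Assumed flag: folder where the metadata / PEP files are written.
        cmd += ["-m", metadata_dir]
    return cmd


def run_geofetch(accession: str, metadata_dir: Optional[str] = None) -> int:
    """Run geofetch for one accession and return its exit code."""
    if shutil.which("geofetch") is None:
        raise RuntimeError("geofetch executable not found on PATH")
    cmd = build_geofetch_command(accession, metadata_dir)
    return subprocess.run(cmd, check=False).returncode


if __name__ == "__main__":
    # Only prints the command; call run_geofetch() to actually execute it.
    print(" ".join(build_geofetch_command("GSE95654", metadata_dir="metadata")))
```

Building the argument list separately from running it keeps the command easy to inspect or log before any download starts.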

Docs


Documentation: https://pep.databio.org/geofetch/

Source Code: https://github.com/pepkit/geofetch/


Installation

To install geofetch use this command:

pip install geofetch

or install the latest version from the GitHub repository:

pip install git+https://github.com/pepkit/geofetch.git
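After installing, it can be useful to confirm which version ended up in the active environment. The following is a minimal sketch using only the Python standard library; the helper name installed_version is hypothetical and not part of geofetch.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str = "geofetch") -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    print(f"geofetch {found}" if found else "geofetch is not installed")
```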

All GEO projects (GSE + GSM) in PEP format.

All GEO projects are available on PEPhub under the geo namespace: https://pephub.databio.org/geo/ . You can search for GEO projects using the search bar, or download an archive of all GEO PEPs from the archive section of the namespace: https://pephub.databio.org/geo?view=archive

How to cite:

https://doi.org/10.1093/bioinformatics/btad069

@article{10.1093/bioinformatics/btad069,
    author = {Khoroshevskyi, Oleksandr and LeRoy, Nathan and Reuter, Vincent P and Sheffield, Nathan C},
    title = "{GEOfetch: a command-line tool for downloading data and standardized metadata from GEO and SRA}",
    journal = {Bioinformatics},
    volume = {39},
    number = {3},
    pages = {btad069},
    year = {2023},
    month = {03},
    abstract = "{The Gene Expression Omnibus has become an important source of biological data for secondary analysis. However, there is no simple, programmatic way to download data and metadata from Gene Expression Omnibus (GEO) in a standardized annotation format. To address this, we present GEOfetch—a command-line tool that downloads and organizes data and metadata from GEO and SRA. GEOfetch formats the downloaded metadata as a Portable Encapsulated Project, providing universal format for the reanalysis of public data. GEOfetch is available on Bioconda and the Python Package Index (PyPI).}",
    issn = {1367-4811},
    doi = {10.1093/bioinformatics/btad069},
    url = {https://doi.org/10.1093/bioinformatics/btad069},
    eprint = {https://academic.oup.com/bioinformatics/article-pdf/39/3/btad069/49407404/btad069.pdf},
}


Download files

Source Distribution

geofetch-0.12.11.tar.gz (37.0 kB)

Uploaded Source

Built Distribution

geofetch-0.12.11-py3-none-any.whl (42.9 kB)

Uploaded Python 3

File details

Details for the file geofetch-0.12.11.tar.gz.

File metadata

  • Download URL: geofetch-0.12.11.tar.gz
  • Upload date:
  • Size: 37.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for geofetch-0.12.11.tar.gz
Algorithm    Hash digest
SHA256       7a32b15ae87d654e33969333334ce8f0070f6b3663d1461a70b40fb7d7bd0422
MD5          ea4bf4086f127f8a956c4f89a503837b
BLAKE2b-256  492426676d0d01be39ba5127c6c4e6001c0a7be38602a9e5b53e1ce500fbe17c
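A published digest like the ones listed here can be checked against a locally downloaded file. The sketch below is one way to do that with the standard library; the helper names (sha256_of, matches_published_digest) are hypothetical, and the local file path is an assumption about where you saved the download.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    # SHA256 value copied from the listing for geofetch-0.12.11.tar.gz.
    expected = "7a32b15ae87d654e33969333334ce8f0070f6b3663d1461a70b40fb7d7bd0422"
    # Assumed location: the archive saved in the current directory.
    local_file = Path("geofetch-0.12.11.tar.gz")
    if local_file.exists():
        print("digest matches:", matches_published_digest(local_file, expected))
    else:
        print(f"{local_file} not found; download it first")
```

Reading in chunks keeps memory use flat regardless of file size, which matters little for a 37 kB archive but makes the helper reusable for larger downloads.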


Provenance

The following attestation bundles were made for geofetch-0.12.11.tar.gz:

Publisher: python-publish.yml on pepkit/geofetch

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file geofetch-0.12.11-py3-none-any.whl.

File metadata

  • Download URL: geofetch-0.12.11-py3-none-any.whl
  • Upload date:
  • Size: 42.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for geofetch-0.12.11-py3-none-any.whl
Algorithm    Hash digest
SHA256       934fbf806af3062e97f75957b5d06cb980583354868f41edf60a8da3f3d455ce
MD5          f111b182d23580de06e8ded976497e86
BLAKE2b-256  7cd3129768e3f9a2487b01426d44b6849529f278429a7c2ad3af942257f2ab75


Provenance

The following attestation bundles were made for geofetch-0.12.11-py3-none-any.whl:

Publisher: python-publish.yml on pepkit/geofetch

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
