Skip to main content

a command-line tool for downloading genome data from UCSC ftp site.

Project description

pygenomes (formally named pygenomes) is a Python module and also a command- line tool for downloading genome data from UCSC ftp site.

Using this tool you can easily check available taxa or available groups of taxa by simply typing the name of the tool or add -g flag with a group name.

Download genome or chromosome data with this tool also very simple. You can download data by using commom name, scientific name, or even specific assembly name (e.g. human, Homo sapiens, hg38).

If there is available data in the ftp site, you also can downlaod data via interactive mode, which will allow you to choose specific genome assembly.To do this, you only need to assign a True value for assembly, then according to the available list prompted to you simply input the interested assembly.

The tool only tested on Python2.7, may not work on python3.

Features

  • Install a Python module named pygenomes.py

  • Install a Python command-line script named pygenomes

  • Easily check available taxa and genomes in commend-line

  • Simply download genomes via comman, scientific, or assembly name

  • Using interactive mode, simply download specific assembly genome

  • Easily download genome and chromosome in specific file format

  • Pure Python module and without any third-party dependence

Examples

You can use pygenomes as a Python module e.g. like in the following interactive Python session (the function’s signature might still change a bit in the future):

>>> from pygenomes import genomes
>>>
# prints out available taxa in group of mammals
>>> genomes(group='mammals')
# download human reference genome (hg19) in .2bit format
>>> genomes(taxa='human')
# interactive mode, ask for inputing and downloading specific assembly
>>> genomes(taxa='cow', assembly='1')
# try to download chromosome data, per fa.gz file per chromosome
>>> genomes(taxa='cow', chrs='1')

In addition there is a script named pygenomes, which can be used more easily from the system command-line like this (you can see many more examples when typing pygenomes -h on the command-line):

$ pygenomes -g -1                 # prints out all available taxa

$ pygenomes -g mammal             # prints out all available taxa in group
                                    of mammal
$ pygenomes -t yeast              # download yeast reference genome in
                                    .2bit format
$ pygenomes -t yeast -a 1         # interactive mode, ask for inputing and
                                    downloading specific assembly
$ pygenomes -t cow -f fa.gz       # try to download cow reference genome
                                    in fa.gz format
$ pygenomes -t dog -a 1 -o /tmp   # interactive mode, ask for inputing,
                                    file will be stored in /tmp
$ pygenomes -t cat -c 1           # download cat chromosome data, per
                                    chromosome per fa.gz file

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pygenomes-0.1.0.tar.gz (12.8 kB view hashes)

Uploaded Source

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page