Genomes in Python
Project description
Easily install and use genomes in Python and elsewhere!
Installation
Install with pip; this is the only supported method for now.
$ pip install genomepy
Usage
>>> import genomepy
>>> for row in genomepy.search("human"):
...     print("\t".join(row))
...
UCSC hg38 Human Dec. 2013 (GRCh38/hg38) Genome at UCSC
UCSC hg19 Human Feb. 2009 (GRCh37/hg19) Genome at UCSC
UCSC hg18 Human Mar. 2006 (NCBI36/hg18) Genome at UCSC
UCSC hg17 Human May 2004 (NCBI35/hg17) Genome at UCSC
UCSC hg16 Human July 2003 (NCBI34/hg16) Genome at UCSC
Ensembl bacteria_102_collection_core_34_87_1 Brucella melitensis (GCA_000988815)
Ensembl bacteria_94_collection_core_34_87_1 Brucella suis (GCA_000875695)
Ensembl bacteria_131_collection_core_34_87_1 Candidatus Paraburkholderia schumannianae
Ensembl homo_sapiens_core_86_38 Human
Ensembl pediculus_humanus_core_34_87_2 Pediculus humanus
>>> genomepy.install_genome("hg38", "UCSC", "/data/genomes")
downloading...
done...
name: hg38
fasta: /data/genomes/hg38/hg38.fa
>>> g = genomepy.genome("hg38", "/data/genomes")
>>> g["chr6"][166502000:166503000]
tgtatggtccctagaggggccagagtcacagagatggaaagtggatggcgggtgccgggggctggggagctactgtgcagggggacagagctttagttctgcaagatgaaacagttctggagatggacggtggggatgggggcccagcaatgggaacgtgcttaatgccactgaactgggcacttaaacgtggtgaaaactgtaaaagtcatgtgtatttttctacaattaaaaaaaATCTGCCACAGAGTTAAAAAAATAACCACTATTTTCTGGAAATGGGAAGGAAAAGTTACAGCATGTAATTAAGATGACAATTTATAATGAACAAGGCAAATCTTTTCATCTTTGCCTTTTGGGCATATTCAATCTTTGCCCAGAATTAAGCACCTTTCAAGATTAATTCTCTAATAATTCTAGTTGAACAACACAACCTTTTCCTTCAAGCTTGCAATTAAATAAGGCTATTTTTAGCTGTAAGGATCACGCTGACCTTCAGGAGCAATGAGAACCGGCACTCCCGGCCTGAGTGGATGCACGGGGAGTGTGTCTAACACACAGGCGTCAACAGCCAGGGCCGCACGAGGAGGAGGAGTGGCAACGTCCACACAGACTCACAACACGGCACTCCGACTTGGAGGGTAATTAATACCAGGTTAACTTCTGGGATGACCTTGGCAACGACCCAAGGTGACAGGCCAGGCTCTGCAATCACCTCCCAATTAAGGAGAGGCGAAAGGGGACTCCCAGGGCTCAGAGCACCACGGGGTTCTAGGTCAGACCCACTTTGAAATGGAAATCTGGCCTTGTGCTGCTGCTCTTGTGGGGAGACAGCAGCTGCGGAGGCTGCTCTCTTCATGGGATTACTCTGGATAAAGTCTTTTTTGATTCTACgttgagcatcccttatctgaaatgcctgaaaccggaagtgtttaggatttggggattttgcaatatttacttatatataatgagatatcttggagatgggccacaa
The genomepy.genome() function returns a pyfaidx.Fasta object. See the pyfaidx documentation for more examples of how to use it.
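Python slices are 0-based and half-open, so the slice 166502000:166503000 in the example above covers 1,000 bases, corresponding to the 1-based region chr6:166502001-166503000. The lowercase stretches in the output are presumably soft-masked repeats, as is usual for UCSC FASTA files. The following is a minimal sketch of both points. It works on plain strings, so it runs without a downloaded genome. The helper names slice_to_region and softmasked_fraction are hypothetical and are not part of genomepy or pyfaidx.

```python
def slice_to_region(chrom, start, end):
    """Convert a 0-based, half-open slice to a 1-based, inclusive region string."""
    if start < 0 or end <= start:
        raise ValueError("expected 0 <= start < end")
    return "{}:{}-{}".format(chrom, start + 1, end)


def softmasked_fraction(seq):
    """Return the fraction of lowercase (soft-masked) characters in seq."""
    seq = str(seq)  # also accepts sequence objects that convert to str
    if not seq:
        return 0.0
    return sum(1 for base in seq if base.islower()) / len(seq)


# Hypothetical usage with the coordinates from the example above.
print(slice_to_region("chr6", 166502000, 166503000))
print(softmasked_fraction("acgtACGTAC"))
```

Passing the sliced sequence from the session above through softmasked_fraction should work the same way, assuming the returned object converts to its bases with str().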
Command line
$ genomepy
Usage: genomepy [OPTIONS] COMMAND [ARGS]...

Options:
  --help  Show this message and exit.

Commands:
  genomes    list available genomes
  install    install genome
  providers  list available providers
  search     search for genomes
List available genomes.
$ genomepy genomes -p UCSC
UCSC hg38 Human Dec. 2013 (GRCh38/hg38) Genome at UCSC
UCSC hg19 Human Feb. 2009 (GRCh37/hg19) Genome at UCSC
UCSC hg18 Human Mar. 2006 (NCBI36/hg18) Genome at UCSC
...
UCSC danRer4 Zebrafish Mar. 2006 (Zv6/danRer4) Genome at UCSC
UCSC danRer3 Zebrafish May 2005 (Zv5/danRer3) Genome at UCSC
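If the listing needs to be processed in a script, its lines can be split into fields. This is a rough sketch that assumes three tab-separated columns (provider, name, description), as the "\t".join(row) call in the Python example suggests. The actual separator used by the command line tool is not confirmed here, and parse_listing is a hypothetical helper, not part of genomepy.

```python
def parse_listing(text):
    """Split listing lines into (provider, name, description) tuples."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line == "...":
            continue  # skip blank lines and the elision marker
        fields = line.split("\t", 2)  # assumption: columns are tab-separated
        if len(fields) == 3:
            rows.append(tuple(fields))
    return rows


# Sample input built from two of the rows shown above.
sample = (
    "UCSC\thg38\tHuman Dec. 2013 (GRCh38/hg38) Genome at UCSC\n"
    "UCSC\tdanRer4\tZebrafish Mar. 2006 (Zv6/danRer4) Genome at UCSC\n"
)
for provider, name, description in parse_listing(sample):
    print(name, "->", description)
```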
Install a genome.
$ genomepy install hg38 UCSC /data/genomes/
downloading...
done...
name: hg38
fasta: /data/genomes/hg38/hg38.fa
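The install output suggests that the FASTA file ends up at <genome dir>/<name>/<name>.fa. A small sketch for building that path in a script follows. The layout is inferred only from the output shown above and may differ for other providers or versions, and expected_fasta_path is a hypothetical helper, not part of genomepy.

```python
from pathlib import PurePosixPath


def expected_fasta_path(genome_dir, name):
    """Build <genome_dir>/<name>/<name>.fa, the layout seen in the install output."""
    return PurePosixPath(genome_dir) / name / (name + ".fa")


# Hypothetical usage with the values from the install example.
print(expected_fasta_path("/data/genomes/", "hg38"))
```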
Todo
Tests!
Ensembl bacteria
Automatic indexing (such as bwa)
Caching of UCSC/Ensembl genome listings
Configurable default genome installation directory
License
This module is licensed under the terms of the MIT license.
Download files
Source Distribution
genomepy-0.0.1.tar.gz (6.0 kB)
File details
Details for the file genomepy-0.0.1.tar.gz.
File metadata
- Download URL: genomepy-0.0.1.tar.gz
- Upload date:
- Size: 6.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
File hashes
Algorithm | Hash digest
---|---
SHA256 | 9383ef4d2d7b7c008bf4150674c6058efa434235e4c08f2f1b201b888e424389
MD5 | a2f0dfefb1288aa201b23c7d4e2b0945
BLAKE2b-256 | c049e4aa0b64eda667bbd4c6fb3d0b6c6fb0d4a0ba03ceda3a4409a60f76d41b