Package to load genes from GENCODE GTF files
GENCODEGenes
This package loads genes from GENCODE GTF/GFF files, groups transcripts by gene, and provides transcript methods for finding exon coordinates, CDS distances, and sequences.
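As a rough illustration of what a "CDS distance" means here, the sketch below maps a genomic position to its offset within the spliced coding segments of a transcript. It is not gencodegenes code: the function name `cds_offset`, the `(start, end)` tuple layout, the 1-based inclusive coordinates and the 0-based offset are all assumptions made for this example, and the package's own conventions may differ.

```python
from typing import List, Optional, Tuple

Segment = Tuple[int, int]  # (start, end), 1-based inclusive; assumed layout


def cds_offset(pos: int, cds: List[Segment], strand: str) -> Optional[int]:
    """Return the 0-based offset of pos within the spliced CDS, or None if
    pos lies outside every coding segment. Hypothetical helper."""
    # Walk segments in transcription order: ascending on '+', descending on '-'.
    ordered = sorted(cds, reverse=(strand == '-'))
    offset = 0
    for start, end in ordered:
        if start <= pos <= end:
            # Distance from the 5' edge of this segment.
            within = pos - start if strand == '+' else end - pos
            return offset + within
        # Position is not in this segment, so skip over its full length.
        offset += end - start + 1
    return None


# Toy transcript with two coding segments separated by an intron.
segments = [(100, 109), (200, 204)]
print(cds_offset(100, segments, '+'))  # first coding base on the plus strand
print(cds_offset(200, segments, '+'))  # first coding base after the intron
print(cds_offset(204, segments, '-'))  # first coding base on the minus strand
print(cds_offset(150, segments, '+'))  # intronic position
```

On the minus strand the segments are walked from the highest coordinate down, so the offset always counts from the 5' end of the coding sequence.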
Install
pip install gencodegenes
Usage
from gencodegenes import Gencode
gencode = Gencode(GTF_PATH)
# full function arguments are Gencode(gtf_path, fasta_path=None, coding_only=True)
# - fasta_path: pass in path to fasta file to get gene transcripts with sequence
# - coding_only: pass in False to include all transcripts, not just protein coding
# get gene by HGNC symbol
gene = gencode['OR5A1']
transcripts = gene.transcripts
canonical = gene.canonical
# canonical transcript selection order:
# 1. the MANE transcript, if available
# 2. otherwise the transcript tagged as appris_principal
#    (the one with the longest CDS if multiple are tagged)
# 3. otherwise the longest protein coding transcript
# 4. otherwise (none protein coding) the longest cDNA
gene.start, gene.end, gene.chrom, gene.strand, gene.symbol # other attributes available
# find gene nearest a genomic position, or overlapping a genomic region
gencode.nearest('chr1', 1000000)
gencode.in_region('chr1', 1000000, 2000000)
# and the transcript has a bunch of methods
tx = gene.canonical
tx.in_exons(pos) # check if pos in exons
tx.in_coding_region(pos) # check if pos in CDS
tx.get_coding_distance(pos) # get distance in CDS to CDS start
tx.get_closest_exon(pos) # find exon closest to position
tx.get_position_on_chrom(cds_pos) # convert CDS pos to genomic pos
tx.get_codon_info(pos) # get info about codon for a site
tx.get_codon_number_for_cds_pos(cds_pos) # convert CDS pos to codon number
tx.translate(seq) # translate DNA to AA (if opened with Fasta)
# the transcript also has associated data fields
tx.name # transcript ID
tx.chrom # transcript chromosome
tx.start # transcript start (TSS)
tx.end # transcript end
tx.cds_start # CDS start position
tx.cds_end # CDS end position
tx.type # transcript type e.g. protein_coding
tx.strand # strand (+ or -)
tx.exons # list of exon coordinates
tx.cds # list of CDS coordinates
tx.cds_sequence # get cDNA sequence (if Gencode was opened with fasta)
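The fallback order described in the comments on `gene.canonical` above can be sketched in plain Python. This is only an illustration of that rule, not the package's implementation: the `Tx` record, the `pick_canonical` function, and the tag prefixes `MANE` and `appris_principal` are assumptions for the example, as is the choice to measure "longest protein coding" by CDS length.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Tx:
    """Hypothetical stand-in for a transcript record; not the package's class."""
    name: str
    type: str          # transcript type, e.g. 'protein_coding'
    cds_length: int    # spliced CDS length in bp
    cdna_length: int   # spliced exon length in bp
    tags: Set[str] = field(default_factory=set)


def has_tag(tx: Tx, prefix: str) -> bool:
    """True if any tag on the transcript starts with the given prefix."""
    return any(tag.startswith(prefix) for tag in tx.tags)


def pick_canonical(transcripts: List[Tx]) -> Tx:
    """Apply the four-step fallback order to a non-empty transcript list."""
    if not transcripts:
        raise ValueError('gene has no transcripts')
    # 1. Prefer a MANE transcript (assumed tag prefix 'MANE').
    mane = [t for t in transcripts if has_tag(t, 'MANE')]
    if mane:
        return mane[0]
    # 2. Otherwise an appris_principal transcript, longest CDS if several.
    principal = [t for t in transcripts if has_tag(t, 'appris_principal')]
    if principal:
        return max(principal, key=lambda t: t.cds_length)
    # 3. Otherwise the longest protein coding transcript
    #    (measured by CDS length here, which is an assumption).
    coding = [t for t in transcripts if t.type == 'protein_coding']
    if coding:
        return max(coding, key=lambda t: t.cds_length)
    # 4. Otherwise the transcript with the longest cDNA.
    return max(transcripts, key=lambda t: t.cdna_length)


# Toy data: two appris_principal transcripts and one untagged, longer one.
txs = [
    Tx('tx_a', 'protein_coding', 900, 1500, {'appris_principal_1'}),
    Tx('tx_b', 'protein_coding', 1200, 1400, {'appris_principal_1'}),
    Tx('tx_c', 'protein_coding', 1500, 2500),
]
with_mane = txs + [Tx('tx_d', 'protein_coding', 600, 1000, {'MANE_Select'})]
noncoding = [Tx('nc1', 'lncRNA', 0, 800), Tx('nc2', 'lncRNA', 0, 1200)]

print(pick_canonical(txs).name)
print(pick_canonical(with_mane).name)
print(pick_canonical(noncoding).name)
```

With real data the same decision is available directly as `gene.canonical`; the sketch only makes the order of the checks explicit.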