Package to load genes from GENCODE GTF files

GENCODEGenes

This package loads genes from GENCODE GTF/GFF files, groups transcripts by gene, and provides methods for transcripts, so you can find exon coordinates, CDS distances and sequences.

Install

pip install gencodegenes

Usage

from gencodegenes import Gencode

gencode = Gencode(GTF_PATH)  # GTF_PATH is the path to a GENCODE GTF file
# full signature: Gencode(gtf_path, fasta_path=None, coding_only=True)
#  - fasta_path: path to a fasta file, needed to get transcripts with sequence
#  - coding_only: pass False to include all transcripts, not just protein coding

# get gene by HGNC symbol
gene = gencode['OR5A1']
transcripts = gene.transcripts
canonical = gene.canonical  # picks the MANE transcript if available; otherwise
                            # the one tagged appris_principal (longest CDS if
                            # several are tagged); otherwise the longest
                            # protein-coding transcript; otherwise the
                            # longest cDNA
gene.start, gene.end, gene.chrom, gene.strand, gene.symbol # other attributes available


# find gene nearest a genomic position, or overlapping a genomic region
gencode.nearest('chr1', 1000000)
gencode.in_region('chr1', 1000000, 2000000)

# transcripts have a number of methods (pos is a genomic position)
tx = gene.canonical
tx.in_exons(pos)                         # check if pos is in an exon
tx.in_coding_region(pos)                 # check if pos is in the CDS
tx.get_coding_distance(pos)              # get distance from the CDS start, measured along the CDS
tx.get_closest_exon(pos)                 # find exon closest to position
tx.get_position_on_chrom(cds_pos)        # convert CDS pos to genomic pos
tx.get_codon_info(pos)                   # get info about codon for a site
tx.get_codon_number_for_cds_pos(cds_pos) # convert CDS pos to codon number
tx.translate(seq)                        # translate DNA to amino acids (if Gencode was opened with fasta)

# the transcript also has associated data fields
tx.name         # transcript ID
tx.chrom        # transcript chromosome
tx.start        # transcript start (TSS)
tx.end          # transcript end
tx.cds_start    # CDS start position
tx.cds_end      # CDS end position 
tx.type         # transcript type e.g. protein_coding
tx.strand       # strand (+ or -)
tx.exons        # list of exon coordinates
tx.cds          # list of CDS coordinates
tx.cds_sequence # CDS sequence (if Gencode was opened with fasta)
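
The coordinate conversions listed above (genomic position to CDS distance, CDS position back to genomic position, CDS position to codon number) can be illustrated without the package. The following is a minimal sketch, not the library's implementation: it assumes a plus-strand transcript, inclusive coordinates, and 0-based CDS offsets, and the segment values and function names are made up for illustration.

```python
from typing import List, Optional, Tuple

# Hypothetical CDS segments: inclusive (start, end) genomic coordinates,
# sorted by position, on the plus strand. Values are invented.
CDS: List[Tuple[int, int]] = [(100, 109), (200, 214), (300, 304)]


def coding_distance(cds: List[Tuple[int, int]], pos: int) -> Optional[int]:
    """Return the 0-based offset of pos from the CDS start, counting only
    coding bases, or None if pos is not inside any CDS segment."""
    offset = 0
    for start, end in cds:
        if start <= pos <= end:
            return offset + (pos - start)
        offset += end - start + 1  # skip the whole segment
    return None


def position_on_chrom(cds: List[Tuple[int, int]], cds_pos: int) -> int:
    """Map a 0-based CDS offset back to a genomic position."""
    if cds_pos < 0:
        raise ValueError("cds_pos must be non-negative")
    remaining = cds_pos
    for start, end in cds:
        length = end - start + 1
        if remaining < length:
            return start + remaining
        remaining -= length
    raise ValueError("cds_pos is beyond the end of the CDS")


def codon_number(cds_pos: int) -> int:
    """Return the 0-based codon index containing a 0-based CDS offset."""
    return cds_pos // 3


print(coding_distance(CDS, 200))    # 10: first base of the second segment
print(position_on_chrom(CDS, 10))   # 200: the inverse mapping
print(codon_number(10))             # 3
```

A minus-strand transcript would walk the segments in reverse and count down from each segment end. Whether the package reports 0-based or 1-based offsets is not stated here, so check its output before relying on this convention.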

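The priority order that gene.canonical is described as following (MANE, then appris_principal, then longest protein coding, then longest cDNA) can be written as a short fallback chain. This is a sketch under stated assumptions rather than the package's code: the Tx record, its fields, and pick_canonical are hypothetical, the tag strings follow GENCODE's MANE_Select and appris_principal_N naming, and "longest protein coding" is assumed to mean longest CDS.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Tx:
    """Hypothetical transcript record, used only for this illustration."""
    name: str
    type: str
    cds_length: int
    cdna_length: int
    tags: Set[str] = field(default_factory=set)


def pick_canonical(transcripts: List[Tx]) -> Tx:
    """Pick one transcript using the fallback order described above."""
    if not transcripts:
        raise ValueError("no transcripts to choose from")

    # 1. a MANE transcript, if any is tagged
    mane = [t for t in transcripts if "MANE_Select" in t.tags]
    if mane:
        return mane[0]

    # 2. appris_principal transcripts, longest CDS if several
    appris = [t for t in transcripts
              if any(tag.startswith("appris_principal") for tag in t.tags)]
    if appris:
        return max(appris, key=lambda t: t.cds_length)

    # 3. longest protein-coding transcript (assumed: by CDS length)
    coding = [t for t in transcripts if t.type == "protein_coding"]
    if coding:
        return max(coding, key=lambda t: t.cds_length)

    # 4. otherwise the longest cDNA
    return max(transcripts, key=lambda t: t.cdna_length)


txs = [
    Tx("tx_a", "protein_coding", 900, 1500),
    Tx("tx_b", "protein_coding", 1200, 1400, {"appris_principal_1"}),
    Tx("tx_c", "protein_coding", 600, 2000, {"MANE_Select"}),
]
print(pick_canonical(txs).name)      # tx_c: the MANE tag wins
print(pick_canonical(txs[:2]).name)  # tx_b: appris_principal is next
```

Writing each rule as its own filter keeps the order explicit, so a rule can be reordered or dropped without touching the others.
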

Download files

Download the file for your platform.

Source Distribution

gencodegenes-1.1.4.tar.gz (314.1 kB)

Built Distributions

gencodegenes-1.1.4-cp312-cp312-win_amd64.whl (548.8 kB), CPython 3.12, Windows x86-64
gencodegenes-1.1.4-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.0 MB), CPython 3.12, manylinux: glibc 2.17+ x86-64
gencodegenes-1.1.4-cp312-cp312-macosx_11_0_arm64.whl (552.2 kB), CPython 3.12, macOS 11.0+ ARM64
gencodegenes-1.1.4-cp311-cp311-win_amd64.whl (548.5 kB), CPython 3.11, Windows x86-64
gencodegenes-1.1.4-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.0 MB), CPython 3.11, manylinux: glibc 2.17+ x86-64
gencodegenes-1.1.4-cp311-cp311-macosx_11_0_arm64.whl (551.8 kB), CPython 3.11, macOS 11.0+ ARM64
gencodegenes-1.1.4-cp310-cp310-win_amd64.whl (547.8 kB), CPython 3.10, Windows x86-64
gencodegenes-1.1.4-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.0 MB), CPython 3.10, manylinux: glibc 2.17+ x86-64
gencodegenes-1.1.4-cp310-cp310-macosx_11_0_arm64.whl (551.5 kB), CPython 3.10, macOS 11.0+ ARM64
gencodegenes-1.1.4-cp39-cp39-win_amd64.whl (548.0 kB), CPython 3.9, Windows x86-64
gencodegenes-1.1.4-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.0 MB), CPython 3.9, manylinux: glibc 2.17+ x86-64
gencodegenes-1.1.4-cp39-cp39-macosx_11_0_arm64.whl (552.2 kB), CPython 3.9, macOS 11.0+ ARM64
gencodegenes-1.1.4-cp38-cp38-win_amd64.whl (548.4 kB), CPython 3.8, Windows x86-64
gencodegenes-1.1.4-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.0 MB), CPython 3.8, manylinux: glibc 2.17+ x86-64
gencodegenes-1.1.4-cp38-cp38-macosx_11_0_arm64.whl (552.8 kB), CPython 3.8, macOS 11.0+ ARM64