A package for retrieving exon data for protein sequences from the NCBI RefSeqGene database

CODX

codx is a Python package that allows retrieval of exon data from the NCBI RefSeqGene database.

Installation

pip install codx

Usage Python Package

The package uses NCBI gene IDs to retrieve exon data from the NCBI RefSeqGene database. If you only have UniProt accession IDs, use the get_geneids_from_uniprot function to map them to the corresponding gene IDs.

Example Usage

Retrieve Gene IDs from UniProt

from codx.components import get_geneids_from_uniprot

# Example UniProt accession IDs
accession_ids = ["P35568", "P05019", "Q99490", "Q8NEJ0", "Q13322", "Q15323"]
gene_ids = get_geneids_from_uniprot(accession_ids)
print(gene_ids)  # Output: Set of gene IDs

Create a Database and Retrieve Gene Data

from codx.components import create_db

# Create a database with gene and exon data from NCBI
db = create_db(["120892"], entrez_email="your@email.com")  # Provide an email address for NCBI API

# Retrieve a gene object using its gene name
gene = db.get_gene("LRRK2")

# Retrieve exon data from the gene object
for exon in gene.blocks:
    print(exon.start, exon.end, exon.sequence)

# Generate all possible ordered combinations of exons
for exon_combination in gene.shuffle_blocks():
    print(exon_combination)
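
The phrase "ordered combinations" can be read in more than one way. The sketch below shows one plausible reading, in which every non-empty subset of exons is kept in its original order. This is an assumption made for illustration: it does not use codx, the helper name ordered_block_combinations is hypothetical, and the actual behaviour of shuffle_blocks may differ.

```python
from itertools import combinations


def ordered_block_combinations(blocks):
    """Yield every non-empty subset of blocks, preserving their original order.

    Assumption: this is one interpretation of "ordered combinations",
    not necessarily what codx's shuffle_blocks does.
    """
    for size in range(1, len(blocks) + 1):
        # combinations() emits items in the order they appear in the input
        yield from combinations(blocks, size)


# Toy exon sequences, used only for illustration
exons = ["ATGGCC", "GATTAC", "TTGTAA"]
for combo in ordered_block_combinations(exons):
    print("".join(combo))
```

Under this reading, a gene with n exons yields 2^n - 1 combinations, so the output grows quickly for genes with many exons.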

Six-Frame Translation of Sequences

from codx.components import three_frame_translation

# Generate six-frame translation of exon combinations
for exon_combination in gene.shuffle_blocks():
    three_frame = three_frame_translation(exon_combination.seq, only_start_with_codons=["ATG"])
    three_frame_complement = three_frame_translation(exon_combination.seq, only_start_with_codons=["ATG"], reverse=True)
    print(three_frame)
    print(three_frame_complement)
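
Six-frame translation reads a nucleotide sequence at three offsets on the forward strand and three on the reverse complement. The following is a minimal, self-contained sketch of that idea using the standard genetic code. It does not use codx, the helper names (reverse_complement, translate, six_frame_translation) are illustrative rather than part of the package's API, and it omits the start-codon filtering shown above.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons; unknown codons become 'X', stops become '*'."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translation(seq):
    """Return a dict mapping frame labels (+1..+3, -1..-3) to protein strings."""
    frames = {}
    for label, strand in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{label}{offset + 1}"] = translate(strand[offset:])
    return frames


for frame, protein in six_frame_translation("ATGGCCTAA").items():
    print(frame, protein)
```

Trailing bases that do not fill a complete codon are dropped in each frame.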

Usage Command Line

In addition to the Python API, the package provides a command-line interface for the same purpose.

CLI Usage

Usage: codx [OPTIONS] IDS

Options:
  -o, --output TEXT              Output file
  -i, --include-intron           Include intron
  -u, --uniprot                  Input is UniProt accession IDs
  -t, --translate                Translate to protein
  -3, --three-frame-translation  Translate to protein in 3 frames
  -6, --six-frame-translation    Translate to protein in 6 frames (3 forward and 3 reverse complement)
  --help                         Show this message and exit.

Example CLI Usage

Retrieve data using UniProt accession IDs:

codx -o output.fasta -u P35568,P05019,Q99490,Q8NEJ0,Q13322,Q15323

Retrieve data using gene IDs:

codx -o output.fasta 1190,120892
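
Both commands above write to output.fasta. Assuming that file is in standard FASTA format, it can be read back with a small parser such as the sketch below. The header layout that codx writes is not documented here, so the record names in the sample are hypothetical, and read_fasta is an illustrative helper, not part of codx.

```python
import io


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text stream."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new record starts; emit the previous one first
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


# Stand-in for open("output.fasta"); the headers are made up for illustration
sample = io.StringIO(">record_1\nATGGCC\nGATTAC\n>record_2\nTTGTAA\n")
records = list(read_fasta(sample))
for name, sequence in records:
    print(name, len(sequence))
```

To read a real file, pass an open file object instead of the StringIO sample, for example `with open("output.fasta") as handle:`.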
