Nucleic acid or protein sequence to PNG converter
Project description
fasta2png
This package includes two programs:
- fna2png: generates PNG images from nucleic acid (na) / nucleotide sequences in FASTA format, representing each nucleotide base with a different color.
- faa2png: generates PNG images from amino acid (aa) / protein sequences in FASTA format, representing each amino acid code with a different color.
Both programs scan the sequence and draw a small rectangle (size configurable with --pixel-size) for each nucleotide base or each amino acid code, filling the image from top-left to bottom-right. The aspect ratio of the PNG is also configurable (with --aspect-ratio). The PNG image is in RGBA format.
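The layout described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the package's actual code: the helper names grid_size and cell_box are hypothetical, and the rounding rule (round the column count up, then add as many rows as needed) is one plausible way to honor an aspect ratio such as 3:2.

```python
import math


def grid_size(seqlen, aspect=(3, 2)):
    """Return (cols, rows) with cols/rows close to the aspect ratio
    and cols * rows >= seqlen. Hypothetical helper, not part of fasta2png."""
    aw, ah = aspect
    # With cols / rows = aw / ah and cols * rows = seqlen,
    # cols = sqrt(seqlen * aw / ah); round up so every residue fits.
    cols = max(1, math.ceil(math.sqrt(seqlen * aw / ah)))
    rows = max(1, math.ceil(seqlen / cols))
    return cols, rows


def cell_box(index, cols, pixel_size):
    """Pixel rectangle (left, top, right, bottom) for the index-th residue,
    assuming rows are filled left to right, top to bottom."""
    row, col = divmod(index, cols)
    left, top = col * pixel_size, row * pixel_size
    return left, top, left + pixel_size, top + pixel_size


cols, rows = grid_size(29903, aspect=(3, 2))
print(cols, rows)                 # grid dimensions in cells
print(cell_box(0, cols, 8))       # first residue: top-left corner
print(cell_box(cols, cols, 8))    # first residue of the second row
```

Cells beyond the end of the sequence in the last row would be left in the background color.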
For nucleotide sequences, A, C, G and T are each painted in a different color (U is treated the same as T), and all other codes (N and others) are painted white. The background of the image (the area left over after the sequence) is painted black. These colors are also configurable.
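The color rules for nucleotides can be pictured as a small lookup table. In the sketch below, the four A/C/G/T colors are placeholders chosen for illustration and are not fna2png's real defaults; only white for other codes, black for the background, and the U-as-T rule come from the description above. Handling lowercase input by upper-casing it is also an assumption.

```python
# Hypothetical RGBA palette: the A/C/G/T values are placeholders,
# NOT the actual defaults used by fna2png.
BASE_COLORS = {
    "A": (0, 200, 0, 255),
    "C": (0, 0, 255, 255),
    "G": (255, 165, 0, 255),
    "T": (255, 0, 0, 255),
}
OTHER = (255, 255, 255, 255)    # N and any other code: white
BACKGROUND = (0, 0, 0, 255)     # unused area of the image: black


def base_color(code):
    """Map one nucleotide code to an RGBA tuple."""
    code = code.upper()         # assumption: lowercase is treated like uppercase
    if code == "U":             # U is painted the same as T
        code = "T"
    return BASE_COLORS.get(code, OTHER)


print([base_color(c) for c in "ACGUN"])
```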
For protein sequences, each amino acid code is painted in a different color. The gap (-) is painted the same as the background. Only the background color is configurable, because there are so many (27) codes.
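One way to obtain a distinct color for each of 27 codes is to spread hues evenly around the color wheel. The sketch below is an illustration only: it assumes the 27 codes are the 26 letters plus '*', which the description above does not spell out, and the generated colors are not faa2png's built-in palette. Painting unknown codes in the background color is likewise an assumption.

```python
import colorsys
import string

# Assumption: the 27 codes are the 26 letters plus '*' (stop).
CODES = string.ascii_uppercase + "*"
BACKGROUND = (0, 0, 0, 255)


def make_palette(codes=CODES):
    """Assign each code an RGBA color with an evenly spaced hue."""
    palette = {}
    for i, code in enumerate(codes):
        r, g, b = colorsys.hsv_to_rgb(i / len(codes), 0.8, 1.0)
        palette[code] = (round(r * 255), round(g * 255), round(b * 255), 255)
    return palette


PALETTE = make_palette()


def residue_color(code, background=BACKGROUND):
    """Map one amino acid code to RGBA; the gap '-' uses the background."""
    if code == "-":
        return background
    # Assumption: codes outside the palette also fall back to the background.
    return PALETTE.get(code.upper(), background)


print(len(PALETTE), residue_color("M"), residue_color("-"))
```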
Installation
pip install fasta2png
Usage: fna2png
fna2png --input <fna_input_in_fasta_format> --output <output_filename_of_png>
There are various options to customize the PNG output; run fna2png --help for more info.
Usage: faa2png
faa2png --input <faa_input_in_fasta_format> --output <output_filename_of_png>
There are some options to customize the PNG output; run faa2png --help for more info.
Example: fna2png
NC_045512.2 is the complete genome of SARS-CoV-2 (severe acute respiratory syndrome coronavirus 2), sequenced by Chinese researchers in January 2020.
The NC_045512.2.fna file used below was downloaded from https://www.ncbi.nlm.nih.gov/nuccore/NC_045512.2?report=fasta&log$=seqview&format=text.
$ fna2png --input NC_045512.2.fna --output NC_045512.2.png --pixel-size 8 --aspect-ratio 3 2
seqdesc: NC_045512.2 Severe acute respiratory syndrome coronavirus 2 isolate Wuhan-Hu-1, complete genome
seqlen: 29903
Example: faa2png
This example uses the same SARS-CoV-2 genome, but takes the sequence of the protein encoded by ORF1ab, the first gene in the genome.
The YP_009724389.1.faa file used below was downloaded from https://www.ncbi.nlm.nih.gov/protein/YP_009724389.1?report=fasta&log$=seqview&format=text.
$ faa2png --input YP_009724389.1.faa --output YP_009724389.1.faa.png --pixel-size 4 --aspect-ratio 3 2
seqdesc: YP_009724389.1 orf1ab polyprotein [Severe acute respiratory syndrome coronavirus 2]
seqlen: 7096
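The seqdesc and seqlen lines printed in both examples correspond to the FASTA header (without the leading ">") and the number of sequence characters. A minimal sketch of how these two values can be derived from a FASTA file follows; read_fasta is a hypothetical helper, not the package's parser, and reading only the first record of a multi-record file is an assumption.

```python
def read_fasta(lines):
    """Return (description, sequence) for the first record in a FASTA input.

    `lines` is any iterable of text lines, for example an open file."""
    desc, parts = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue                    # skip blank lines
        if line.startswith(">"):
            if desc is not None:
                break                   # assumption: ignore later records
            desc = line[1:].strip()     # header text without the '>'
        else:
            parts.append(line)          # sequence may span many lines
    return desc, "".join(parts)


sample = [">demo example sequence", "ACGT", "NNAC", ""]
seqdesc, seq = read_fasta(sample)
print("seqdesc:", seqdesc)
print("seqlen:", len(seq))
```

With a real file, the same function could be called as read_fasta(open("NC_045512.2.fna")).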
Changes
- v3: Pillow updated to v8.0.1.