decode a one-hot numpy array to biological sequences
Description
onehot2seq is a command-line tool that decodes a one-hot NumPy array into DNA, RNA, or protein sequences.
To encode sequences into a one-hot NumPy array, use seq2onehot: https://github.com/akikuno/seq2onehot
Installation
You can install onehot2seq using pip:
pip install onehot2seq
Usage
onehot2seq [options] -t/--type <dna/rna/protein> -i/--input <in.npy> -o/--output <out.txt/fasta>
Options
-a/--ambiguous: include ambiguous characters
-f/--format <txt/fasta>: output format, plain text or FASTA (default: txt)
The ambiguous characters are:
- XBZJ for amino acids
- NVHDBMRWSYK for DNA and RNA
The ambiguous characters are described in detail here:
https://meme-suite.org/meme/doc/alphabets.html
The header IDs in FASTA output are sequential numbers (e.g. >seq1, >seq2).
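The sequential header scheme can be illustrated with a short Python sketch. This is not the tool's own code: the helper name `to_fasta` is hypothetical, and the sketch assumes one unwrapped sequence line per record, which the description above does not confirm.

```python
def to_fasta(sequences):
    """Format sequences as FASTA text with sequential headers (>seq1, >seq2, ...)."""
    lines = []
    for i, seq in enumerate(sequences, start=1):
        lines.append(f">seq{i}")  # header IDs count up from 1
        lines.append(seq)         # assumption: sequences are not line-wrapped
    return "\n".join(lines) + "\n"


fasta_text = to_fasta(["ACGT", "TTGA"])
print(fasta_text, end="")
```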
Examples
# Output DNA sequences
onehot2seq -t dna -i example/dna.npy -o dna.txt
# Output DNA sequences in FASTA format
onehot2seq -t dna -f fasta -i example/dna.npy -o dna.fasta
# RNA sequences
onehot2seq -t rna -i example/rna.npy -o rna.txt
# Protein sequences
onehot2seq -t protein -i example/protein.npy -o protein.txt
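If the example files referenced above are not at hand, a small input file can be built with NumPy. The following is a minimal sketch under assumptions: the array axes are taken to be (reads, positions, letters) with letters in ACGT order, the file name toy_dna.npy is hypothetical, and the dtype the tool expects is not documented here, so uint8 is only a guess.

```python
import os
import tempfile

import numpy as np

# Two toy reads, written as indices into the assumed letter order "ACGT".
indices = np.array([[0, 1, 2, 3],   # ACGT
                    [3, 3, 2, 0]])  # TTGA

# Indexing an identity matrix turns each index into a one-hot row.
onehot = np.eye(4, dtype=np.uint8)[indices]  # shape: (2 reads, 4 positions, 4 letters)

# Hypothetical file name, saved in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "toy_dna.npy")
np.save(path, onehot)

# Load the file back to confirm that the 3D shape survived the round trip.
loaded = np.load(path)
print(loaded.shape)
```

A file saved this way could then be passed to the -i option in the same way as example/dna.npy.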
One-hot array
The input file must contain a 3D one-hot array of shape RxNxL (Read x Nucleotide/Amino acid x Letter).
- The order of nucleotides is ACGT (+NVHDBMRWSYK) for DNA and ACGU (+NVHDBMRWSYK) for RNA
- The order of amino acids is ACDEFGHIKLMNPQRSTVWY (+XBZJ)
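The decoding that this layout implies can be sketched in Python with NumPy. This is an illustration, not the tool's implementation: the names `decode_onehot`, `ALPHABETS` and `AMBIGUOUS` are hypothetical, and the sketch assumes the last axis indexes the letters in the orders listed above.

```python
import numpy as np

# Letter orders as listed above; ambiguous letters are appended on request.
ALPHABETS = {
    "dna": "ACGT",
    "rna": "ACGU",
    "protein": "ACDEFGHIKLMNPQRSTVWY",
}
AMBIGUOUS = {
    "dna": "NVHDBMRWSYK",
    "rna": "NVHDBMRWSYK",
    "protein": "XBZJ",
}


def decode_onehot(array, seq_type="dna", ambiguous=False):
    """Decode a (reads, positions, letters) one-hot array into a list of strings."""
    letters = ALPHABETS[seq_type] + (AMBIGUOUS[seq_type] if ambiguous else "")
    array = np.asarray(array)
    if array.ndim != 3 or array.shape[-1] != len(letters):
        raise ValueError(
            f"expected a 3D array whose last axis has length {len(letters)}, "
            f"got shape {array.shape}"
        )
    # The position of the 1 along the last axis selects the letter.
    # Note: an all-zero position would decode as the first letter.
    indices = array.argmax(axis=-1)
    return ["".join(letters[i] for i in row) for row in indices]


# Two toy reads: ACGT and TTGA.
onehot = np.eye(4)[np.array([[0, 1, 2, 3], [3, 3, 2, 0]])]
decoded = decode_onehot(onehot, seq_type="dna")
print(decoded)
```

With the ambiguous letters enabled, the last axis would need length 15 for DNA or RNA and 24 for protein, matching the extended orders above.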