Decode a one-hot NumPy array to biological sequences
Description
onehot2seq is a command-line tool that decodes a one-hot numpy array into DNA, RNA, or protein sequences.
To encode sequences to a one-hot numpy array, use seq2onehot.
https://github.com/akikuno/seq2onehot
Installation
You can install onehot2seq using pip:
pip install onehot2seq
Usage
onehot2seq [options] -t/--type <dna/rna/protein> -i/--input <in.npy> -o/--output <out.txt/fasta>
Options
-a/--ambiguous: include ambiguous characters
-f/--format <txt/fasta>: output format, either plain text or FASTA (default: txt)
The ambiguous characters are:
- XBZJ for amino acids
- NVHDBMRWSYK for DNA and RNA
The ambiguous characters are described in detail here:
https://meme-suite.org/meme/doc/alphabets.html
The header IDs of FASTA format are sequential numbers (e.g. >seq1, >seq2)
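The header scheme above can be sketched in a few lines of Python. This is only an illustration of the naming pattern (>seq1, >seq2, ...), not the tool's actual code; the function name to_fasta is hypothetical.

```python
def to_fasta(sequences):
    """Format sequences as FASTA records with sequential header IDs.

    Hypothetical helper that mirrors the documented header pattern
    (>seq1, >seq2, ...); it is not part of onehot2seq.
    """
    return "".join(
        f">seq{number}\n{seq}\n"
        for number, seq in enumerate(sequences, start=1)
    )


print(to_fasta(["ACGT", "GGCC"]), end="")
# >seq1
# ACGT
# >seq2
# GGCC
```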
Examples
# Output DNA sequences
onehot2seq -t dna -i example/dna.npy -o dna.txt
# Output DNA sequences in FASTA format
onehot2seq -t dna -f fasta -i example/dna.npy -o dna.fasta
# RNA sequences
onehot2seq -t rna -i example/rna.npy -o rna.txt
# Protein sequences
onehot2seq -t protein -i example/protein.npy -o protein.txt
One-hot array
The input file must contain a 3D one-hot array of shape RxNxL (Read x Nucleotide/Amino acid x Letter).
- The order of nucleotides is ACGT(+NVHDBMRWSYK) for DNA and ACGU(+NVHDBMRWSYK) for RNA.
- The order of amino acids is ACDEFGHIKLMNPQRSTVWY(+XBZJ).
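To make the array layout concrete, here is a minimal numpy sketch of building and decoding a one-hot array for DNA. It rests on an assumption that the description above does not spell out: axis 0 indexes reads, axis 1 indexes positions within a read, and axis 2 indexes the alphabet in the order ACGT. The helper names encode and decode are hypothetical and are not onehot2seq's API.

```python
import numpy as np

# Base DNA alphabet in the documented order (ambiguous characters omitted).
DNA_LETTERS = "ACGT"


def encode(seqs, letters=DNA_LETTERS):
    """Build a (reads, positions, letters) one-hot array from equal-length sequences.

    The axis order is an assumption made for this sketch.
    """
    index = {ch: i for i, ch in enumerate(letters)}
    arr = np.zeros((len(seqs), len(seqs[0]), len(letters)), dtype=np.uint8)
    for r, seq in enumerate(seqs):
        for p, ch in enumerate(seq):
            arr[r, p, index[ch]] = 1
    return arr


def decode(arr, letters=DNA_LETTERS):
    """Map each position back to the letter whose channel is set."""
    if arr.ndim != 3:
        raise ValueError("expected a 3D one-hot array")
    # argmax over the alphabet axis gives the index of the 1 at each position.
    idx = arr.argmax(axis=2)
    return ["".join(letters[i] for i in row) for row in idx]


onehot = encode(["ACGT", "TTGA"])
print(onehot.shape)    # (2, 4, 4)
print(decode(onehot))  # ['ACGT', 'TTGA']
```

If a file stores the alphabet on axis 1 instead, swapping the last two axes with np.swapaxes(arr, 1, 2) before decoding would adapt this sketch. Passing a longer letters string, such as "ACGT" + "NVHDBMRWSYK", covers the ambiguous characters under the same assumption.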