Sanitise protein FASTA files / data
Project description
tidyfasta
A Python program to tidy and sanitise FASTA sequence files
Problems and fixes
Problem | Fix |
---|---|
Sequence without ID | ID name added |
ID without sequence | Exception raised |
Multiline sequence | One line per sequence |
Non-canonical AA | Exception raised |
Dangerous characters in ID | Exception raised |
Lowercase AA | Converts to uppercase AA |
Excessive Whitespace | Removes excessive whitespace |
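The rules in the table can be illustrated with a minimal Python sketch. This is not tidyfasta's actual implementation: the function name `tidy_fasta`, the placeholder ID format, the 20-letter canonical alphabet, and the set of characters treated as safe in an ID are all assumptions made for illustration.

```python
import re

# Assumption: "canonical" means the 20 standard amino acids.
CANONICAL_AA = set("ACDEFGHIKLMNPQRSTVWY")
# Assumption: any ID character outside this set counts as "dangerous".
SAFE_ID = re.compile(r"[A-Za-z0-9_.|:\- ]+")


def tidy_fasta(text, default_id="sequence"):
    """Return a tidied FASTA string; raise ValueError for unfixable problems."""
    records = []  # each entry is (sequence ID, list of sequence chunks)
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # drop blank lines
        if line.startswith(">"):
            # Collapse runs of whitespace inside the ID.
            seq_id = " ".join(line[1:].split())
            if not seq_id:
                # Header with no ID: add a placeholder ID (hypothetical format).
                seq_id = f"{default_id}_{len(records) + 1}"
            records.append((seq_id, []))
        else:
            if not records:
                # Sequence with no header at all: add a placeholder ID.
                records.append((f"{default_id}_1", []))
            # Remove whitespace inside the sequence and convert to uppercase.
            records[-1][1].append("".join(line.split()).upper())

    out = []
    for seq_id, chunks in records:
        seq = "".join(chunks)  # multiline sequence becomes one line
        if not seq:
            raise ValueError(f"ID without sequence: {seq_id!r}")
        if not SAFE_ID.fullmatch(seq_id):
            raise ValueError(f"Dangerous characters in ID: {seq_id!r}")
        bad = sorted(set(seq) - CANONICAL_AA)
        if bad:
            raise ValueError(f"Non-canonical AA in {seq_id!r}: {''.join(bad)}")
        out.append(f">{seq_id}\n{seq}\n")
    return "".join(out)
```

For example, `tidy_fasta(">p1\nacd\nEF g\n")` joins the two sequence lines, strips the inner space and uppercases the residues, while an input such as `">p1\n"` raises `ValueError` because the ID has no sequence.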
Usage
tidyfasta.py --input file.FASTA
tidyfasta.py --input file.FASTA --single
tidyfasta.py --input file.FASTA --single --strict
Output
- Tidied version of original file
Download files
Source Distribution
tidyfasta-1.0.0.tar.gz
(3.4 kB)
Built Distribution
Hashes for tidyfasta-1.0.0-py3-none-any.whl
Algorithm | Hash digest |
---|---|
SHA256 | 65247bca13fc0253e86165970bdd2deb385a47cbc9c4c38293028d4f3209eca2 |
MD5 | 450b83800375481cad2bbdff0bc62674 |
BLAKE2b-256 | 51a1ce8f0a26cbcae6bf1b0f2702ef75fabb2d6fe2f4ecfc62b83f26097a6bd2 |