Sanitise protein FASTA files / data
Project description
tidyfasta
A Python program to tidy and sanitise FASTA sequence files
Problems and fixes
Problem | Fix |
---|---|
Sequence without ID | ID name added |
ID without sequence | Exception raised |
Multiline sequence | One line per sequence |
Non-canonical AA | Exception raised |
Dangerous characters in ID | Exception raised |
Lowercase AA | Converted to uppercase AA |
Excessive whitespace | Excessive whitespace removed |
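The rules in the table can be sketched in a few lines of Python. This is only an illustrative sketch, not tidyfasta's actual source: the function name `tidy_fasta`, the generated ID pattern `sequence_N`, the set of 20 canonical amino acids, and the allow-list of "safe" ID characters are all assumptions made for the example.

```python
import re

# Assumption: "canonical" means the 20 standard one-letter amino-acid codes.
CANONICAL_AA = frozenset("ACDEFGHIKLMNPQRSTVWY")
# Assumption: an ID is "safe" if it only uses these characters.
SAFE_ID = re.compile(r"[A-Za-z0-9_.:|= \-]+")


def tidy_fasta(text):
    """Return tidied FASTA text; raise ValueError for problems that cannot be fixed."""
    records = []  # each entry is [identifier, list_of_sequence_chunks]
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # blank lines are dropped
        if line.startswith(">"):
            # Collapse runs of whitespace inside the header.
            identifier = " ".join(line[1:].split())
            if not identifier:
                # Header present but empty: add a generated ID (hypothetical naming).
                identifier = f"sequence_{len(records) + 1}"
            records.append([identifier, []])
        else:
            if not records:
                # Sequence appears before any header: add a generated ID.
                records.append([f"sequence_{len(records) + 1}", []])
            # Strip all whitespace from the sequence line and uppercase it.
            records[-1][1].append("".join(line.split()).upper())

    tidied = []
    for identifier, chunks in records:
        sequence = "".join(chunks)  # multiline sequence becomes one line
        if not SAFE_ID.fullmatch(identifier):
            raise ValueError(f"Dangerous characters in ID: {identifier!r}")
        if not sequence:
            raise ValueError(f"ID without sequence: {identifier!r}")
        unknown = sorted(set(sequence) - CANONICAL_AA)
        if unknown:
            raise ValueError(f"Non-canonical AA {unknown} in {identifier!r}")
        tidied.append(f">{identifier}\n{sequence}")
    if not tidied:
        return ""
    return "\n".join(tidied) + "\n"
```

For example, `tidy_fasta("mkv  l\nAA\n>id2\nac\ndE\n")` returns two records, `>sequence_1` with `MKVLAA` and `>id2` with `ACDE`, each sequence on a single line.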
Usage
tidyfasta.py --input file.FASTA
tidyfasta.py --input file.FASTA --single
tidyfasta.py --input file.FASTA --single --strict
Output
- Tidied version of original file
Download files
Source Distribution
tidyfasta-1.0.1.tar.gz (3.4 kB)
Built Distribution
Hashes for tidyfasta-1.0.1-py3-none-any.whl
Algorithm | Hash digest |
---|---|
SHA256 | c8bf636e2f3974b4adb0d64dd148e564b4af79928b462a35e91f5a62a8dc71da |
MD5 | 23b2309baa1d2f621173018236c1f9eb |
BLAKE2b-256 | 37e907537370680d99061c6fa84702829599d66494fc1648d58794133f36b467 |
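
A downloaded wheel can be checked against the SHA256 digest listed above using Python's standard `hashlib` module. The sketch below is a minimal example; the helper names `sha256_of` and `matches_expected` are hypothetical, and the file path in the usage note assumes the wheel was saved in the current directory.

```python
import hashlib

# SHA256 digest listed above for tidyfasta-1.0.1-py3-none-any.whl.
EXPECTED_SHA256 = "c8bf636e2f3974b4adb0d64dd148e564b4af79928b462a35e91f5a62a8dc71da"


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA256 digest equals the expected digest."""
    return sha256_of(path) == expected.lower()
```

Calling `matches_expected("tidyfasta-1.0.1-py3-none-any.whl")` should return `True` only if the local file is byte-for-byte identical to the published wheel.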