Simple FASTA file tools to parse, edit, subset, split, and perform stats on FASTA files
FASTA file parsing written in C++ with Python bindings
Installation
pip install pyfastatools
Usage
The pyfastatools.Parser object is the primary API that parses FASTA files and yields pyfastatools.Record objects.
If you have a FASTA file called proteins.faa that looks like this:
>seq_1
MSKFKKIPL
>seq_2
MQSSSKTCN
>seq_3
MEDNMITIY
Then you can parse this file in Python like this:
from pyfastatools import Parser
for record in Parser("proteins.faa"):
    print(record.header.name, record.seq)
which will print:
seq_1 MSKFKKIPL
seq_2 MQSSSKTCN
seq_3 MEDNMITIY
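To make the shape of the yielded records concrete, here is a minimal pure-Python sketch of the same loop. It is an illustration only and not how pyfastatools is implemented (the library parses in C++). `DemoHeader`, `DemoRecord`, and `parse_fasta` are hypothetical names, and splitting the header at the first space into a name and a description is an assumption.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Optional


@dataclass
class DemoHeader:
    name: str  # text after '>' up to the first space (assumed convention)
    desc: str  # remainder of the header line, possibly empty


@dataclass
class DemoRecord:
    header: DemoHeader
    seq: str


def parse_fasta(lines: Iterable[str]) -> Iterator[DemoRecord]:
    """Yield one DemoRecord per '>' header line (illustrative only)."""
    header: Optional[DemoHeader] = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                # A new header closes the previous record
                yield DemoRecord(header, "".join(chunks))
            name, _, desc = line[1:].partition(" ")
            header = DemoHeader(name, desc.strip())
            chunks = []
        elif header is not None:
            chunks.append(line)  # a sequence may wrap over several lines
    if header is not None:
        yield DemoRecord(header, "".join(chunks))


example = """>seq_1
MSKFKKIPL
>seq_2
MQSSSKTCN
>seq_3
MEDNMITIY
"""

records = list(parse_fasta(example.splitlines()))
for record in records:
    print(record.header.name, record.seq)
```

The sketch joins wrapped sequence lines into a single string, which matches the description of `seq` as storing the entire sequence.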
API
This library has a very simple API that can be displayed in a few lines:
Parser
This is the main class and covers most use cases. It parses a FASTA file and produces Record objects. Only the name of a FASTA file is needed:
pyfastatools.Parser("my_fasta.fasta")
The parser will attempt to auto-detect the RecordType of the file by checking the input file extension and the first 5 sequences.
However, the record type can optionally be specified:
pyfastatools.Parser("my_fasta.fasta", pyfastatools.RecordType.PROTEIN)
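The exact detection rules are not documented here. As a rough sketch of what "check the extension, then the first 5 sequences" could look like, the example below uses a hypothetical `GuessedType` enum and `guess_type` function. The extension sets and the nucleotide alphabet are assumptions chosen for illustration, not the library's actual logic or the members of `pyfastatools.RecordType`.

```python
from enum import Enum
from pathlib import Path
from typing import Iterable


class GuessedType(Enum):
    # Hypothetical stand-in; not the members of pyfastatools.RecordType
    PROTEIN = "protein"
    NUCLEOTIDE = "nucleotide"
    UNKNOWN = "unknown"


# Assumed extension conventions, chosen for illustration
PROTEIN_EXTS = {".faa"}
NUCLEOTIDE_EXTS = {".fna", ".ffn"}
NUCLEOTIDE_ALPHABET = set("ACGTUN")


def guess_type(filename: str, first_seqs: Iterable[str]) -> GuessedType:
    """Guess a record type from the extension, then from sequence content."""
    ext = Path(filename).suffix.lower()
    if ext in PROTEIN_EXTS:
        return GuessedType.PROTEIN
    if ext in NUCLEOTIDE_EXTS:
        return GuessedType.NUCLEOTIDE

    # Ambiguous extension such as .fasta: inspect up to 5 sequences
    seqs = [s.upper() for s in list(first_seqs)[:5] if s]
    if not seqs:
        return GuessedType.UNKNOWN
    if all(set(s) <= NUCLEOTIDE_ALPHABET for s in seqs):
        return GuessedType.NUCLEOTIDE
    return GuessedType.PROTEIN


print(guess_type("proteins.faa", []))
print(guess_type("my_fasta.fasta", ["MSKFKKIPL", "MQSSSKTCN"]))
print(guess_type("my_fasta.fasta", ["ACGTTGCA", "GGNNAC"]))
```

A heuristic like this can misclassify short or unusual sequences, which is one reason to pass the record type explicitly when it is known.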
The parser can be iterated over to yield one Record at a time:
parser = pyfastatools.Parser("my_fasta.fasta")
for record in parser:
    ...
Methods
There are also other convenience methods:
- all: Read all records into a list-like object.
- take: Take up to n records into a list-like object.
- filter: Keep/exclude sequences based on the sequence name.
- remove_stops: Yield sequences without a * stop codon character if the sequences are proteins.
- clean_header: Yield sequences while cleaning the header to not have a description.
- headers: Yield Header objects only without parsing the sequence itself.
- all_headers: Return all headers into a list-like object.
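The signatures of these methods are not shown here, so the sketch below does not call pyfastatools. It illustrates the described behavior of take, filter, and remove_stops on plain `(name, sequence)` tuples. The function names, the `keep` flag, and the tuple representation are assumptions made for this example.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Set, Tuple

Rec = Tuple[str, str]  # (name, sequence) stand-in for a Record


def take(records: Iterable[Rec], n: int) -> List[Rec]:
    """Return up to n records; fewer if the input runs out first."""
    return list(islice(records, n))


def filter_by_name(
    records: Iterable[Rec], names: Set[str], keep: bool = True
) -> Iterator[Rec]:
    """Yield records whose name is in `names` (keep=True) or not in it (keep=False)."""
    for name, seq in records:
        if (name in names) == keep:
            yield name, seq


def remove_stops(records: Iterable[Rec]) -> Iterator[Rec]:
    """Yield records with '*' characters stripped from the sequence."""
    for name, seq in records:
        yield name, seq.replace("*", "")


data = [("seq_1", "MSKFKKIPL*"), ("seq_2", "MQSSSKTCN*"), ("seq_3", "MEDNMITIY*")]

first_two = take(iter(data), 2)
kept = list(filter_by_name(data, {"seq_2"}))
excluded = list(filter_by_name(data, {"seq_2"}, keep=False))
cleaned = list(remove_stops(data))
```

Generators keep memory use flat for large files because each record is processed as it is read, whereas the list-returning variant holds its results in memory.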
Properties
- num_records: Returns the number of sequences in the FASTA file. This is cached after the first time it is called. Note: This can also be computed using len(parser)
- format: Returns the RecordType enum that corresponds to the FASTA file's record type
- extension: Returns the file extension based on the format
Record
A single FASTA record. It has the following fields:
- header: A Header object that has the fields name and desc
- seq: A str storing the entire sequence
Methods
- empty: Checks if the Header and sequence are empty
- clear: Sets the Header and sequence to empty strings
- to_string: Returns the record as a string representation identical to what was parsed from the file
- clean_header: Sets the Header description to an empty string
- remove_stops: Removes * stop codon characters from the sequence if they are present
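The Record methods above can be mirrored with a small pure-Python stand-in. This is a sketch of the described behavior, not the C++-backed class itself. `SketchHeader` and `SketchRecord` are hypothetical names, and `to_string` here assumes a single-line sequence with one space between name and description.

```python
from dataclasses import dataclass, field


@dataclass
class SketchHeader:
    name: str = ""
    desc: str = ""


@dataclass
class SketchRecord:
    header: SketchHeader = field(default_factory=SketchHeader)
    seq: str = ""

    def empty(self) -> bool:
        # True only when the name, description, and sequence are all empty
        return not (self.header.name or self.header.desc or self.seq)

    def clear(self) -> None:
        self.header = SketchHeader()
        self.seq = ""

    def to_string(self) -> str:
        # Assumption: single-line sequence, space-separated name and description
        head = self.header.name
        if self.header.desc:
            head += " " + self.header.desc
        return f">{head}\n{self.seq}\n"

    def clean_header(self) -> None:
        self.header.desc = ""

    def remove_stops(self) -> None:
        self.seq = self.seq.replace("*", "")


rec = SketchRecord(SketchHeader("seq_1", "hypothetical protein"), "MSKFKKIPL*")
rec.remove_stops()   # sequence becomes MSKFKKIPL
rec.clean_header()   # description becomes empty
text = rec.to_string()
rec.clear()
```

In this stand-in the methods mutate the record in place, following the "Sets" and "Removes" wording above.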