Simple FASTA file tools to parse, edit, subset, split, and perform stats on FASTA files

FASTA file parsing written in C++ with Python bindings

Installation

pip install pyfastatools

Usage

The pyfastatools.Parser object is the primary API that parses FASTA files and yields pyfastatools.Record objects.

If you have a FASTA file called proteins.faa that looks like this:

>seq_1
MSKFKKIPL
>seq_2
MQSSSKTCN
>seq_3
MEDNMITIY

Then you can parse this file in Python like this:

from pyfastatools import Parser

for record in Parser("proteins.faa"):
    print(record.header.name, record.seq)

which will print:

seq_1 MSKFKKIPL
seq_2 MQSSSKTCN
seq_3 MEDNMITIY

API

The library's API is small and can be summarized in a few lines:

Parser

This is the main class and covers most use cases. It produces Record objects as it parses a FASTA file. Only the name of the FASTA file is required:

pyfastatools.Parser("my_fasta.fasta")

The parser will attempt to auto-detect the RecordType of the file by checking the input file extension and the first 5 sequences.

However, the record type can optionally be specified:

pyfastatools.Parser("my_fasta.fasta", pyfastatools.RecordType.PROTEIN)
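
To make the auto-detection idea concrete, here is a minimal pure-Python sketch of one way such a heuristic could work: check the file extension first, then inspect the first 5 sequences. This is not the library's implementation. The extension sets, the "NUCLEOTIDE" and "UNKNOWN" labels, and the helper names iter_sequences and guess_record_type are all assumptions made for illustration.

```python
from itertools import islice
from pathlib import Path

# Hypothetical extension hints; the library's real rules may differ.
PROTEIN_EXTS = {".faa"}
NUCLEOTIDE_EXTS = {".fna", ".ffn"}
NUCLEOTIDE_CHARS = set("ACGTUN")


def iter_sequences(lines):
    """Yield each record's sequence from FASTA-formatted lines."""
    chunks = None
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if chunks is not None:
                yield "".join(chunks)
            chunks = []
        elif line and chunks is not None:
            chunks.append(line)
    if chunks is not None:
        yield "".join(chunks)


def guess_record_type(filename, lines, n=5):
    """Guess the record type from the extension, then from the first n sequences."""
    ext = Path(filename).suffix.lower()
    if ext in PROTEIN_EXTS:
        return "PROTEIN"
    if ext in NUCLEOTIDE_EXTS:
        return "NUCLEOTIDE"
    # The extension was not informative, so sample the sequence content.
    sample = list(islice(iter_sequences(lines), n))
    if not sample:
        return "UNKNOWN"
    if all(set(seq.upper()) <= NUCLEOTIDE_CHARS for seq in sample):
        return "NUCLEOTIDE"
    return "PROTEIN"
```

A content-based guess like this can be wrong for short or unusual sequences, which is one reason passing the record type explicitly is useful.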

The parser can be iterated over to yield one Record at a time:

parser = pyfastatools.Parser("my_fasta.fasta")
for record in parser:
    ...

Methods

There are also other convenience methods:

  • all - Read all records into a list-like object.
  • take - Take up to n records into a list-like object.
  • filter - Keep/exclude sequences based on the sequence name.
  • remove_stops - Yield sequences without a * stop codon character if the sequences are proteins.
  • clean_header - Yield sequences with the description removed from each header.
  • headers - Yield Header objects only without parsing the sequence itself.
  • all_headers - Return all headers into a list-like object.
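
The method signatures are not listed here, so the following is a pure-Python sketch of what take and filter mean, rather than a call into pyfastatools itself. The helpers iter_records, take, and filter_by_name, and the keep flag, are hypothetical stand-ins; records are modeled as plain (name, desc, seq) tuples.

```python
from itertools import islice


def iter_records(lines):
    """Yield (name, desc, seq) tuples from FASTA-formatted lines."""
    name, desc, chunks = None, "", []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if name is not None:
                yield name, desc, "".join(chunks)
            # Split the header into the name and an optional description.
            name, _, desc = line[1:].partition(" ")
            chunks = []
        elif line:
            chunks.append(line.strip())
    if name is not None:
        yield name, desc, "".join(chunks)


def take(records, n):
    """Collect up to n records into a list."""
    return list(islice(records, n))


def filter_by_name(records, names, keep=True):
    """Keep (or, with keep=False, exclude) records whose name is in names."""
    names = set(names)
    return (rec for rec in records if (rec[0] in names) == keep)
```

Because iter_records is a generator, take(iter_records(lines), 2) stops reading after the second record, which is the main benefit of a take-style method over reading everything with all.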

Properties

  • num_records - Returns the number of sequences in the FASTA file. The value is cached after the first access. Note: This can also be computed using len(parser).
  • format - Returns the RecordType enum that corresponds to the FASTA file's record type
  • extension - Returns the file extension based on the format
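
The caching behavior of num_records can be pictured with a small pure-Python sketch. RecordCounter and its scans attribute are hypothetical and exist only to show that the count is computed once and then reused, including through len(); how the library caches internally is not described here.

```python
from functools import cached_property


class RecordCounter:
    """Counts FASTA records lazily and remembers the answer."""

    def __init__(self, lines):
        self._lines = list(lines)
        self.scans = 0  # how many times the data was actually scanned

    @cached_property
    def num_records(self):
        self.scans += 1
        # Each record starts with a '>' header line.
        return sum(1 for line in self._lines if line.startswith(">"))

    def __len__(self):
        # len(counter) reuses the cached property.
        return self.num_records
```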

Record

A single FASTA record. It has the following fields:

  • header - A Header object that has the fields name and desc
  • seq - A str storing the entire sequence

Methods

  • empty - Checks if the Header and sequence are empty
  • clear - Sets the Header and sequence to empty strings
  • to_string - Returns the record as a string representation identical to what was parsed from the file
  • clean_header - Sets the Header description to an empty string
  • remove_stops - Removes * stop codon characters from the sequence if they are present
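
As a rough mental model of the fields and methods listed above, here is a pure-Python stand-in. SketchHeader and SketchRecord are hypothetical names and are not the library's classes. In particular, this to_string writes the sequence on a single line, which is an assumption about the output layout.

```python
from dataclasses import dataclass, field


@dataclass
class SketchHeader:
    name: str = ""
    desc: str = ""


@dataclass
class SketchRecord:
    header: SketchHeader = field(default_factory=SketchHeader)
    seq: str = ""

    def empty(self):
        """True when the header and the sequence are all empty."""
        return not (self.header.name or self.header.desc or self.seq)

    def clear(self):
        """Reset the header and sequence to empty strings."""
        self.header = SketchHeader()
        self.seq = ""

    def clean_header(self):
        """Drop the description, keeping only the name."""
        self.header.desc = ""

    def remove_stops(self):
        """Remove '*' stop codon characters from the sequence."""
        self.seq = self.seq.replace("*", "")

    def to_string(self):
        """Format the record as FASTA text with a single sequence line."""
        title = f"{self.header.name} {self.header.desc}".rstrip()
        return f">{title}\n{self.seq}\n"
```

With this model, a record built from the name seq_1 and the sequence MSKFKKIPL* ends up with seq equal to MSKFKKIPL after remove_stops.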
