Simple tools to parse, edit, subset, split, and compute statistics on FASTA files

FASTA file parsing written in C++ with Python bindings

Installation

pip install pyfastatools

Usage

The pyfastatools.Parser object is the primary API that parses FASTA files and yields pyfastatools.Record objects.

If you have a FASTA file called proteins.faa that looks like this:

>seq_1
MSKFKKIPL
>seq_2
MQSSSKTCN
>seq_3
MEDNMITIY

Then you can parse this file in Python like this:

from pyfastatools import Parser

for record in Parser("proteins.faa"):
    print(record.header.name, record.seq)

which will print:

seq_1 MSKFKKIPL
seq_2 MQSSSKTCN
seq_3 MEDNMITIY

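To try the example end to end, the sketch below writes the three records to disk and reads them back. The helper names write_example and read_pairs are made up for illustration; only Parser, record.header.name, and record.seq come from the library. The import sits inside read_pairs so that the file-writing helper also works where the package is not installed.

```python
from pathlib import Path

# The three records from the usage example.
EXAMPLE_FASTA = (
    ">seq_1\nMSKFKKIPL\n"
    ">seq_2\nMQSSSKTCN\n"
    ">seq_3\nMEDNMITIY\n"
)


def write_example(directory):
    """Write the example records to proteins.faa inside directory; return the path."""
    path = Path(directory) / "proteins.faa"
    path.write_text(EXAMPLE_FASTA)
    return path


def read_pairs(path):
    """Return (name, sequence) tuples for every record in the file."""
    # Imported lazily so write_example() is usable without pyfastatools.
    from pyfastatools import Parser

    return [(record.header.name, record.seq) for record in Parser(str(path))]
```

With the package installed, read_pairs(write_example(some_directory)) should give the three name and sequence pairs from the example.
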
API

The API is small and consists of two main classes, Parser and Record:

Parser

This is the main class and covers most use cases. It parses a FASTA file and produces Record objects. Only the name of the FASTA file is required:

pyfastatools.Parser("my_fasta.fasta")

The parser will attempt to auto-detect the RecordType of the file by checking the input file extension and the first 5 sequences.

However, the record type can optionally be specified:

pyfastatools.Parser("my_fasta.fasta", pyfastatools.RecordType.PROTEIN)
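
The detection logic is not described beyond "file extension plus the first 5 sequences", so the following is only a plain-Python sketch of how such a heuristic could work, not the library's C++ implementation. The function name guess_record_type, the extension table, the string labels, and the alphabet test are all assumptions made for illustration.

```python
from itertools import islice
from pathlib import Path

# Assumed mapping from common FASTA extensions to a record type.
EXTENSION_HINTS = {".faa": "protein", ".fna": "nucleotide", ".ffn": "nucleotide"}

# IUPAC nucleotide codes plus the gap character, for the fallback check.
NUCLEOTIDE_ALPHABET = set("ACGTUNRYSWKMBDHV-")


def guess_record_type(filename, sequences, sample_size=5):
    """Guess 'protein' or 'nucleotide' from the extension, else from a few sequences."""
    hint = EXTENSION_HINTS.get(Path(filename).suffix.lower())
    if hint is not None:
        return hint
    # Extension was uninformative: inspect up to sample_size sequences.
    sample = list(islice(sequences, sample_size))
    if not sample:
        return "unknown"
    if all(set(seq.upper()) <= NUCLEOTIDE_ALPHABET for seq in sample):
        return "nucleotide"
    return "protein"
```

A short protein made up only of letters that are also nucleotide codes would fool the alphabet test, which is one reason to sample several sequences, and a reason to pass the record type explicitly when it is known.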

The parser can be iterated over to yield one Record at a time:

parser = pyfastatools.Parser("my_fasta.fasta")
for record in parser:
    ...

Methods

There are also other convenience methods:

  • all - Read all records into a list-like object.
  • take - Take up to n records into a list-like object.
  • filter - Keep/exclude sequences based on the sequence name.
  • remove_stops - Yield sequences without a * stop codon character if the sequences are proteins.
  • clean_header - Yield sequences while cleaning the header to not have a description.
  • headers - Yield Header objects only without parsing the sequence itself.
  • all_headers - Return all headers as a list-like object.
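
The exact signatures of these methods are not shown here, so rather than guess at them, the sketch below mimics what take and filter are described as doing, using plain Python over (name, sequence) tuples. take_records and filter_by_name are hypothetical stand-ins, not functions from pyfastatools.

```python
from itertools import islice


def take_records(records, n):
    """Return up to n records from an iterable as a list."""
    return list(islice(records, n))


def filter_by_name(records, names, keep=True):
    """Yield records whose name is in names (keep=True) or not in names (keep=False)."""
    wanted = set(names)
    for name, seq in records:
        if (name in wanted) == keep:
            yield name, seq


records = [("seq_1", "MSKFKKIPL"), ("seq_2", "MQSSSKTCN"), ("seq_3", "MEDNMITIY")]

first_two = take_records(records, 2)
only_seq_2 = list(filter_by_name(records, {"seq_2"}))
without_seq_2 = list(filter_by_name(records, {"seq_2"}, keep=False))
```

The real methods operate on the parser itself and may take their arguments in a different form.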

Properties

  • num_records - Returns the number of sequences in the FASTA file. The value is cached after the first access. Note: the same count is available through len(parser)
  • format - Returns the RecordType enum that corresponds to the FASTA file's record type
  • extension - Returns the file extension based on the format
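
The caching behaviour of num_records can be pictured with a small stand-in class. This is a sketch of the pattern only: CountedFasta is a hypothetical name, and counting lines that start with ">" is an assumption about how records might be counted, not a description of the library's internals.

```python
from functools import cached_property


class CountedFasta:
    """Toy stand-in that counts FASTA records once and reuses the result."""

    def __init__(self, lines):
        self._lines = list(lines)
        self.scans = 0  # how many times the data was actually scanned

    @cached_property
    def num_records(self):
        # Runs only on first access; the result is then stored on the instance.
        self.scans += 1
        return sum(1 for line in self._lines if line.startswith(">"))

    def __len__(self):
        # len(obj) reuses the cached count.
        return self.num_records


fasta = CountedFasta([">seq_1", "MSKFKKIPL", ">seq_2", "MQSSSKTCN", ">seq_3", "MEDNMITIY"])
count_a = fasta.num_records
count_b = len(fasta)
```

Both accesses return the same count, and the scans counter shows that the data was scanned only once.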

Record

A single FASTA record. It has the following fields:

  • header - A Header object that has the fields name and desc
  • seq - A str storing the entire sequence

Methods

  • empty - Checks if the Header and sequence are empty
  • clear - Sets the Header and sequence to empty strings
  • to_string - Returns the record as a string representation identical to what was parsed from the file
  • clean_header - Sets the Header description to an empty string
  • remove_stops - Removes * stop codon characters from the sequence if they are present
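
To make the record layout and these methods concrete, here is a pure-Python stand-in that follows the descriptions above. It is not the library's C++-backed class: SketchHeader and SketchRecord are hypothetical names, and details such as the to_string layout (sequence on a single line, a space between name and description) are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class SketchHeader:
    name: str = ""
    desc: str = ""


@dataclass
class SketchRecord:
    header: SketchHeader = field(default_factory=SketchHeader)
    seq: str = ""

    def empty(self):
        """True when name, description, and sequence are all empty."""
        return not (self.header.name or self.header.desc or self.seq)

    def clear(self):
        """Reset header and sequence to empty strings."""
        self.header = SketchHeader()
        self.seq = ""

    def clean_header(self):
        """Drop the description, keeping only the name."""
        self.header.desc = ""

    def remove_stops(self):
        """Remove every '*' stop character from the sequence."""
        self.seq = self.seq.replace("*", "")

    def to_string(self):
        """Render as FASTA text: '>' plus the header line, then the sequence."""
        title = f"{self.header.name} {self.header.desc}".rstrip()
        return f">{title}\n{self.seq}\n"


record = SketchRecord(SketchHeader("seq_1", "hypothetical protein"), "MSKFKKIPL*")
record.remove_stops()
record.clean_header()
text = record.to_string()
```

After the two calls, the stand-in record holds the sequence without the trailing stop character and a header with only the name.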
