Another Fastx Parser — tiny, dependency-free FASTA/FASTQ reader with transparent gzip/bzip2/zip/zstd decompression.

afp — Another Fastx Parser

A tiny, dependency-free Python reader/writer for FASTA and FASTQ files with transparent gzip / bzip2 / zip / zstandard decompression. Standard library only — zstandard is an optional extra used only for .zst inputs.

The whole module is a single file: afp.py.

Install

Three ways to use it:

Drop the file into your project:

curl -O https://raw.githubusercontent.com/conchoecia/afp/main/afp.py
# then put afp.py somewhere on your Python import path (sys.path)

Or pip-install:

pip install run-afp               # core, no extras
pip install "run-afp[zstd]"       # also read .zst-compressed files
pip install "run-afp[dev]"        # pytest + zstandard for development

The PyPI distribution is named run-afp because the bare afp name was already taken. Only the distribution name differs: in code you still write import afp.

Or vendor inside another repo: copy afp.py into your dependencies/ directory, add that directory to sys.path, then import afp.
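The vendoring route can be sketched as follows. This is a minimal illustration, not part of afp: the helper name add_vendor_dir and the dependencies/ layout are assumptions made for the example.

```python
import sys
from pathlib import Path


def add_vendor_dir(vendor_dir):
    """Prepend vendor_dir to sys.path (once) so modules inside it are importable."""
    path = str(Path(vendor_dir).resolve())
    if path not in sys.path:
        sys.path.insert(0, path)
    return path


# Hypothetical layout: <project>/dependencies/afp.py
vendored = add_vendor_dir("dependencies")

# Once afp.py has been copied into that directory, this works:
# import afp
```

Resolving to an absolute path keeps the import working even if the process later changes its working directory.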

Quick start

import afp

# Auto-detects FASTA vs FASTQ from the first byte, and gzip/bzip2/zip/zstd
# compression from the file's magic bytes (not its extension).
for rec in afp.parse("reads.fq.gz"):
    print(rec.id, len(rec.seq), rec.qual[:10])

for rec in afp.parse("genome.fa"):
    print(rec.id, rec.desc, rec.seq[:50])
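To make the first-byte detection concrete, here is a small standard-library sketch of the idea. It is not afp's own code: the function name sniff_format is hypothetical, and it assumes an already-decompressed, seekable binary stream.

```python
import io


def sniff_format(handle):
    """Guess 'fasta' or 'fastq' from the first byte of a seekable binary stream."""
    first = handle.read(1)
    handle.seek(0)  # rewind so a parser can still read from the start
    if first == b">":
        return "fasta"
    if first == b"@":
        return "fastq"
    raise ValueError(f"unrecognized first byte: {first!r}")


print(sniff_format(io.BytesIO(b">seq1 demo\nACGT\n")))      # fasta
print(sniff_format(io.BytesIO(b"@read1\nACGT\n+\nIIII\n")))  # fastq
```

A file that starts with blank lines or a comment would defeat a check this simple, which is one reason to pass the format explicitly for unusual inputs.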

Force a specific format if needed:

for rec in afp.parse("weirdly_named_file", format="fasta"):
    ...

# Or use the explicit parsers:
afp.parse_fasta("genome.fa")
afp.parse_fastq("reads.fq")

The Record object

class Record:
    id: str             # token after '>' or '@', up to first whitespace
    seq: str            # sequence, newlines stripped
    desc: str | None    # everything after id on the header line (or None)
    qual: str | None    # quality string (FASTQ only; None for FASTA)

Records are mutable, so you can rewrite rec.id (or any other field) in place.

rec.format()           # back to FASTA / FASTQ text
rec.format(wrap=80)    # FASTA only: wrap sequence at 80 columns
len(rec)               # length of seq
"ACGT" in rec          # membership on the sequence string
list(rec)              # iterate over letters
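The behaviour listed above can be mimicked with a small dataclass. The sketch below is a stand-in written for illustration (SketchRecord is a hypothetical name, not afp's Record); it only shows how the fields, wrapping, and container protocols described here could fit together.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchRecord:
    """Illustrative stand-in for a FASTA/FASTQ record; not afp's implementation."""
    id: str
    seq: str
    desc: Optional[str] = None
    qual: Optional[str] = None

    def __len__(self):
        return len(self.seq)

    def __contains__(self, item):
        return item in self.seq

    def __iter__(self):
        return iter(self.seq)

    def format(self, wrap=None):
        """Return FASTQ text if qual is set, otherwise (optionally wrapped) FASTA."""
        header = self.id if self.desc is None else f"{self.id} {self.desc}"
        if self.qual is not None:
            return f"@{header}\n{self.seq}\n+\n{self.qual}\n"
        if wrap:
            chunks = [self.seq[i:i + wrap] for i in range(0, len(self.seq), wrap)]
            body = "\n".join(chunks)
        else:
            body = self.seq
        return f">{header}\n{body}\n"


rec = SketchRecord(id="seq1", seq="ACGTACGTAC", desc="demo")
rec.id = "renamed"              # fields are mutable
print(rec.format(wrap=4), end="")
print(len(rec), "CGTA" in rec, list(rec)[:3])
```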

Writing

afp.write(records, "out.fa")          # plain
afp.write(records, "out.fa.gz")       # auto-gzip from .gz extension
afp.write(records, "out.fq.gz")       # FASTQ if the first record has `qual`
afp.write(records, "out.fa", wrap=80) # wrapped FASTA

Mixing FASTA and FASTQ records in a single output stream is rejected.
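For readers curious how these rules interact, the following sketch reproduces them with the standard library only. It is an assumption-laden illustration rather than afp's writer: write_sketch and open_text_for_write are hypothetical names, and records are plain (id, seq, qual) tuples instead of Record objects.

```python
import gzip
import os
import tempfile


def open_text_for_write(path):
    """Use gzip.open for a .gz suffix and the built-in open otherwise."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "wt")
    return open(path, "w")


def write_sketch(records, path):
    """Write (id, seq, qual) tuples; the first record fixes the output format."""
    want_fastq = None
    with open_text_for_write(path) as out:
        for rec_id, seq, qual in records:
            is_fastq = qual is not None
            if want_fastq is None:
                want_fastq = is_fastq
            elif is_fastq != want_fastq:
                raise ValueError("cannot mix FASTA and FASTQ records in one output")
            if is_fastq:
                out.write(f"@{rec_id}\n{seq}\n+\n{qual}\n")
            else:
                out.write(f">{rec_id}\n{seq}\n")


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "out.fq.gz")
    write_sketch([("read1", "ACGT", "IIII")], target)
    with gzip.open(target, "rt") as fh:
        print(fh.read(), end="")
```

Note that this sketch detects a mixed stream only when it reaches the offending record, so a partial file may already exist at that point.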

Compression helpers

afp.detect_compression("file")   # 'gzip' | 'bzip2' | 'zip' | 'zstd' | 'none'
afp.get_open_func("file.gz")     # returns gzip.open

detect_compression reads only the first 4 bytes — it's cheap to call.
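Magic-byte sniffing of this kind is easy to sketch with the standard library. The version below is an independent illustration, not afp's source: sniff_compression is a hypothetical name, and the table lists the well-known leading signatures of each format.

```python
import gzip
import os
import tempfile

# Leading signature bytes for each supported container.
MAGIC = (
    (b"\x1f\x8b", "gzip"),
    (b"BZh", "bzip2"),
    (b"PK\x03\x04", "zip"),
    (b"\x28\xb5\x2f\xfd", "zstd"),
)


def sniff_compression(path):
    """Classify a file by its first 4 bytes, ignoring the file extension."""
    with open(path, "rb") as fh:
        head = fh.read(4)
    for magic, name in MAGIC:
        if head.startswith(magic):
            return name
    return "none"


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "reads.txt")  # misleading extension on purpose
    with gzip.open(path, "wb") as fh:
        fh.write(b">seq1\nACGT\n")
    print(sniff_compression(path))  # gzip
```

Because only a few bytes are read, the check costs about the same for a 1 kB file as for a multi-gigabyte one.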

Why

Built for projects that want to vendor a single Python file rather than pull in a multi-megabyte sequence toolkit, under a permissive license they can carry through to their own code. The whole module is one ~400-line file, no external runtime dependencies, no compiled extensions.

License

MIT — see LICENSE.

Download files

Source Distribution: run_afp-0.1.2.tar.gz (13.8 kB)

Built Distribution: run_afp-0.1.2-py3-none-any.whl (7.8 kB, Python 3)
