Another Fastx Parser — tiny, dependency-free FASTA/FASTQ reader with transparent gzip/bzip2/zip/zstd decompression.

afp — Another Fastx Parser

A tiny, dependency-free Python reader/writer for FASTA and FASTQ files with transparent gzip / bzip2 / zip / zstandard decompression. Standard library only — zstandard is an optional extra used only for .zst inputs.

The whole module is a single file: afp.py.

Install

Three ways to use it:

Drop the file into your project:

curl -O https://raw.githubusercontent.com/conchoecia/afp/main/afp.py
# put afp.py somewhere on your Python import path (sys.path)

Or pip-install:

pip install run-afp               # core, no extras
pip install "run-afp[zstd]"       # also read .zst-compressed files
pip install "run-afp[dev]"        # pytest + zstandard for development

The PyPI distribution is run-afp (the bare afp name was already taken). The import name stays import afp.

Or vendor inside another repo: copy afp.py into your dependencies/ directory, add that directory to sys.path, then import afp.
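
The vendoring route can be sketched as below. This is a minimal illustration, assuming a hypothetical layout where afp.py sits in a dependencies/ directory next to your code; the helper name add_vendor_dir is not part of afp.

```python
import sys
from pathlib import Path


def add_vendor_dir(directory):
    """Put `directory` at the front of sys.path (once) and return its absolute path."""
    path = str(Path(directory).resolve())
    if path not in sys.path:
        sys.path.insert(0, path)
    return path


# Hypothetical layout: <repo>/dependencies/afp.py
vendor_path = add_vendor_dir("dependencies")
# After this call, `import afp` resolves to dependencies/afp.py if that file exists.
```

Inserting at the front of sys.path makes the vendored copy win over any installed copy, which is usually what you want when pinning a single file.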

Quick start

import afp

# Auto-detects FASTA vs FASTQ from the first byte, and gzip/bzip2/zip/zstd
# compression from the file's magic bytes (not its extension).
for rec in afp.parse("reads.fq.gz"):
    print(rec.id, len(rec.seq), rec.qual[:10])

for rec in afp.parse("genome.fa"):
    print(rec.id, rec.desc, rec.seq[:50])
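
To illustrate the first-byte detection mentioned in the comment above, here is a rough stdlib-only sketch. It is not afp's actual code: sniff_format and sniff_file_format are hypothetical names, and the sketch assumes the caller already knows which opener to use for a compressed file.

```python
import gzip
import os
import tempfile


def sniff_format(first_byte):
    """Map the first decompressed byte of a file to a format name."""
    if first_byte == b">":
        return "fasta"
    if first_byte == b"@":
        return "fastq"
    raise ValueError(f"not FASTA or FASTQ: first byte is {first_byte!r}")


def sniff_file_format(path, opener=open):
    """Read one byte through `opener` (e.g. open or gzip.open) and classify it."""
    with opener(path, "rb") as handle:
        return sniff_format(handle.read(1))


# Demo: a gzip-compressed FASTQ record is still recognised from its first byte.
with tempfile.TemporaryDirectory() as tmp:
    fq_path = os.path.join(tmp, "reads.fq.gz")
    with gzip.open(fq_path, "wb") as out:
        out.write(b"@read1\nACGT\n+\nIIII\n")
    detected = sniff_file_format(fq_path, opener=gzip.open)
```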

Force a specific format if needed:

for rec in afp.parse("weirdly_named_file", format="fasta"):
    ...

# Or use the explicit parsers:
afp.parse_fasta("genome.fa")
afp.parse_fastq("reads.fq")

The Record object

class Record:
    id: str             # token after '>' or '@', up to first whitespace
    seq: str            # sequence, newlines stripped
    desc: str | None    # everything after id on the header line (or None)
    qual: str | None    # quality string (FASTQ only; None for FASTA)

Records are mutable, so you can rewrite rec.id in place.

rec.format()           # back to FASTA / FASTQ text
rec.format(wrap=80)    # FASTA only: wrap sequence at 80 columns
len(rec)               # length of seq
"ACGT" in rec          # membership on the sequence string
list(rec)              # iterate over letters
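
For readers who want to see how such an object could behave, here is a minimal stand-in built on a dataclass. It is an assumption-laden sketch, not afp's implementation: the class name SketchRecord is hypothetical, and details such as the trailing newline in format() are guesses.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchRecord:
    """Hypothetical stand-in mirroring the four documented fields."""
    id: str
    seq: str
    desc: Optional[str] = None
    qual: Optional[str] = None

    def __len__(self):
        return len(self.seq)

    def __contains__(self, motif):
        return motif in self.seq

    def __iter__(self):
        return iter(self.seq)

    def format(self, wrap=None):
        header = self.id if self.desc is None else f"{self.id} {self.desc}"
        if self.qual is not None:
            # FASTQ: four lines, sequence and quality never wrapped.
            return f"@{header}\n{self.seq}\n+\n{self.qual}\n"
        if wrap:
            lines = [self.seq[i:i + wrap] for i in range(0, len(self.seq), wrap)]
        else:
            lines = [self.seq]
        return f">{header}\n" + "\n".join(lines) + "\n"


rec = SketchRecord("chr1", "ACGTACGTAC", desc="test contig")
rec.id = "chr1_renamed"          # fields are mutable
wrapped = rec.format(wrap=4)
```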

Writing

afp.write(records, "out.fa")          # plain
afp.write(records, "out.fa.gz")       # auto-gzip from .gz extension
afp.write(records, "out.fq.gz")       # FASTQ if the first record has `qual`
afp.write(records, "out.fa", wrap=80) # wrapped FASTA

Mixing FASTA and FASTQ records in a single output stream is rejected.
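
The writing rules above (gzip chosen from the extension, format chosen from the first record, mixed streams rejected) could be sketched like this. The function write_records and its tuple-based input are hypothetical and only illustrate the logic; afp.write itself takes Record objects and may differ in detail.

```python
import gzip
import os
import tempfile


def write_records(records, path, wrap=None):
    """Write (id, seq, qual_or_None) tuples as FASTA or FASTQ text."""
    opener = gzip.open if str(path).endswith(".gz") else open
    is_fastq = None  # decided by the first record
    with opener(path, "wt") as out:
        for rec_id, seq, qual in records:
            has_qual = qual is not None
            if is_fastq is None:
                is_fastq = has_qual
            elif has_qual != is_fastq:
                raise ValueError("cannot mix FASTA and FASTQ records in one output")
            if is_fastq:
                out.write(f"@{rec_id}\n{seq}\n+\n{qual}\n")
            else:
                out.write(f">{rec_id}\n")
                step = wrap or max(len(seq), 1)
                for start in range(0, len(seq), step):
                    out.write(seq[start:start + step] + "\n")


with tempfile.TemporaryDirectory() as tmp:
    out_path = os.path.join(tmp, "out.fa.gz")
    write_records([("seq1", "ACGTACGTAC", None)], out_path, wrap=4)
    with gzip.open(out_path, "rt") as handle:
        written = handle.read()
    try:
        write_records([("a", "ACGT", None), ("b", "ACGT", "IIII")],
                      os.path.join(tmp, "mixed.fa"))
        mixed_rejected = False
    except ValueError:
        mixed_rejected = True
```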

Compression helpers

afp.detect_compression("file")   # 'gzip' | 'bzip2' | 'zip' | 'zstd' | 'none'
afp.get_open_func("file.gz")     # returns gzip.open

detect_compression reads only the first 4 bytes — it's cheap to call.
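
As an illustration of why four bytes are enough, the sketch below checks the well-known magic numbers for each container format. It is a standalone approximation, not afp's source: sniff_compression is a hypothetical name, and edge cases such as empty zip archives (which start with a different signature) are ignored.

```python
import bz2
import gzip
import os
import tempfile
import zipfile

# Leading magic bytes of each supported container format.
MAGIC = (
    (b"\x1f\x8b", "gzip"),
    (b"BZh", "bzip2"),
    (b"PK\x03\x04", "zip"),
    (b"\x28\xb5\x2f\xfd", "zstd"),
)


def sniff_compression(path):
    """Classify a file by its first 4 bytes, ignoring its extension."""
    with open(path, "rb") as handle:
        head = handle.read(4)
    for magic, name in MAGIC:
        if head.startswith(magic):
            return name
    return "none"


with tempfile.TemporaryDirectory() as tmp:
    paths = {name: os.path.join(tmp, name) for name in ("a.gz", "a.bz2", "a.zip", "a.fa")}
    payload = b">seq1\nACGT\n"
    with gzip.open(paths["a.gz"], "wb") as out:
        out.write(payload)
    with bz2.open(paths["a.bz2"], "wb") as out:
        out.write(payload)
    with zipfile.ZipFile(paths["a.zip"], "w") as archive:
        archive.writestr("a.fa", payload)
    with open(paths["a.fa"], "wb") as out:
        out.write(payload)
    detected = {name: sniff_compression(path) for name, path in paths.items()}
```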

Why

Built for projects that want to vendor a single Python file rather than pull in a multi-megabyte sequence toolkit, under a permissive license they can carry through to their own code. The whole module is one ~400-line file with no external runtime dependencies and no compiled extensions.

License

MIT — see LICENSE.
