minimap2 PAF file reader
pip install readpaf
pip install readpaf[pandas]
Direct download
As readpaf is a self-contained module, it can be installed by downloading just the module. The latest version is available from:
or a specific version can be downloaded from a release/tag like so:
PyPI is the recommended install method.
readpaf has only one user function, parse_paf, which accepts a file-like object; this is any object in Python that has a file-oriented API (stdout from subprocess, io.StringIO, open files from
The following script demonstrates how minimap2 output can be piped into readpaf:
from readpaf import parse_paf
from sys import stdin

for record in parse_paf(stdin):
    print(record.query_name, record.target_name)
readpaf can also generate a pandas DataFrame:
from readpaf import parse_paf

with open("test.paf", "r") as handle:
    df = parse_paf(handle, dataframe=True)
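To illustrate what "file-like" means here, the following is a minimal standard-library sketch, not readpaf's own code. The helper first_columns is hypothetical and only needs an object that can be iterated line by line, so an io.StringIO buffer, sys.stdin, or an open file handle are interchangeable. The sample PAF line is made up for illustration.

```python
import io

# A made-up PAF line with the 12 tab-separated mandatory columns.
SAMPLE = "read1\t1000\t10\t990\t+\tchr1\t50000\t2000\t2980\t950\t980\t60\n"


def first_columns(file_like):
    """Yield (query_name, target_name) for each line of a file-like object.

    Hypothetical helper for illustration; it is not part of readpaf.
    """
    for line in file_like:
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 12:
            continue  # skip blank or malformed lines
        yield cols[0], cols[5]


# io.StringIO stands in for sys.stdin or an open file handle.
pairs = list(first_columns(io.StringIO(SAMPLE)))
print(pairs)
```

Swapping io.StringIO(SAMPLE) for sys.stdin or a handle returned by open requires no change to the helper, which is the property parse_paf relies on.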
readpaf has a single user function:
parse_paf(file_like=file_handle, fields=list, na_values=list, na_rep=numeric, dataframe=bool)
- file_like: A file-like object, such as sys.stdin, a file handle from open, or io.StringIO objects
- fields: A list of 13 field names to use for the PAF file, default:
"query_name", "query_length", "query_start", "query_end", "strand", "target_name", "target_length", "target_start", "target_end", "residue_matches", "alignment_block_length", "mapping_quality", "tags"
These are based on the PAF specification.
- na_values: A list of values to interpret as NaN. This is only applied to numeric fields, default:
- na_rep: Value to use when a NaN value specified in na_values is found. This should ideally be 0 to match minimap2's output, default:
- dataframe: bool, if True, return a pandas.DataFrame with the tags expanded into separate Series
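The interaction between na_values and na_rep can be pictured with a small stand-alone sketch. It reflects one reading of the description above and is not readpaf's implementation: the helper to_int is hypothetical, and the "*" marker and the 0 replacement are illustrative values chosen for this example.

```python
def to_int(value, na_values=("*",), na_rep=0):
    """Convert a numeric PAF column, substituting na_rep for missing-value markers.

    Hypothetical helper; the default marker and replacement are assumptions.
    """
    if value in na_values:
        return na_rep
    return int(value)


# Made-up partial row: a name followed by numeric columns, two of them missing.
row = ["read1", "1000", "*", "*"]
converted = [row[0]] + [to_int(v) for v in row[1:]]
print(converted)
```

Only the numeric columns pass through the conversion, which matches the note that na_values is applied to numeric fields only.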
If used as an iterator, then each object returned is a named tuple representing a single line in the PAF file.
Each named tuple has field names as specified by the
The SAM-like tags are converted into their specified types and stored in a dictionary with the tag name as the key and the value a named tuple with fields
str are called on
PAF record (named tuple), a formatted PAF string is returned, which is useful for writing records to a file.
PAF record also has a method blast_identity which calculates the BLAST identity for that record.
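As a rough illustration of the two ideas above, the sketch below converts SAM-like tags into typed values and computes a BLAST-style identity. It is not readpaf's code. The Tag container and its field names, the parse_tags helper, and the stand-alone blast_identity function are all hypothetical. The identity is assumed to be residue matches divided by alignment block length (PAF columns 10 and 11), and the tag strings are made up.

```python
from collections import namedtuple

# Hypothetical container; these field names are assumptions for illustration.
Tag = namedtuple("Tag", ["name", "type", "value"])

# SAM type codes mapped to Python converters; other codes are kept as strings.
CONVERTERS = {"i": int, "f": float}


def parse_tags(columns):
    """Turn 'NM:i:30'-style strings into a dict keyed by tag name."""
    tags = {}
    for item in columns:
        # Split at most twice so values that contain ':' stay intact.
        name, type_code, raw = item.split(":", 2)
        convert = CONVERTERS.get(type_code, str)
        tags[name] = Tag(name, type_code, convert(raw))
    return tags


def blast_identity(residue_matches, alignment_block_length):
    """Assumed definition: matches divided by alignment block length."""
    return residue_matches / alignment_block_length


tags = parse_tags(["NM:i:30", "tp:A:P", "dv:f:0.0306"])
identity = blast_identity(950, 1000)
print(tags["NM"].value, tags["tp"].value, tags["dv"].value, identity)
```

Keying the dictionary by tag name keeps lookups such as tags["NM"].value short while still carrying the original type code alongside the converted value.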
If used to generate a pandas DataFrame, then each row represents a line in the PAF file and the SAM-like tags are expanded into individual series.