Package to calculate a distance matrix from a multiple sequence alignment file
This small utility package calculates the number of differences between all pairs of samples in a fasta alignment file. A position counts as 1 SNV when both sequences have a G, A, T or C (case insensitive) at that position and the two bases differ. Positions where either sequence has any other character are not counted.
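The counting rule above can be illustrated with a direct, position-by-position comparison. This is a minimal sketch for clarity only, not the package's own code; the function name `count_snvs` is hypothetical.

```python
def count_snvs(seq_a: str, seq_b: str) -> int:
    """Count positions where both sequences hold A, C, G or T and the bases differ.

    Hypothetical helper that restates the counting rule; it is not part of fastadist.
    """
    valid = set("ACGT")
    count = 0
    # Upper-case both sequences so the comparison is case insensitive.
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in valid and b in valid and a != b:
            count += 1
    return count


# Position 2 (C vs G) and position 7 (a vs T) differ; N and - are skipped.
print(count_snvs("ACGTN-a", "AGGTAcT"))  # prints 2
```

Gaps and ambiguity codes such as N never contribute to the distance under this rule, so two sequences that differ only at such positions have a distance of 0.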
The output is a square distance matrix in tsv, csv or phylip format. The calculation is fast because sequences are first converted to bit arrays, and the differences are then computed with fast bit operations.
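To show why bit operations help, here is a small sketch of one possible encoding. It is an assumption for illustration and may differ from what the package does internally: each sequence is turned into four bitmasks (one per base), stored as plain Python integers rather than a dedicated bit array type. The names `encode` and `bit_distance` are hypothetical.

```python
def encode(seq: str) -> dict:
    """Return one integer bitmask per base; bit i is set if position i holds that base."""
    masks = {base: 0 for base in "ACGT"}
    for i, ch in enumerate(seq.upper()):
        if ch in masks:  # gaps, N and other characters set no bit
            masks[ch] |= 1 << i
    return masks


def bit_distance(a: dict, b: dict) -> int:
    """Count positions where both sequences hold a valid base and the bases differ."""
    valid_a = a["A"] | a["C"] | a["G"] | a["T"]
    valid_b = b["A"] | b["C"] | b["G"] | b["T"]
    # Positions where both sequences hold the same base.
    same = (a["A"] & b["A"]) | (a["C"] & b["C"]) | (a["G"] & b["G"]) | (a["T"] & b["T"])
    # Valid in both sequences but not identical: these are the SNVs.
    differing = valid_a & valid_b & ~same
    return bin(differing).count("1")


# Encode each sequence once, then compare every pair with a few bit operations.
seqs = {"s1": "ACGTN-a", "s2": "AGGTAcT", "s3": "ACGTACA"}
encoded = {name: encode(s) for name, s in seqs.items()}
matrix = {x: {y: bit_distance(encoded[x], encoded[y]) for y in seqs} for x in seqs}
print(matrix["s1"]["s2"])  # prints 2
```

The encoding cost is paid once per sequence, while each pairwise comparison works on whole machine words at a time instead of looping over individual characters.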
On a mid-range laptop, a distance matrix for a 764-sequence alignment of length 1,082,859 was produced in 11 minutes with -p 1 and in 4.5 minutes with -p 4.
FastaDist is available as a PyPI package for Python 3:
pip3 install fastadist
usage: fastadist [-h] -i ALIGNMENT_FILEPATH [-t TREE_FILEPATH] -o OUTPUT_FILEPATH
                 [-f FORMAT] [-p PARALLEL_PROCESSES] [-v]

A script to calculate distances by converting sequences to bit arrays. Specify
number of processes as -p N to speed up the calculation

optional arguments:
  -h, --help            show this help message and exit
  -i ALIGNMENT_FILEPATH, --alignment_filepath ALIGNMENT_FILEPATH
                        path to multiple sequence alignment input file
  -t TREE_FILEPATH, --tree_filepath TREE_FILEPATH
                        path to newick tree for distance matrix ordering
  -o OUTPUT_FILEPATH, --output_filepath OUTPUT_FILEPATH
                        path to distance matrix output file
  -f FORMAT, --format FORMAT
                        output format for distance matrix (one of tsv
                        [default], csv and phylip
  -p PARALLEL_PROCESSES, --parallel_processes PARALLEL_PROCESSES
                        number of parallel processes to run (default 1)
  -v, --version         print out software version