
A fast tool to calculate Hamming distances


A small C++ tool to calculate pairwise Hamming distances between gene sequences given in FASTA format.
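To make the computed quantity concrete, here is a minimal pure-Python sketch of a pairwise Hamming distance matrix. It is not the tool's C++ implementation: the helper names `hamming` and `pairwise_hamming` are hypothetical, and the sketch simply counts mismatching positions, so any special handling of particular characters in the real tool is not reproduced.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def pairwise_hamming(seqs):
    """Return the symmetric matrix of Hamming distances as nested lists."""
    n = len(seqs)
    dist = [[0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d = hamming(seqs[i], seqs[j])
        dist[i][j] = dist[j][i] = d  # the distance is symmetric
    return dist


seqs = ["ACGTACGT", "ACGTAGGT", "ATTTACGT"]
matrix = pairwise_hamming(seqs)
for row in matrix:
    print(row)
```

This reference version needs one comparison per pair of sequences, which quickly becomes slow in pure Python for large inputs.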


Python interface

To use the Python interface, you should install it from PyPI:

python -m pip install hammingdist

Then you can use it from Python, for example:

import hammingdist

# To see the different optional arguments available:
help(hammingdist.from_fasta)

# To import all sequences from a fasta file
data = hammingdist.from_fasta("example.fasta")

# To import only the first 100 sequences from a fasta file
data = hammingdist.from_fasta("example.fasta", n=100)

# To import all sequences and remove any duplicates
data = hammingdist.from_fasta("example.fasta", remove_duplicates=True)

# To import all sequences from a fasta file, also treating 'X' as a valid character
data = hammingdist.from_fasta("example.fasta", include_x=True)

# The distance data can be accessed point-wise, though looping over all distances might be quite inefficient
print(data[14,42])

# The data can be written to disk in csv format (default `distance` Ripser format) and retrieved:
data.dump("backup.csv")
retrieval = hammingdist.from_csv("backup.csv")

# It can also be written in lower triangular format (comma-delimited row-major, `lower-distance` Ripser format):
data.dump_lower_triangular("lt.txt")
retrieval = hammingdist.from_lower_triangular("lt.txt")

# If the `remove_duplicates` option was used, the sequence indices can also be written.
# For each input sequence, this prints the corresponding index in the output:
data.dump_sequence_indices("indices.txt")

# Finally, we can pass the data as a list of strings in Python:
data = hammingdist.from_stringlist(["ACGTACGT", "ACGTAGGT", "ATTTACGT"])
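The lower triangular format can be illustrated without the library. The sketch below assumes that the file holds only the strictly lower triangle (no diagonal), one comma-delimited row per line, which is how Ripser's `lower-distance` input is commonly laid out. The exact bytes written by `dump_lower_triangular` are an assumption here, and the helper names `lower_triangular_lines` and `parse_lower_triangular` are hypothetical.

```python
def lower_triangular_lines(dist):
    """Serialize the strictly lower triangle, one comma-delimited row per line."""
    return [
        ",".join(str(dist[i][j]) for j in range(i))
        for i in range(1, len(dist))
    ]


def parse_lower_triangular(lines):
    """Rebuild the full symmetric matrix from strictly-lower-triangular rows."""
    rows = [[int(v) for v in line.split(",")] for line in lines if line.strip()]
    n = len(rows) + 1
    dist = [[0] * n for _ in range(n)]
    for i, row in enumerate(rows, start=1):
        if len(row) != i:
            raise ValueError(f"row {i} should have {i} entries, got {len(row)}")
        for j, d in enumerate(row):
            dist[i][j] = dist[j][i] = d
    return dist


# Distances for the three example strings "ACGTACGT", "ACGTAGGT", "ATTTACGT"
full = [[0, 1, 2], [1, 0, 3], [2, 3, 0]]
lines = lower_triangular_lines(full)
restored = parse_lower_triangular(lines)
print("\n".join(lines))
```

Storing only the lower triangle roughly halves the file size compared with the full matrix, since the matrix is symmetric with a zero diagonal.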

OpenMP on Linux

The latest version of hammingdist on Linux is built with OpenMP (multithreading) support. If this causes any issues, you can install the previous version of hammingdist, which was built without OpenMP support:

python -m pip install hammingdist==0.11.0
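When switching between releases, it can help to confirm which version is actually installed in the current environment. The following is a small sketch using the standard library's `importlib.metadata` (available from Python 3.8); the helper name `installed_version` is hypothetical and is not part of hammingdist.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the installed hammingdist version, or None if it is not installed
print(installed_version("hammingdist"))
```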


Files for hammingdist, version 0.12.0
Filename (size), file type, Python version
hammingdist-0.12.0-cp36-cp36m-macosx_10_9_x86_64.whl (83.4 kB), Wheel, cp36
hammingdist-0.12.0-cp36-cp36m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (158.4 kB), Wheel, cp36
hammingdist-0.12.0-cp36-cp36m-win32.whl (78.3 kB), Wheel, cp36
hammingdist-0.12.0-cp36-cp36m-win_amd64.whl (89.4 kB), Wheel, cp36
hammingdist-0.12.0-cp37-cp37m-macosx_10_9_x86_64.whl (83.3 kB), Wheel, cp37
hammingdist-0.12.0-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (158.5 kB), Wheel, cp37
hammingdist-0.12.0-cp37-cp37m-win32.whl (78.3 kB), Wheel, cp37
hammingdist-0.12.0-cp37-cp37m-win_amd64.whl (89.4 kB), Wheel, cp37
hammingdist-0.12.0-cp38-cp38-macosx_10_9_x86_64.whl (83.9 kB), Wheel, cp38
hammingdist-0.12.0-cp38-cp38-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (157.8 kB), Wheel, cp38
hammingdist-0.12.0-cp38-cp38-win32.whl (77.2 kB), Wheel, cp38
hammingdist-0.12.0-cp38-cp38-win_amd64.whl (88.9 kB), Wheel, cp38
hammingdist-0.12.0-cp39-cp39-macosx_10_9_x86_64.whl (84.0 kB), Wheel, cp39
hammingdist-0.12.0-cp39-cp39-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (157.9 kB), Wheel, cp39
hammingdist-0.12.0-cp39-cp39-win32.whl (77.4 kB), Wheel, cp39
hammingdist-0.12.0-cp39-cp39-win_amd64.whl (88.3 kB), Wheel, cp39
hammingdist-0.12.0-pp37-pypy37_pp73-macosx_10_9_x86_64.whl (83.4 kB), Wheel, pp37
hammingdist-0.12.0-pp37-pypy37_pp73-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (157.6 kB), Wheel, pp37
hammingdist-0.12.0-pp37-pypy37_pp73-win_amd64.whl (88.1 kB), Wheel, pp37
