Skip to main content

Algorithm for clustering protein multiple sequence alignments using normalized mutual information.

Project description

PSICalc Algorithm Package

This is a package for clustering Multiple Sequence Alignments (MSAs) utilizing normalized mutual information to examine protein subdomains. A complete data visualization tool for psicalc is available here: https://github.com/mandosoft/psi-calc.

As an example:

import psicalc as pc

file = "<your_fasta_file>" # e.g "PF02517_seed.txt"

data = pc.read_txt_file_format(file) # read Fasta file

data = pc.durston_schema(data, 1) # Label column index starting at 1

result = pc.find_clusters(7, data) # will sample every 7th column

pc.write_output_data(7, result)

The program will run and return a csv file with the strongest clusters found in the MSA provided.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

psicalc-0.2.3.tar.gz (6.7 kB view hashes)

Uploaded Source

Built Distribution

psicalc-0.2.3-py3-none-any.whl (6.9 kB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page