Make sequence logos using Felsenstein's phylogenetically independent contrasts method to take evolution into account

PhyInC: Phylogenetically Independent Contrasts Sequence Logo

PhyInC (pronounced "Fink" for short) is a tool that takes a Fasta file and a Newick tree file and creates a phylogenetically conditioned sequence logo, using Felsenstein's phylogenetically independent contrasts (PIC).
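
For readers unfamiliar with PIC, here is a minimal sketch of Felsenstein's contrasts for a single continuous trait on a small binary tree. It illustrates the general method only; it is not PhyInC's code, and how PhyInC applies contrasts to sequence columns is not shown here. The tuple-based tree encoding and the function name `contrasts` are assumptions made for this example.

```python
import math

def contrasts(node):
    """Return (value, branch_length, list_of_contrasts) for a subtree.

    Hypothetical encoding used only in this sketch:
      leaf          = (trait_value, branch_length)
      internal node = (left_subtree, right_subtree, branch_length)
    Branch lengths below any internal node must be positive.
    """
    if len(node) == 2:  # leaf: nothing to contrast yet
        value, length = node
        return value, length, []
    left, right, length = node
    x1, v1, c1 = contrasts(left)
    x2, v2, c2 = contrasts(right)
    # Standardized contrast between the two child values
    contrast = (x1 - x2) / math.sqrt(v1 + v2)
    # Weighted average of the children, weights 1/v1 and 1/v2
    value = (x1 / v1 + x2 / v2) / (1 / v1 + 1 / v2)
    # Lengthen the branch above this node to account for estimation error
    length = length + v1 * v2 / (v1 + v2)
    return value, length, c1 + c2 + [contrast]

# Tree ((A:1,B:1):1,C:2) with made-up trait values A=1, B=3, C=5
tree = (((1.0, 1.0), (3.0, 1.0), 1.0), (5.0, 2.0), 0.0)
_, _, pics = contrasts(tree)
print(pics)  # one contrast per internal node, two in total
```

A tree with n leaves yields n - 1 contrasts, which can be treated as independent observations even though the leaf values themselves are correlated by shared ancestry.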

Install from PyPI.org

pip3 install phyinc

It is always a good idea to use a virtual environment; see below.

Install from a downloaded GitHub repository

For the dependencies, create and activate a virtual environment first, as good practice:

> python3 -m venv venv
> source venv/bin/activate
> pip3 install biopython weblogo matplotlib numpy

If you are on macOS, install Ghostscript with Homebrew:

> brew install ghostscript

To install the current package:

> pip3 install .

Usage

The basic usage is as follows.

> phyinc treefile fastafile

Since no output file is given, the logo is written to fastafile.fa_logo.pdf. You can choose the output file and format with the -o option:

> phyinc -o the_logo.png treefile fastafile
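
If you want to call phyinc from a Python script, the command line above can be assembled as an argument list. This is a minimal sketch: the helper `build_phyinc_command` is hypothetical and not part of the package, and it uses only the `-o` option and the argument order shown above.

```python
import shlex

def build_phyinc_command(treefile, fastafile, outfile=None):
    """Build the argument list for a phyinc call (hypothetical helper)."""
    cmd = ["phyinc"]
    if outfile is not None:
        # -o selects the output file, as in the usage example above
        cmd += ["-o", outfile]
    cmd += [treefile, fastafile]
    return cmd

cmd = build_phyinc_command("treefile", "fastafile", outfile="the_logo.png")
print(shlex.join(cmd))  # phyinc -o the_logo.png treefile fastafile

# To actually run it, with phyinc installed and real input files:
#   import subprocess
#   subprocess.run(cmd, check=True)
```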

Examples

There is example data in the GitHub repository, where you can run this command:

> phyinc examples/synthetic_data/ex1_t1.tree examples/synthetic_data/ex1.fa

This should create a PDF named "ex1.fa_seqlogo.pdf" in the examples folder.

Sometimes a phylogeny is inferred on whole proteins but the logo is made for domains. A protein accession may then be something like ETA_STAAU while the domain accession is ETA_STAAU/96-110. In that case, use the --coords option to ignore the domain coordinates when mapping the domains to the tree.

> phyinc --coords examples/PF000672.treefile examples/PF000672.fa
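
The mapping that --coords enables can be pictured as dropping the coordinate suffix from each domain accession. The sketch below assumes the suffix always has the form `/start-end`, as in ETA_STAAU/96-110; the function `strip_coords` is hypothetical and only illustrates the idea, not PhyInC's actual code.

```python
import re

# Assumed suffix format: "/start-end" at the end of the accession
COORDS_SUFFIX = re.compile(r"/\d+-\d+$")

def strip_coords(accession):
    """Return the accession without a trailing /start-end suffix."""
    return COORDS_SUFFIX.sub("", accession)

print(strip_coords("ETA_STAAU/96-110"))  # ETA_STAAU
print(strip_coords("ETA_STAAU"))         # ETA_STAAU (unchanged)
```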

However, phyinc will report an error if there is more than one domain per protein:

> phyinc --coords PS00027.treefile PS00027.fa
Error: 'ZFH2_DROME' is a protein appearing twice, probably because you have two 
domains from the same protein in the input. If so, you must submit a tree inferred
on the domain sequences, not on the proteins.
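
To check an input for this situation before running phyinc, you can count how often each protein name occurs once the coordinates are dropped. This is a sketch under the assumption that a domain accession is the protein name followed by `/start-end`; the function `repeated_proteins` is hypothetical and the coordinates in the example are made up.

```python
from collections import Counter

def repeated_proteins(domain_accessions):
    """Return protein names that occur more than once once coordinates are dropped."""
    # Keep only the part before the first "/", i.e. the protein name
    proteins = [acc.split("/", 1)[0] for acc in domain_accessions]
    counts = Counter(proteins)
    return sorted(name for name, n in counts.items() if n > 1)

# Illustrative accessions; the coordinates are invented for this example
accessions = ["ZFH2_DROME/10-30", "ZFH2_DROME/200-220", "ETA_STAAU/96-110"]
print(repeated_proteins(accessions))  # ['ZFH2_DROME']
```

An empty result means every protein contributes at most one domain, so a protein-level tree can be matched unambiguously.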

License

GPLv3

Authors

  • Haolin Guo wrote the basic code as part of his BSc project.
  • Kyle Tenn helped make the code into a Python/PyPI package.
  • Lars Arvestad oversaw and helped with creating the final Python/PyPI package.
