Make sequence logos using Felsenstein's phylogenetically independent contrasts method to take evolution into account

Project description

PhyInC Logo: Phylogenetically Independent Contrasts Sequence Logo

PhyInC Logo (pronounced "Fink" for short) is a tool that takes a FASTA file and a Newick tree file and creates a phylogenetically conditioned sequence logo, using Felsenstein's phylogenetically independent contrasts (PIC).
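
For orientation, Felsenstein's method turns correlated tip values into independent contrasts by working from the leaves toward the root. The following is a minimal sketch of that step for a single continuous trait on a binary tree with positive branch lengths. It is not PhyInC Logo's own code: the tuple-based tree representation, the function name `contrasts`, and the toy values are made up for illustration, and how PhyInC Logo applies contrasts to alignment columns is not covered here.

```python
import math

def contrasts(node):
    """Return (value, extra_length, standardized_contrasts) for a subtree.

    A leaf is a plain number (the trait value at that tip).
    An internal node is ((left, left_length), (right, right_length)).
    Branch lengths are assumed to be strictly positive.
    """
    if not isinstance(node, tuple):
        return float(node), 0.0, []
    (left, len_left), (right, len_right) = node
    x_l, extra_l, c_l = contrasts(left)
    x_r, extra_r, c_r = contrasts(right)
    # Branches leading to internal nodes are lengthened to reflect the
    # uncertainty of the ancestral value estimated below them.
    v_l = len_left + extra_l
    v_r = len_right + extra_r
    # Difference between the children, scaled by its expected standard deviation.
    contrast = (x_l - x_r) / math.sqrt(v_l + v_r)
    # Ancestral value: average of the children weighted by inverse branch length.
    value = (x_l / v_l + x_r / v_r) / (1 / v_l + 1 / v_r)
    extra = v_l * v_r / (v_l + v_r)
    return value, extra, c_l + c_r + [contrast]

# Toy tree ((A:1, B:1):1, C:2) with tip values A=1, B=3, C=5 (hypothetical data).
ab = ((1.0, 1.0), (3.0, 1.0))
tree = ((ab, 1.0), (5.0, 2.0))
root_value, _, cs = contrasts(tree)
print(len(cs))  # a binary tree with three tips yields two contrasts
```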

Install from PyPI.org

pip3 install phyinc

It is always a good idea to use a virtual environment; see below.

Install from a downloaded GitHub repository

As good practice, create and activate a virtual environment, then install the dependencies:

> python3 -m venv venv
> source venv/bin/activate
> pip3 install biopython weblogo matplotlib numpy

If you are on macOS, also install Ghostscript:

> brew install ghostscript

To install the current package:

> pip3 install .

Usage

The basic usage is as follows.

> phyinc treefile fastafile

Since no output file is given, the logo is written to fastafile.fa_logo.pdf. You can choose the output file and format with the -o option:

> phyinc -o the_logo.png treefile fastafile
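
If you want to call the tool from a script, for example to make logos for several alignments, a thin wrapper around the command line is enough. This is a sketch only: the helper names `phyinc_command` and `run_phyinc` are hypothetical, it uses just the positional arguments and the -o option shown above, and it assumes that phyinc signals failure with a non-zero exit status.

```python
import shutil
import subprocess

def phyinc_command(treefile, fastafile, outfile=None):
    """Build the argument list for a phyinc call (tree file first, then FASTA file)."""
    cmd = ["phyinc"]
    if outfile is not None:
        cmd += ["-o", outfile]
    cmd += [treefile, fastafile]
    return cmd

def run_phyinc(treefile, fastafile, outfile=None):
    """Run phyinc; raises CalledProcessError if it exits with a non-zero status."""
    if shutil.which("phyinc") is None:
        raise FileNotFoundError("phyinc not found on PATH; is the virtual environment active?")
    subprocess.run(phyinc_command(treefile, fastafile, outfile), check=True)

print(" ".join(phyinc_command("treefile", "fastafile", "the_logo.png")))
```

Building the argument list separately from running it makes the command easy to log or inspect before anything is executed.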

Examples

There is example data in the GitHub repository, where you can run this command:

> phyinc examples/synthetic_data/ex1_t1.tree examples/synthetic_data/ex1.fa

This should create a PDF named "ex1.fa_seqlogo.pdf" in the examples folder.

Sometimes a phylogeny is inferred on whole proteins while the logo is made for domains. A protein accession may then be something like ETA_STAAU while the domain accession is ETA_STAAU/96-110. In that case, use the --coords option to ignore the domain coordinates when mapping the domains to the tree.

> phyinc --coords examples/PF000672.fa examples/PF000672.treefile

However, phyinc will report an error if there is more than one domain per protein:

> phyinc --coords PS00027.fa PS00027.treefile
Error: 'ZFH2_DROME' is a protein appearing twice, probably because you have two 
domains from the same protein in the input. If so, you must submit a tree inferred
on the domain sequences, not on the proteins.
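
To make the mapping concrete, the idea behind --coords can be sketched as stripping a trailing /start-end suffix from each domain accession and checking that no protein name then occurs twice. This is an illustration under assumptions, not phyinc's actual implementation: the regular expression, the function names, and the coordinates in the usage line are hypothetical.

```python
import re
from collections import Counter

# Assumed accession layout: protein name, a slash, then start-end,
# as in "ETA_STAAU/96-110".
COORDS = re.compile(r"/\d+-\d+$")

def strip_coords(accession):
    """Drop a trailing /start-end suffix, if present."""
    return COORDS.sub("", accession)

def map_domains_to_proteins(domain_accessions):
    """Map each domain accession to its protein name, refusing ambiguous input."""
    proteins = [strip_coords(a) for a in domain_accessions]
    repeated = sorted(p for p, n in Counter(proteins).items() if n > 1)
    if repeated:
        # Two domains from one protein cannot both map to a single tree leaf.
        raise ValueError(f"more than one domain per protein: {repeated}")
    return dict(zip(domain_accessions, proteins))

print(map_domains_to_proteins(["ETA_STAAU/96-110"]))
```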

License

GPLv3

Authors

  • Haolin Guo wrote the basic code as part of his BSc project.
  • Kyle Tenn helped make the code into a Python/PyPI package.
  • Lars Arvestad oversaw and helped with creating the final Python/PyPI package.

Download files

Source Distribution

phyinc-1.4.tar.gz (29.4 kB view details)

Uploaded Source

Built Distribution

phyinc-1.4-py3-none-any.whl (25.0 kB view details)

Uploaded Python 3

File details

Details for the file phyinc-1.4.tar.gz.

File metadata

  • Download URL: phyinc-1.4.tar.gz
  • Upload date:
  • Size: 29.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.12

File hashes

Hashes for phyinc-1.4.tar.gz
Algorithm Hash digest
SHA256 ab2acb90e0b4dff561e0a6571c6b182184f82cac7967f49eb5e23a5002736b65
MD5 fdd91e0ad4cb107f4d7d432086f7e37f
BLAKE2b-256 151005872474706aac5211a3c18d79266499edb1d3f0860e4a1c43c5f0c9f6fe

File details

Details for the file phyinc-1.4-py3-none-any.whl.

File metadata

  • Download URL: phyinc-1.4-py3-none-any.whl
  • Upload date:
  • Size: 25.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.12

File hashes

Hashes for phyinc-1.4-py3-none-any.whl
Algorithm Hash digest
SHA256 ac7f4a2ca1c0e833bc480b0af997f3fcc56b0a290929cc78d48ed054702ac4ed
MD5 66c40e7717a0a9b12403bf75ecb15837
BLAKE2b-256 07ba35a2568c23cba39f79d466a02f3dc2ef9ad330ca1ef098ec5c68d632fd5d
