Make sequence logos using Felsenstein's phylogenetically independent contrasts method to take evolution into account

PhyInC: Phylogenetically Independent Contrasts Sequence Logo

PhyInC (pronounced "Fink" for short) is a tool that takes a Fasta file and a Newick tree file and creates a phylogenetically conditioned sequence logo, using Felsenstein's phylogenetically independent contrasts (PIC).
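For readers unfamiliar with PIC, the following is a minimal sketch of Felsenstein's classic contrast computation for a single continuous trait on a rooted binary tree. It is not PhyInC's code, and how PhyInC applies contrasts to alignment columns is not shown here. The nested-tuple tree encoding, the function name `contrasts`, and the trait values are assumptions made for illustration only.

```python
import math


def contrasts(node, values):
    """Return (trait value, added branch length, contrasts) for a subtree.

    A leaf is a name (str). An internal node is a pair
    ((left_child, left_branch_length), (right_child, right_branch_length)).
    This encoding is hypothetical and chosen only to keep the sketch short.
    """
    if isinstance(node, str):
        return values[node], 0.0, []

    (left, v_left), (right, v_right) = node
    x_l, extra_l, c_l = contrasts(left, values)
    x_r, extra_r, c_r = contrasts(right, values)

    # Branches leading to internal nodes are lengthened to account for
    # the uncertainty in the reconstructed value at that node.
    v_l = v_left + extra_l
    v_r = v_right + extra_r

    # Standardized contrast between the two children.
    contrast = (x_l - x_r) / math.sqrt(v_l + v_r)

    # Value at this node: average weighted by inverse branch length.
    x_node = (x_l / v_l + x_r / v_r) / (1.0 / v_l + 1.0 / v_r)

    # Extra length to add to the branch above this node.
    extra = v_l * v_r / (v_l + v_r)
    return x_node, extra, c_l + c_r + [contrast]


# Made-up example corresponding to the Newick tree ((A:1,B:1):1,C:2);
ab = (("A", 1.0), ("B", 1.0))
tree = ((ab, 1.0), ("C", 2.0))
trait = {"A": 1.0, "B": 3.0, "C": 5.0}

root_value, _, pics = contrasts(tree, trait)
print(pics)  # one contrast per internal node, so two values for three leaves
```

A tree with n leaves yields n - 1 contrasts, which can be treated as independent observations even though the leaf values themselves are correlated through shared ancestry.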

Install from PyPI.org

pip3 install phyinc

It is always a good idea to use virtual environments, see below.

Install from a downloaded GitHub repository

To install the dependencies, first create a virtual environment, as is good practice:

> python3 -m venv venv
> source venv/bin/activate
> pip3 install biopython weblogo matplotlib numpy

If you are on macOS, also install Ghostscript:

> brew install ghostscript

To install the current package:

> pip3 install .

Usage

The basic usage is as follows.

> phyinc treefile fastafile

Since no output file is given, the logo is written to fastafile.fa_logo.pdf. You can choose the output file and format with the -o option:

> phyinc -o the_logo.png treefile fastafile

Examples

There is example data in the GitHub repository, where you can run this command:

> phyinc examples/synthetic_data/ex1_t1.tree examples/synthetic_data/ex1.fa

This should create a PDF named "ex1.fa_seqlogo.pdf" in the examples folder.

Sometimes a phylogeny is inferred on whole proteins while the logo is created for domains. The protein accession may then be something like ETA_STAAU, while the domain accession is ETA_STAAU/96-110. In that case, use the --coords option to ignore the domain coordinates when mapping the domains to the tree.

> phyinc --coords examples/PF000672.treefile examples/PF000672.fa
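The mapping that --coords performs can be pictured with a short sketch. It assumes that coordinates always appear as a trailing "/start-end" suffix, as in the ETA_STAAU/96-110 example. The function name `strip_coords` and the regular expression are illustrative assumptions, not PhyInC's actual implementation.

```python
import re

# Assumed format: a trailing "/<start>-<end>" suffix on the accession.
_COORDS = re.compile(r"/\d+-\d+$")


def strip_coords(accession):
    """Return the accession without a trailing domain-coordinate suffix."""
    return _COORDS.sub("", accession)


print(strip_coords("ETA_STAAU/96-110"))  # ETA_STAAU
print(strip_coords("ETA_STAAU"))         # unchanged: ETA_STAAU
```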

However, phyinc will report an error if there is more than one domain per protein:

> phyinc --coords PS00027.treefile PS00027.fa
Error: 'ZFH2_DROME' is a protein appearing twice, probably because you have two 
domains from the same protein in the input. If so, you must submit a tree inferred
on the domain sequences, not on the proteins.
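To check an input file for this situation before running phyinc, something like the following sketch could be used. It assumes the same "/start-end" suffix convention as above. The helper name `repeated_proteins` and the coordinates in the example are made up for illustration and are not part of PhyInC.

```python
import re
from collections import Counter


def repeated_proteins(domain_ids):
    """Return protein accessions that occur more than once after
    removing an assumed trailing "/start-end" coordinate suffix."""
    proteins = [re.sub(r"/\d+-\d+$", "", d) for d in domain_ids]
    counts = Counter(proteins)
    return sorted(p for p, n in counts.items() if n > 1)


# Hypothetical sequence identifiers; the coordinates are invented.
ids = ["ZFH2_DROME/100-120", "ZFH2_DROME/300-320", "ETA_STAAU/96-110"]
print(repeated_proteins(ids))  # ['ZFH2_DROME']
```

If the returned list is non-empty, the tree needs to be inferred on the domain sequences rather than on the proteins, as the error message says.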

License

GPLv3

Authors

  • Haolin Guo wrote the basic code as part of his BSc project.
  • Kyle Tenn helped make the code into a Python/PyPI package.
  • Lars Arvestad oversaw and helped with creating the final Python/PyPI package.
