Make sequence logos using Felsenstein's phylogenetically independent contrasts method to take evolution into account
PhyInC: Phylogenetically Independent Contrasts Sequence Logo
PhyInC (pronounced "Fink" for short) is a tool that takes a FASTA file and a Newick tree file and creates a phylogenetically conditioned sequence logo, using Felsenstein's phylogenetically independent contrasts (PIC).
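For readers unfamiliar with PIC, the following is a minimal sketch of the general recursion for a single numeric trait on a small binary tree. It is not PhyInC's code, and how PhyInC applies contrasts to alignment columns is not described here. The nested-tuple tree encoding, the function name `contrasts`, and the example values are assumptions made for illustration; branch lengths are assumed to be positive.

```python
import math

def contrasts(node):
    """Return (estimated value, extra branch length, list of contrasts).

    A leaf is a number; an internal node is a pair of (child, branch_length)
    tuples. This encoding is hypothetical, chosen to keep the sketch short.
    """
    if isinstance(node, (int, float)):
        return float(node), 0.0, []
    (left, v_left), (right, v_right) = node
    x_left, extra_left, c_left = contrasts(left)
    x_right, extra_right, c_right = contrasts(right)
    # Lengthen each branch to reflect uncertainty in an estimated child value.
    v_left += extra_left
    v_right += extra_right
    # Standardized contrast between the two children.
    contrast = (x_left - x_right) / math.sqrt(v_left + v_right)
    # Weighted average: the child on the shorter branch counts more.
    value = (x_left / v_left + x_right / v_right) / (1 / v_left + 1 / v_right)
    extra = v_left * v_right / (v_left + v_right)
    return value, extra, c_left + c_right + [contrast]

# Two sister leaves (values 1 and 3) joined to a third leaf (value 5).
cherry = ((1.0, 1.0), (3.0, 1.0))
tree = ((cherry, 1.0), (5.0, 1.5))
root_value, _, pics = contrasts(tree)
```

A tree with n leaves yields n - 1 contrasts, one per internal node, so `pics` holds two values here.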
Install from PyPI.org
pip3 install phyinc
It is always a good idea to use a virtual environment; see below.
Install from a downloaded GitHub repository
Start a virtual environment for the dependencies, as good practice:
> python3 -m venv venv
> source venv/bin/activate
> pip3 install biopython weblogo matplotlib numpy
If you are on macOS, also install Ghostscript:
> brew install ghostscript
To install the current package:
> pip3 install .
Usage
The basic usage is as follows.
> phyinc treefile fastafile
Since no outfile is given, the logo is output to fastafile.fa_logo.pdf.
You can choose the output file and format with the -o option:
> phyinc -o the_logo.png treefile fastafile
Examples
Example data is included in the GitHub repository, where you can run this command:
> phyinc examples/synthetic_data/ex1_t1.tree examples/synthetic_data/ex1.fa
This should create a PDF named "ex1.fa_seqlogo.pdf" in the examples folder.
Sometimes a phylogeny is inferred on whole proteins while the logo is
created for domains. The protein accession may then be something like
ETA_STAAU while the domain accession is ETA_STAAU/96-110. In that case, use
the --coords option to ignore the domain coordinates when mapping domains
to the tree.
> phyinc --coords examples/PF000672.treefile examples/PF000672.fa
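The mapping that --coords implies can be sketched as follows. This is not PhyInC's implementation; the helper name `strip_coords`, the second accession, and its coordinates are made up for illustration, and the sketch assumes coordinates always have the form /start-end.

```python
def strip_coords(accession: str) -> str:
    """Drop a trailing '/start-end' range, if there is one."""
    name, sep, coords = accession.rpartition("/")
    start, dash, end = coords.partition("-")
    if sep and dash and start.isdigit() and end.isdigit():
        return name
    return accession  # no coordinate suffix: leave the accession unchanged

# Leaf names as they might appear in a tree inferred on whole proteins.
leaf_names = {"ETA_STAAU", "EXAMPLE_PROT"}
# Domain accessions as they might appear in the FASTA file.
domain_ids = ["ETA_STAAU/96-110", "EXAMPLE_PROT/12-26"]

mapping = {domain: strip_coords(domain) for domain in domain_ids}
unmatched = [d for d, protein in mapping.items() if protein not in leaf_names]
```

An empty `unmatched` list means every domain sequence found a leaf in the tree.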
However, phyinc will report an error if there is more than one domain per protein:
> phyinc --coords PS00027.treefile PS00027.fa
Error: 'ZFH2_DROME' is a protein appearing twice, probably because you have two
domains from the same protein in the input. If so, you must submit a tree inferred
on the domain sequences, not on the proteins.
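To check an alignment for this situation before running phyinc, something like the sketch below may help. It is an assumption about how such a check could look, not the check phyinc itself performs; the function name and the coordinates in the example are hypothetical.

```python
from collections import Counter

def proteins_with_several_domains(domain_ids):
    """Return protein accessions that occur more than once once the
    '/start-end' suffix is cut off at the last slash."""
    counts = Counter(d.rsplit("/", 1)[0] for d in domain_ids)
    return sorted(protein for protein, n in counts.items() if n > 1)

# Hypothetical domain accessions; the coordinates are invented.
ids = ["ZFH2_DROME/10-30", "ZFH2_DROME/50-70", "ETA_STAAU/96-110"]
repeated = proteins_with_several_domains(ids)
```

If `repeated` is non-empty, the tree needs to be inferred on the domain sequences rather than on the proteins, as the error message says.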
License
GPLv3
Authors
- Haolin Guo wrote the basic code as part of his BSc project.
- Kyle Tenn helped make the code into a Python/PyPI package.
- Lars Arvestad oversaw and helped with creating the final Python/PyPI package.