
Pandas for phylogenetics

Project description


Bringing the Pandas DataFrame to phylogenetics.

PhyloPandas provides a Pandas-like interface for reading sequence and phylogenetic tree data into pandas DataFrames. This enables easy manipulation of phylogenetic data using familiar Python/Pandas functions. Finally, phylogenetics for humans!

How does it work?

Don't worry, we didn't reinvent the wheel. PhyloPandas is simply a DataFrame (great for human-accessible data storage) interface on top of Biopython (great for parsing/writing sequence data) and DendroPy (great for reading tree data).

PhyloPandas does two things:

  1. It offers new read functions to read sequence/tree data directly into a DataFrame.
  2. It attaches a new phylo accessor to the Pandas DataFrame. This accessor provides methods for writing sequence/tree data (powered by Biopython and DendroPy).
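
The accessor idea in point 2 can be sketched with pandas alone. This is only an illustration of the mechanism, not PhyloPandas' actual code: it uses pandas' built-in `register_dataframe_accessor` rather than pandas_flavor, and the accessor name `demo_phylo`, the method `to_fasta_string`, and the column names `id` and `sequence` are all assumptions made for the example.

```python
import pandas as pd


# "demo_phylo" is a made-up accessor name, chosen so it cannot clash with
# the real "phylo" accessor that PhyloPandas registers.
@pd.api.extensions.register_dataframe_accessor("demo_phylo")
class DemoPhyloAccessor:
    def __init__(self, pandas_obj):
        # pandas passes in the DataFrame the accessor is attached to.
        self._df = pandas_obj

    def to_fasta_string(self, id_col="id", sequence_col="sequence"):
        """Render each row as a FASTA record (column names are assumptions)."""
        records = [
            f">{row[id_col]}\n{row[sequence_col]}"
            for _, row in self._df.iterrows()
        ]
        return "\n".join(records) + "\n"


df = pd.DataFrame({"id": ["seq1", "seq2"], "sequence": ["ACGT", "ACGA"]})
fasta_text = df.demo_phylo.to_fasta_string()
print(fasta_text)
```

Once the class is registered, every DataFrame gains the `demo_phylo` namespace, which is the same pattern that lets calls such as `df.phylo.to_clustal(...)` hang off an ordinary DataFrame.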

Basic Usage

Sequence data:

Read in a sequence file.

import phylopandas as ph

df1 = ph.read_fasta('sequences.fasta')
df2 = ph.read_phylip('sequences.phy')
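
Because the result is an ordinary DataFrame, the usual pandas operations apply to it. The sketch below uses a hand-built frame as a stand-in for the output of a read function, so it runs without any sequence file; the column names `id` and `sequence` and the sequences themselves are assumptions made for illustration.

```python
import pandas as pd

# Hand-built stand-in for the result of a read function such as
# ph.read_fasta(...); column names and data are assumptions.
df = pd.DataFrame(
    {
        "id": ["seq1", "seq2", "seq3"],
        "sequence": ["ACGTAC", "ACG", "ACGTTT"],
    }
)

# Add a derived column, then keep only sequences of at least five residues.
df["length"] = df["sequence"].str.len()
long_seqs = df[df["length"] >= 5].reset_index(drop=True)
print(long_seqs)
```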

Write to various sequence file formats.

df1.phylo.to_clustal('sequences.clustal')

Convert between formats.

# Read a format.
df = ph.read_fasta('sequences.fasta')

# Write to a different format.
df.phylo.to_phylip('sequences.phy')

Tree data:

Read Newick tree data.

df = ph.read_newick('tree.newick')
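
To try the call above without a tree on hand, a small Newick file can be written first. The four-taxon tree below is made up for illustration, and it is written to a temporary directory so that no existing file is overwritten.

```python
import tempfile
from pathlib import Path

# A made-up four-taxon tree in Newick format: nested parentheses group
# sister taxa, "name:length" gives a branch length, and ";" ends the tree.
newick_text = "((A:0.1,B:0.2):0.3,(C:0.4,D:0.5):0.6);\n"

# Write the tree into a fresh temporary directory.
tree_path = Path(tempfile.mkdtemp()) / "tree.newick"
tree_path.write_text(newick_text)

# Basic sanity checks: balanced parentheses and a terminating semicolon.
assert newick_text.count("(") == newick_text.count(")")
assert newick_text.strip().endswith(";")
print(tree_path)
```

The resulting path can then be passed to the read function, for example `ph.read_newick(str(tree_path))`.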

Visualize the phylogenetic data (powered by phylovega).

df.phylo.display(
    height=500,
)

Contributing

If you have ideas for the project, please share them on the project's Gitter chat.

It's easy to create new read/write functions and methods for PhyloPandas. If there is a format you'd like to add, please submit a PR! Biopython supports many more formats that I haven't had time to add myself, so don't be afraid to add them. Thank you in advance!
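
As a rough picture of what a new read function involves, the sketch below parses FASTA text into a DataFrame. It is not how PhyloPandas implements its readers: a hand-rolled parser stands in for Biopython's parsers so that the example only needs pandas, and the function name `read_simple_fasta` and the output columns `id` and `sequence` are hypothetical.

```python
import io

import pandas as pd


def read_simple_fasta(handle):
    """Parse FASTA text from a file-like object into a DataFrame.

    Hypothetical helper: a hand-rolled parser standing in for Biopython.
    """
    rows = []
    current_id, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if current_id is not None:
                rows.append({"id": current_id, "sequence": "".join(chunks)})
            header = line[1:].split()
            current_id, chunks = (header[0] if header else ""), []
        else:
            # Sequence data may span several lines; collect the pieces.
            chunks.append(line)
    if current_id is not None:
        rows.append({"id": current_id, "sequence": "".join(chunks)})
    return pd.DataFrame(rows, columns=["id", "sequence"])


demo = io.StringIO(">seq1 first\nACGT\nAC\n>seq2\nGGTA\n")
df = read_simple_fasta(demo)
print(df)
```

A real contribution would delegate the parsing to Biopython and follow the conventions of the existing read functions in the repository.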

Testing

PhyloPandas includes a small pytest suite. Run the tests from the base directory.

$ cd phylopandas
$ pytest

Install

Install from PyPI:

pip install phylopandas

Install from source:

git clone https://github.com/Zsailer/phylopandas
cd phylopandas
pip install -e .

Dependencies

  • Biopython: Library for managing and manipulating biological data.
  • DendroPy: Library for phylogenetic scripting, simulation, data processing and manipulation.
  • Pandas: Flexible and powerful data analysis / manipulation library for Python.
  • pandas_flavor: Flavor pandas objects with new accessors using pandas' new register API (with backwards compatibility).
