phylopandas

Pandas for phylogenetics

# PhyloPandas #

**Bringing the [Pandas](https://github.com/pandas-dev/pandas) `DataFrame` to phylogenetics.**

PhyloPandas provides a Pandas-like interface for reading various sequence formats into DataFrames. This enables easy manipulation of phylogenetic data using familiar Python/Pandas functions. Finally, phylogenetics for humans!

<img src='docs/_images/jlab.png' align="middle">

## How does it work?

Don't worry, we didn't reinvent the wheel. **PhyloPandas** is simply a [DataFrame](https://github.com/pandas-dev/pandas)
(great for human-accessible data storage) interface on top of [Biopython](https://github.com/biopython/biopython) (great for parsing/writing sequence data).

When you import PhyloPandas, you import Pandas with a PhyloPandas flavor. That means the usual `read_` functions
are available (`read_csv`, `read_excel`, etc.), but the returned DataFrame includes extra `to_` methods (`to_fasta`, `to_phylip`, etc.).
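
To make the layering concrete, here is a minimal sketch of the read-then-write idea in plain Python with no dependencies. It is not PhyloPandas' implementation: the real package delegates parsing to Biopython and stores the rows in a DataFrame. The function names `parse_fasta` and `rows_to_phylip` are hypothetical, and the `id` key is an assumption made for illustration.

```python
import io


def parse_fasta(handle):
    """Parse FASTA text into a list of row dicts (hypothetical helper)."""
    rows = []
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                rows.append({"id": name, "sequence": "".join(chunks)})
            # Assumption: the id is the first whitespace-delimited token.
            tokens = line[1:].split()
            name, chunks = (tokens[0] if tokens else ""), []
        elif name is not None:
            # Sequence data may be wrapped over several lines.
            chunks.append(line)
    if name is not None:
        rows.append({"id": name, "sequence": "".join(chunks)})
    return rows


def rows_to_phylip(rows):
    """Render rows as sequential, relaxed PHYLIP text (hypothetical helper)."""
    lengths = {len(row["sequence"]) for row in rows}
    if len(lengths) != 1:
        raise ValueError("PHYLIP requires sequences of equal length")
    header = f"{len(rows)} {lengths.pop()}"
    body = [f"{row['id']} {row['sequence']}" for row in rows]
    return "\n".join([header] + body) + "\n"


fasta_text = ">seq1 first\nACGT\nAC\n>seq2\nACGTTT\n"
rows = parse_fasta(io.StringIO(fasta_text))
print(rows_to_phylip(rows))
```

Each row dict plays the role of one DataFrame row, which is why a table-like structure suits sequence data: one record per row, one attribute per column.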

## Basic Usage

1. Read any format:
```python
import phylopandas as pd

df1 = pd.read_fasta('sequences.fasta')
df2 = pd.read_phylip('sequences.phy')
```
2. Write any format:
```python
df1.to_clustal('sequences.clustal')
```
3. Convert formats:
```python
df = pd.read_fasta('sequences.fasta')
df.to_phylip('sequences.phy')
```
4. Merge two **ordered** sequence files (such as a raw sequence file and its alignment):
```python
# Read sequence file into dataframe
df = pd.read_fasta('sequences.fasta')

# Read alignment into dataframe
align = pd.read_fasta('alignment.fasta')

# Add alignment using standard pandas functions
# NOTE: this assumes the alignment and sequence
# file are ordered.
df = df.assign(alignment=align['sequence'])
```
5. Write out the alignment from the previous example:
```python
df.to_fasta('new_alignment.fasta', sequence_col='alignment')
```
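
The merge in step 4 relies on both files listing their sequences in the same order. When that cannot be guaranteed, matching records by identifier is safer. The sketch below shows the idea on plain lists of dicts rather than DataFrames; `attach_alignment` and the `id` key are hypothetical names, and with real DataFrames the same effect would typically come from a pandas `merge` on a shared identifier column.

```python
def attach_alignment(rows, aligned_rows):
    """Return copies of rows with an 'alignment' field, matched by 'id'."""
    aligned_by_id = {row["id"]: row["sequence"] for row in aligned_rows}
    missing = [row["id"] for row in rows if row["id"] not in aligned_by_id]
    if missing:
        raise KeyError(f"no aligned sequence for: {missing}")
    # dict(row, alignment=...) copies the row, so the inputs stay untouched.
    return [dict(row, alignment=aligned_by_id[row["id"]]) for row in rows]


raw = [{"id": "a", "sequence": "ACG"}, {"id": "b", "sequence": "ACGT"}]
# The alignment is deliberately listed in a different order.
aligned = [{"id": "b", "sequence": "ACGT"}, {"id": "a", "sequence": "ACG-"}]

merged = attach_alignment(raw, aligned)
for row in merged:
    print(row["id"], row["sequence"], row["alignment"])
```

A positional merge on this input would pair `a` with the aligned sequence for `b`; keying on the identifier avoids that and fails loudly when a record has no match.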

## Contributing

It's *easy* to create new read/write functions and methods for PhyloPandas. If you
have a format you'd like to add, please submit PRs! There are many more formats
in Biopython that I haven't had the time to add myself, so please don't be afraid
to add them! I thank you ahead of time!

## Testing

PhyloPandas includes a small pytest suite. Run the tests from the base directory.
```
$ cd phylopandas
$ pytest
```

## Install

Install from PyPI:
```
pip install phylopandas
```

Install from source:

```
git clone https://github.com/Zsailer/phylopandas
cd phylopandas
pip install -e .
```
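
After either install route, one way to confirm that the package is visible to your interpreter is to query its metadata. This is a small sketch using only the standard library (Python 3.8 or later); `installed_version` is a hypothetical helper name, not part of PhyloPandas.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("phylopandas")
print(version if version is not None else "phylopandas is not installed")
```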

## Dependencies

* [Biopython](https://github.com/biopython/biopython): Library for managing and manipulating biological data.
* [Pandas](https://github.com/pandas-dev/pandas): Flexible and powerful data analysis and manipulation library for Python.

Release history

* 0.8.0: Sep 26, 2019
* 0.7.4: Jul 3, 2019
* 0.7.3: May 9, 2019
* 0.7.2: Nov 6, 2018
* 0.7.1: Jul 11, 2018
* 0.6.0: Apr 25, 2018
* 0.5.0: Feb 9, 2018
* 0.4.0: Feb 3, 2018
* 0.1.4: Dec 25, 2017
* 0.1.3: Nov 2, 2017 (this version)
* 0.1.2: Nov 2, 2017
* 0.1.1: Oct 25, 2017
* 0.1: Oct 24, 2017

Download files

Source Distribution

phylopandas-0.1.3.tar.gz (5.1 kB)

Uploaded Nov 2, 2017 Source

Built Distribution

phylopandas-0.1.3-py2.py3-none-any.whl (7.8 kB)

Uploaded Nov 2, 2017 Python 2 Python 3

Hashes for phylopandas-0.1.3.tar.gz
Algorithm	Hash digest
SHA256	`1378d8112d7a903e621edeb84dc5ce5a7591ca760caedcc8eb801ffa5fd29b5a`
MD5	`fa16c707c23be054d1204b42dafe5840`
BLAKE2b-256	`83847ae6b85a786fb5bf36cff853ee0d5fe8a6eedc478c110dab588b71d30d61`

Hashes for phylopandas-0.1.3-py2.py3-none-any.whl
Algorithm	Hash digest
SHA256	`183c4865a34215b51ddff9f9734fd28b00afda48a35abb57930cd33457096a1f`
MD5	`92a070206974d53ab15a17f3427f6b2a`
BLAKE2b-256	`531232de60bd4c0dac0c4c35033cc5e08355a3d84cfcfe536fe4a79412f66652`