
Sequence Elements Enrichment Analysis (SEER), python implementation

[SEER](https://github.com/johnlees/seer), reimplemented in python by [Marco Galardini](https://github.com/mgalardini) and [John Lees](https://github.com/johnlees)

pyseer --phenotypes phenotypes.tsv --kmers kmers.gz --distances structure.tsv --min-af 0.01 --max-af 0.99 --cpu 15 --filter-pvalue 1E-8
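
For scripted runs, the same invocation can be assembled from Python. The following is a minimal sketch, not part of pyseer itself: `build_pyseer_command` is a hypothetical helper, its default values are illustrative assumptions, and only the options shown in the example command are used.

```python
import shlex


def build_pyseer_command(phenotypes, kmers, distances,
                         min_af=0.01, max_af=0.99, cpu=1, filter_pvalue=None):
    """Return a pyseer invocation as an argument list (hypothetical helper)."""
    cmd = [
        "pyseer",
        "--phenotypes", str(phenotypes),
        "--kmers", str(kmers),
        "--distances", str(distances),
        "--min-af", str(min_af),
        "--max-af", str(max_af),
        "--cpu", str(cpu),
    ]
    # The p-value filter is only added when explicitly requested.
    if filter_pvalue is not None:
        cmd += ["--filter-pvalue", str(filter_pvalue)]
    return cmd


cmd = build_pyseer_command("phenotypes.tsv", "kmers.gz", "structure.tsv",
                           cpu=15, filter_pvalue="1E-8")
# Print the command as a single shell-quoted string for inspection.
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once pyseer is installed; building a list rather than a single string avoids shell quoting problems with file paths.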

[![Build Status](https://travis-ci.org/mgalardini/pyseer.svg?branch=master)](https://travis-ci.org/mgalardini/pyseer) [![Documentation Status](https://readthedocs.org/projects/pyseer/badge/?version=master)](http://pyseer.readthedocs.io/) [![PyPI version](https://badge.fury.io/py/pyseer.svg)](https://pypi.org/project/pyseer/) [![Anaconda package](https://anaconda.org/bioconda/pyseer/badges/version.svg)](https://anaconda.org/bioconda/pyseer)

Motivation

K-mer-based GWAS is particularly well suited to bacterial samples, given their high genetic variability. This approach was implemented by [Lees, Vehkala et al.](https://www.nature.com/articles/ncomms12797) in the [SEER](https://github.com/johnlees/seer) software.

The reimplementation presented here should be consistent with the current version of the C++ seer (though we do not guarantee this for all possible cases).

This version provides all the original features and adds many new ones (input types, association models and output parsing). See the [documentation](http://pyseer.readthedocs.io/) and [paper](https://doi.org/10.1093/bioinformatics/bty539) for full details.

Citation

Lees, John A., Galardini, M., et al. "pyseer: a comprehensive tool for microbial pangenome-wide association studies". Bioinformatics 34:4310–4312 (2018).

doi: 10.1093/bioinformatics/bty539

Lees, John A., et al. "Sequence element enrichment analysis to determine the genetic basis of bacterial phenotypes." Nature communications 7:12797 (2016).

doi: 10.1038/ncomms12797

Documentation

Full documentation is available at [readthedocs](http://pyseer.readthedocs.io/).

You can also build the docs locally (requires sphinx) by typing:

cd docs/
make html

Prerequisites

The versions the scripts were tested against are given in parentheses:

  • python 3+ (3.6.6)
  • numpy (1.15.2)
  • scipy (1.1.0)
  • pandas (0.23.4)
  • scikit-learn (0.20.0)
  • statsmodels (0.9.0)
  • pysam (0.15.1)
  • glmnet_python (commit 946b65c)
  • DendroPy (4.4.0)

If you would like to use the scree_plot_pyseer script you will also need to have matplotlib installed. If you would like to use the scripts to map and annotate kmers, you will also need bwa, bedtools, bedops and pybedtools installed.
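
To compare an existing environment with the tested versions listed above, a short check along the following lines may help. This is a sketch under stated assumptions: it requires Python 3.8+ for `importlib.metadata`, it assumes each package is installed under the distribution name used in the list, and `installed_versions` is a hypothetical helper. `glmnet_python` is left out because it is pinned to a commit rather than a release.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions pyseer was tested against, copied from the list above.
TESTED_VERSIONS = {
    "numpy": "1.15.2",
    "scipy": "1.1.0",
    "pandas": "0.23.4",
    "scikit-learn": "0.20.0",
    "statsmodels": "0.9.0",
    "pysam": "0.15.1",
    "DendroPy": "4.4.0",
}


def installed_versions(packages):
    """Map each package name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, installed in installed_versions(TESTED_VERSIONS).items():
    print(f"{name}: installed={installed or 'missing'}, "
          f"tested={TESTED_VERSIONS[name]}")
```

A newer installed version is not necessarily a problem; the listed versions are simply the ones the scripts were tested against.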

Installation

The easiest way to install pyseer and its dependencies is through conda:

conda install pyseer

If you need conda, download [miniconda](https://conda.io/miniconda.html) and add the necessary channels:

conda config --add channels defaults
conda config --add channels bioconda
conda config --add channels conda-forge

pyseer can also be installed through pip:

python -m pip install pyseer

If you want multithreading, make sure that you are using a Python 3 interpreter:

python3 -m pip install pyseer

If you want the next pre-release, just clone/download this repository and run:

python pyseer-runner.py
