
EMPANADA: a tool for evidence-based assignment of genes to pathways in metagenomic data

EMPANADA Documentation

EMPANADA is a tool for evidence-based assignment of genes to pathways in metagenomic data,
developed and maintained by the Borenstein group at the University of Washington.


EMPANADA is available as a Python module from GitHub or PyPI (see the installation instructions below).


EMPANADA is distributed under a non-commercial license (see LICENSE).

Installation Instructions

Prerequisites for installing:

EMPANADA requires the following Python modules to be installed on your system:

- NumPy >= 1.6.1
- pandas >= 0.14

If you have *pip* installed, you can install these packages by running the following command:

``pip install -U numpy pandas``
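
To confirm that the installed versions satisfy these minimums, a check along the following lines may help. This is a minimal sketch and not part of EMPANADA: the helper names ``version_tuple``, ``meets_minimum`` and ``check_prerequisites`` are hypothetical, and the comparison only looks at the leading numeric components of each version string.

```python
import importlib
import re

# Minimum versions taken from the prerequisites listed above.
MINIMUM_VERSIONS = {"numpy": "1.6.1", "pandas": "0.14"}


def version_tuple(version):
    """Return the leading numeric components of a version string as ints."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare two version strings numerically, padding the shorter with zeros."""
    a, b = version_tuple(installed), version_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_prerequisites():
    """Return a dict mapping each required module to 'ok', 'too old' or 'missing'."""
    report = {}
    for name, minimum in MINIMUM_VERSIONS.items():
        try:
            module = importlib.import_module(name)
        except ImportError:
            report[name] = "missing"
            continue
        installed = getattr(module, "__version__", "0")
        report[name] = "ok" if meets_minimum(installed, minimum) else "too old"
    return report


if __name__ == "__main__":
    for name, status in check_prerequisites().items():
        print(name, status)
```

Any module reported as missing or too old can be installed or upgraded with the *pip* command above.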

**Installing EMPANADA:**

To install EMPANADA, download the package from

After downloading EMPANADA, you’ll need to unzip the file. If you’ve downloaded the release version, do this with the following command:

``tar -xzf empanada-0.0.1.tar.gz``

You’ll then change into the new EMPANADA directory as follows:

``cd empanada-0.0.1``

and install using the following command:

``python setup.py install``

Alternatively, you can install EMPANADA directly from PyPI by running:

``pip install -U empanada``

Testing the software package

After downloading and installing the software, we recommend testing it by running the following command:


This will invoke a series of tests. A correct output should end with:

``Ran 1 tests in X.XXXXs``


EMPANADA API via the command line
The EMPANADA module handles all calculations internally.
Its functionality is exposed on the command line through the run_empanada script.


`` -ko KO_ABUNDANCE_FILE -ko2path KO_TO_PATHWAY_FILE [options]``

Required arguments:

**-ko KO_ABUNDANCE_FILE**
Input KO abundance file to aggregate to pathway abundance

**-ko2path KO_TO_PATHWAY_FILE**
Input file of KO-to-pathway mapping

Optional arguments:

**-h, --help**
show help message and exit

**-o, --output**,
Output file for resulting pathway abundance (default:

**-oc, --output_counts**,
Output file for number of KOs mapped to each pathway (default:

**-om, --output_mapping**,
Output the mapping table (either given or generated) to file, works only with pooled mappings (default:

**-map {naive, by_support, by_sum_abundance, by_avg_abundance}, --mapping_method {naive, by_support, by_sum_abundance, by_avg_abundance}**
Method to map KOs to Pathway (default: naive)

**-compute {sum}, --compute_method {sum}**
Method to compute pathway abundance from mapped KOs (default: sum)

**-threshold, --abundance_threshold**
Abundance threshold to include KOs (default: 0.0)

**-fraction, --fractional_ko_contribution**
Divide KO contributions such that they sum to 1 for each KO (default: False)

Remove KOs with no pathway from analysis (default: False)

Remove KOs with no measurements in the abundance table from analysis (default: False)

**-transpose_ko, --transpose_ko_abundance**
Transpose the given KO abundance matrix (default: False)

**-transpose_output, --transpose_output**
Transpose the output pathway abundance matrix (default: False)

Permute the given KO mapping (i.e., which KOs map to which pathways) for hypothesis testing (default: False)

**-use_only_non_overlapping_genes**
If the mapping is by_abundance, compute pathway support by only using non-overlapping genes (default: False)

If the mapping is by_abundance, pool samples together using the median KO abundance, and learn the mapping only once (default: False)

If the mapping is by_abundance, pool samples together using the average KO abundance, and learn the mapping only once (default: False)

**-leave_one_ko_out_pathway_support**
If the mapping is by_abundance, compute pathway support for each KO separately by removing it from the computation (default: False)

If the mapping is by_abundance, double count KO abundance (weighted by mapping) when computing pathway support (default: False)

**-v, --verbose**
Increase verbosity of module (default: False)
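
To make the effect of the *naive* mapping, the *sum* compute method and the *-fraction* option concrete, here is a small worked sketch for a single sample. It is an illustration under stated assumptions and not EMPANADA's own code: the function ``aggregate_sample``, the KO identifiers and the pathway names are hypothetical, and a KO is assumed to be kept when its abundance is at least the threshold.

```python
def aggregate_sample(ko_abundance, ko_to_pathways, fraction=False, threshold=0.0):
    """Sum KO abundances into pathway abundances for one sample.

    ko_abundance:   dict mapping KO id -> abundance in this sample
    ko_to_pathways: dict mapping KO id -> list of pathway ids
    """
    pathway_abundance = {}
    for ko, abundance in ko_abundance.items():
        # Assumption: KOs below the threshold are excluded from the sum.
        if abundance < threshold:
            continue
        pathways = ko_to_pathways.get(ko, [])
        if not pathways:
            continue  # a KO with no pathway contributes nothing
        # Naive mapping: every pathway of a KO gets the same weight.
        # With fraction=True the weights of each KO sum to 1.
        weight = 1.0 / len(pathways) if fraction else 1.0
        for pathway in pathways:
            pathway_abundance[pathway] = (
                pathway_abundance.get(pathway, 0.0) + weight * abundance
            )
    return pathway_abundance


# Hypothetical toy data: K00001 belongs to two pathways, K00003 to none.
ko_abundance = {"K00001": 4.0, "K00002": 6.0, "K00003": 1.0}
ko_to_pathways = {"K00001": ["pathA", "pathB"], "K00002": ["pathB"]}

full = aggregate_sample(ko_abundance, ko_to_pathways)
split = aggregate_sample(ko_abundance, ko_to_pathways, fraction=True)
print(full)   # {'pathA': 4.0, 'pathB': 10.0}
print(split)  # {'pathA': 2.0, 'pathB': 8.0}
```

Without fractional contributions, K00001 is counted once in each of its two pathways, so the pathway total (14) exceeds the abundance of the mapped KOs (10). With fractional contributions the two totals agree.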


In the *empanada/examples* directory, the file ** contains simulated KO abundance measurements of 20 samples.
Using this file as input for EMPANADA results in the following files:


The command used is the following (via the command line):

`` -ko examples/ -ko2path data/ -o examples/ -threshold 0 -map by_avg_abundance -fraction -leave_one_ko_out_pathway_support -use_only_non_overlapping_genes``
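
When the same run needs to be launched from Python, for example inside a larger pipeline, the argument list can be assembled as sketched below. The function ``build_empanada_command`` is hypothetical, and the script and file paths are placeholders that must be replaced with the actual locations on your system; only the flags are taken from the command above.

```python
import shlex


def build_empanada_command(script, ko_file, ko2path_file, output_file):
    """Return the argument list for the example run shown above.

    All four parameters are caller-supplied paths (placeholders here).
    """
    return [
        script,
        "-ko", ko_file,
        "-ko2path", ko2path_file,
        "-o", output_file,
        "-threshold", "0",
        "-map", "by_avg_abundance",
        "-fraction",
        "-leave_one_ko_out_pathway_support",
        "-use_only_non_overlapping_genes",
    ]


# Placeholder paths, not real file names from the package.
command = build_empanada_command(
    "PATH_TO_SCRIPT", "KO_ABUNDANCE_FILE", "KO_TO_PATHWAY_FILE", "OUTPUT_FILE"
)

# Print a shell-quoted version for inspection. To execute it, the list could
# be passed to subprocess.run(command, check=True).
print(" ".join(shlex.quote(arg) for arg in command))
```

Passing the arguments as a list rather than a single string avoids shell quoting problems when file paths contain spaces.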

Citing Information

If you use the EMPANADA software, please cite the following paper:

Functional variability in the human microbiome: More than meets the eye
**Ohad Manor and Elhanan Borenstein.** *In preparation*


0.0.1 (9 February, 2016)
* Initial release of beta version


EMPANADA is written and maintained by Ohad Manor and the Borenstein group at the University of Washington.

EMPANADA Software License Agreement

EMPANADA (C) 2014-2016, University of Washington. All rights reserved.

Subject to the terms below, the University of Washington ("UW"), Professor Elhanan Borenstein, and Ohad Manor ("Developer(s)") give permission for you and other members of your laboratory for as long as they remain members ("Academic User(s)"), such permission granted solely to Academic Users in a nonprofit institution of higher education or a nonprofit research institution ("University"), to use EMPANADA solely as further detailed below. EMPANADA is a tool for evidence-based assignment of genes to pathways in metagenomic data. EMPANADA is protected by a copyright. The National Institutes of Health supported work on EMPANADA. The UW and the Developers allow Academic Users to perform, copy, and modify EMPANADA, solely for internal, non-profit academic research purposes, and as long as Academic Users comply with the terms of this EMPANADA Software License Agreement:

1. EMPANADA is not used for any commercial purposes, or as part of a system which has commercial purposes. The EMPANADA software remains at your University and is not published, distributed, or otherwise transferred or made available to other than Academic Users.

2. You may not distribute EMPANADA or any modification to EMPANADA to any third party.

If you wish to obtain EMPANADA software for any commercial purposes, you will need to contact the University of Washington to see if rights are available and to negotiate a commercial license and pay a fee among other requirements. This includes, but is not limited to, using EMPANADA to provide services to outside parties for a fee. In that case please contact:

UW CoMotion
University of Washington
4311 11th Ave. NE,
Suite 500 Seattle, WA 98105-4608
Phone: (206) 543-3970

3. You retain in EMPANADA and any modifications to EMPANADA, the copyright, trademark, patent or other notices pertaining to EMPANADA as provided by UW.

4. You acknowledge that the Developers, the UW and its licensees may develop modifications to EMPANADA that may be substantially similar to your modifications of EMPANADA, and that the Developers, UW and its licensees shall not be constrained in any way by you in UW's or its licensees' use or management of such modifications. You acknowledge the right of the Developers and UW to prepare and publish modifications to EMPANADA that may be substantially similar or functionally equivalent to your modifications and improvements, and if you obtain patent protection for any modification or improvement to EMPANADA you agree not to allege or enjoin infringement of your patent by the Developers, the UW or by any of UW's licensees obtaining modifications or improvements to EMPANADA from the University of Washington or the Developers.

5. If utilization of the EMPANADA software results in outcomes which will be published, you will specify the version of EMPANADA you used and cite the UW Developers.

6. Any risk associated with using the EMPANADA software at your organization is with you and your organization. EMPANADA is experimental in nature and is made available as a research courtesy "AS IS," expressly without any obligation by UW to provide accompanying services or support.


8. This Software License Agreement and all rights granted under it terminate on December 31st, 2020. Upon termination, you agree to remove so as to make unrecoverable the original EMPANADA software, all copies and all modifications thereof.
