Python package for fCAT, a feature-aware completeness assessment tool

fCAT

License: GPL v3

How to install

The fCAT tool is distributed as a Python package called fcat. It is compatible with Python ≥ 3.7.

You can install fcat using pip:

python3 -m pip install fcat

or, if you do not have admin rights and do not use an environment manager such as Anaconda, use the --user option:

python3 -m pip install --user fcat

and then add the following line to the end of your ~/.bashrc or ~/.bash_profile file. Restart the current terminal (or run source ~/.bashrc) to apply the change:

export PATH=$HOME/.local/bin:$PATH
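To confirm that the PATH change took effect, you can check whether the fcat executable is visible from Python. This is a minimal sketch using only the standard library; the helper name find_fcat is an illustration and is not part of the fcat package.

```python
import shutil


def find_fcat(executable="fcat"):
    """Return the full path of the executable, or None if it is not on PATH.

    find_fcat is a hypothetical helper for this example, not an fcat API.
    """
    return shutil.which(executable)


location = find_fcat()
if location is None:
    print("fcat was not found on PATH; check the export line in ~/.bashrc")
else:
    print("fcat found at", location)
```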

Usage

The complete fCAT workflow can be run with a single command, fcat:

fcat --coreDir /path/to/fcat_data --coreSet eukaryota --refspecList "HOMSA@9606@2" --querySpecies /path/to/query.fa [--annoQuery /path/to/query.json] [--outDir /path/to/fcat/output]

where eukaryota is the name of the fCAT core set (equivalent to a BUSCO set); HOMSA@9606@2 is the reference species from that core set that will be used for the ortholog search; and query is the name of the species of interest. If --annoQuery is not specified, fCAT will do the feature annotation for the query proteins using the FAS tool.
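If you want to launch fcat from a script, the command line above can be assembled as an argument list. The sketch below uses only the options shown in the usage line; the function name build_fcat_command and its parameter names are assumptions made for this example and are not part of the fcat package.

```python
import shlex


def build_fcat_command(core_dir, core_set, refspec, query_fasta,
                       anno_query=None, out_dir=None):
    """Assemble the fcat command line as a list of arguments.

    Hypothetical helper: only the flags from the usage line above are used.
    """
    cmd = [
        "fcat",
        "--coreDir", core_dir,
        "--coreSet", core_set,
        "--refspecList", refspec,
        "--querySpecies", query_fasta,
    ]
    # The two optional arguments are appended only when given.
    if anno_query is not None:
        cmd += ["--annoQuery", anno_query]
    if out_dir is not None:
        cmd += ["--outDir", out_dir]
    return cmd


cmd = build_fcat_command(
    "/path/to/fcat_data", "eukaryota", "HOMSA@9606@2", "/path/to/query.fa"
)
# Print a shell-safe version of the command for inspection.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) once fcat is installed and the paths point to real data.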

Output

You will find the output in the /path/to/fcat/output/fcatOutput/eukaryota/query/ folder, where /path/to/fcat/output/ is your current directory if you did not specify --outDir when running fcat. The most important output files and folders are:

  • all_summary.txt: summary of the completeness assessment using all 4 score modes
  • all_full.txt: the complete assessment for all 4 score modes, as a tab-delimited file
  • fdogOutput.tar.gz: a compressed archive of the ortholog search results
  • mode_1, mode_2, mode_3 and mode_4: detailed output for each score mode
  • phyloprofileOutput: folder containing phylogenetic profile data that can be used with the PhyloProfile tool

In addition, if you have already run fCAT for several query taxa with the same fCAT core set, you can find the merged phylogenetic profiles for all of those taxa within the corresponding core set output (e.g. /path/to/fcat/output/fcatOutput/eukaryota/*.phyloprofile).
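The output layout described above can be checked from a script. The following sketch builds the result folder path from the output directory, core set name and query name, reports which of the listed files and folders are missing, and collects the merged *.phyloprofile files. All function names here are hypothetical helpers for illustration; only the folder layout and file names come from the description above.

```python
from pathlib import Path

# File and folder names taken from the output list above.
EXPECTED_OUTPUTS = [
    "all_summary.txt",
    "all_full.txt",
    "fdogOutput.tar.gz",
    "mode_1",
    "mode_2",
    "mode_3",
    "mode_4",
    "phyloprofileOutput",
]


def fcat_output_dir(out_dir, core_set, query_name):
    """Return <out_dir>/fcatOutput/<core_set>/<query_name> as a Path."""
    return Path(out_dir) / "fcatOutput" / core_set / query_name


def missing_outputs(result_dir):
    """List the expected files or folders that do not exist in result_dir."""
    return [name for name in EXPECTED_OUTPUTS
            if not (Path(result_dir) / name).exists()]


def merged_profiles(out_dir, core_set):
    """Return the merged *.phyloprofile files of a core set, sorted by name."""
    return sorted((Path(out_dir) / "fcatOutput" / core_set).glob("*.phyloprofile"))


result_dir = fcat_output_dir("/path/to/fcat/output", "eukaryota", "query")
print(result_dir)
print("missing:", missing_outputs(result_dir))
```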

fCAT score modes

The table below explains how the specific ortholog group cutoffs for each fCAT core set were calculated, and which value of the query ortholog is used to assess its completeness, or more precisely, its functional equivalence to the ortholog group it belongs to. If the value of a query ortholog is not less than its ortholog group cutoff, that group is evaluated as similar or complete. If co-orthologs have been predicted, the core group is assessed as duplicated. Depending on the value of each individual ortholog, a duplicated group is reported as duplicated (similar) or duplicated (dissimilar) in the full report (e.g. all_full.txt).
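The decision rule in the paragraph above can be sketched as a small function. This is an illustration under stated assumptions, not fCAT's actual implementation: the function name assess_group is hypothetical, and the label "dissimilar" for a single ortholog below the cutoff is inferred from the duplicated (dissimilar) label rather than stated explicitly.

```python
def assess_group(ortholog_values, cutoff):
    """Label each query ortholog of one core group against the group cutoff.

    ortholog_values: one value per query ortholog found for this group.
    A value that is not less than the cutoff counts as "similar".
    When more than one ortholog is present (co-orthologs), every label is
    wrapped as "duplicated (...)". Hypothetical helper for illustration.
    """
    labels = ["similar" if value >= cutoff else "dissimilar"
              for value in ortholog_values]
    if len(labels) > 1:
        labels = [f"duplicated ({label})" for label in labels]
    return labels


print(assess_group([0.8], cutoff=0.7))
print(assess_group([0.9, 0.5], cutoff=0.7))
```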

| Score mode | Cutoff | Value |
|---|---|---|
| Mode 1 | Mean of FAS scores between all core orthologs | Mean of FAS scores between query ortholog and all core proteins |
| Mode 2 | Mean of FAS scores between refspec and all other core orthologs | Mean of FAS scores between query ortholog and refspec protein |
| Mode 3 | The lower bound of the confidence interval calculated by the distribution of all-vs-all FAS score in a core group | Mean of FAS scores between query ortholog and refspec protein |
| Mode 4 | Mean and standard deviation of all core protein lengths | Length of query ortholog |

Note: FAS scores are bidirectional FAS scores; a core protein or core ortholog is a protein in the core ortholog groups; a query protein or query ortholog is an ortholog protein of the query species; refspec is the specified reference species.
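As a rough illustration of the cutoff column, the sketch below computes the mode 1, mode 2 and mode 4 cutoffs for a single core group from toy data. Everything here is an assumption made for the example: the function names, the dictionary layout (one bidirectional FAS score per unordered pair of species), the placeholder species SPECIES_B and SPECIES_C, the scores and lengths, and the use of the sample standard deviation. Mode 3 is left out because the confidence interval method is not described here.

```python
from statistics import mean, stdev


def mode1_cutoff(pair_scores):
    """Mean of the FAS scores between all core orthologs of a group."""
    return mean(pair_scores.values())


def mode2_cutoff(pair_scores, refspec):
    """Mean of the FAS scores between refspec and all other core orthologs."""
    return mean(score for pair, score in pair_scores.items() if refspec in pair)


def mode4_cutoff(lengths):
    """Mean and standard deviation of the core protein lengths.

    Assumption: sample standard deviation (statistics.stdev); the source
    does not say which variant fCAT uses.
    """
    return mean(lengths), stdev(lengths)


# Toy data: hypothetical scores for one core group with three species.
refspec = "HOMSA@9606@2"
pair_scores = {
    frozenset({refspec, "SPECIES_B"}): 1.0,
    frozenset({refspec, "SPECIES_C"}): 0.5,
    frozenset({"SPECIES_B", "SPECIES_C"}): 0.25,
}
core_lengths = [100, 200, 300]  # hypothetical protein lengths

print(mode1_cutoff(pair_scores))
print(mode2_cutoff(pair_scores, refspec))
print(mode4_cutoff(core_lengths))
```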

Bugs

Bug reports, comments and suggestions are highly appreciated. Please open an issue on GitHub or get in touch via email.

Contact

For further support or bug reports please contact: tran@bio.uni-frankfurt.de
