Feature-aware Directed OrtholoG search tool

These details have not been verified by PyPI

Project links

Homepage

Environment
- Console
Intended Audience
- End Users/Desktop
License
- OSI Approved :: GNU General Public License v3 or later (GPLv3+)
Natural Language
- English
Programming Language
- Python :: 3

Project description

fDOG - Feature-aware Directed OrtholoG search

How to install
- Install the fDOG package
- Setup fDOG
Usage
fDOG data set
- Adding a new gene set into fDOG
- Adding a list of gene sets into fDOG
Bugs
How to cite
Contributors
Contact

How to install

The fDOG tool is distributed as a Python package called fdog. It is compatible with Python ≥ 3.7.

Install the fDOG package

You can install fdog using pip:

python3 -m pip install fdog

or, if you do not have admin rights and do not use a package system such as Anaconda to manage environments, use the --user option:

python3 -m pip install --user fdog

Then add the following line to the end of your ~/.bashrc or ~/.bash_profile file and restart the terminal (or run source ~/.bashrc) to apply the change:

export PATH=$HOME/.local/bin:$PATH
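To confirm that the change took effect, a small check along the following lines may help. It is only a sketch: the helper name user_bin_on_path is made up for illustration and is not part of fdog.

```python
import os
from pathlib import Path


def user_bin_on_path(path_value: str, home: str) -> bool:
    """Return True if <home>/.local/bin is one of the entries of a PATH string.

    Hypothetical helper for illustration; it is not provided by fdog.
    """
    target = Path(home) / ".local" / "bin"
    entries = [Path(entry) for entry in path_value.split(os.pathsep) if entry]
    return target in entries


if __name__ == "__main__":
    # Inspect the PATH of the current process.
    print(user_bin_on_path(os.environ.get("PATH", ""), str(Path.home())))
```

If this prints False in a new terminal, the export line was probably added to a file that your shell does not read at start-up.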

Setup fDOG

After installing fdog, you need to set it up to get its dependencies and pre-calculated data.

You can do this by running the following command:

fdog.setup -o /output/path/for/fdog/data

or, if you are using Anaconda:

fdog.setup -o /output/path/for/fdog/data --conda

Have your sudo password ready; otherwise some missing dependencies cannot be installed. See the dependency list for more information. If you do not have root privileges, ask your administrator to install those dependencies using the fdog.setup --lib command.

The pre-calculated data set of fdog will be saved in /output/path/for/fdog/data. After the setup has run successfully, you can start using fdog.

To debug the setup, please create a log file by running it as, e.g., fdog.setup | tee log.txt on Linux/macOS or fdog.setup --conda | tee log.txt with Anaconda, and send us that log file so that we can troubleshoot the issue. Most problems can be solved by simply re-running the setup.

Usage

If everything is set up correctly, fdog will run smoothly with the provided sample input file 'infile.fa':

fdog.run --seqFile infile.fa --seqName test --refspec HUMAN@9606@3

The output files, with the prefix test, will be saved in your current working directory. You can get an overview of all available options with the command

fdog.run -h
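If you prefer to start the sample run from a Python script, the command line can be assembled as a list and passed to subprocess. The following is a minimal sketch that uses only the three options shown in the sample call; the function names build_fdog_run_cmd and run_fdog are illustrative and are not part of the fdog package.

```python
import shlex
import subprocess
from typing import List


def build_fdog_run_cmd(seq_file: str, seq_name: str, refspec: str) -> List[str]:
    """Assemble the fdog.run call from the Usage section as an argument list."""
    return [
        "fdog.run",
        "--seqFile", seq_file,
        "--seqName", seq_name,
        "--refspec", refspec,
    ]


def run_fdog(seq_file: str, seq_name: str, refspec: str) -> None:
    """Run fdog.run; raises CalledProcessError if it exits with a non-zero status."""
    cmd = build_fdog_run_cmd(seq_file, seq_name, refspec)
    # Echo the command in shell-quoted form before running it.
    print(" ".join(shlex.quote(part) for part in cmd))
    subprocess.run(cmd, check=True)
```

Calling run_fdog("infile.fa", "test", "HUMAN@9606@3") would then correspond to the sample call, provided fdog.run is on your PATH.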

Please see our wiki for more information about the input and output files of fdog.

fDOG data set

Within the data package we provide a set of 78 reference taxa. They can be automatically downloaded during the setup. This data comes "ready to use" with the fdog framework. Species data must be present in the three directories listed below:

genome_dir (Contains sub-directories for proteome fasta files for each species)
blast_dir (Contains sub-directories for BLAST databases made with makeblastdb out of your proteomes)
weight_dir (Contains feature annotation files for each proteome)

For each species/taxon there is a sub-directory named according to the naming schema [Species acronym]@[NCBI ID]@[Proteome version].
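As an illustration of the naming schema, a name such as HUMAN@9606@3 can be split into its three parts as sketched below. The names TaxonName and parse_taxon_name are hypothetical, and the sketch assumes that the NCBI ID is purely numeric while the version is kept as free text.

```python
from typing import NamedTuple


class TaxonName(NamedTuple):
    acronym: str   # species acronym, e.g. HUMAN
    ncbi_id: int   # NCBI taxonomy ID, e.g. 9606
    version: str   # proteome version, kept as text (assumption)


def parse_taxon_name(name: str) -> TaxonName:
    """Split '[Species acronym]@[NCBI ID]@[Proteome version]' into its parts."""
    parts = name.split("@")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected ACRONYM@NCBI_ID@VERSION, got {name!r}")
    acronym, ncbi_id, version = parts
    if not ncbi_id.isdigit():
        raise ValueError(f"NCBI ID must be numeric, got {ncbi_id!r}")
    return TaxonName(acronym, int(ncbi_id), version)
```

For the reference species used in the Usage section, parse_taxon_name("HUMAN@9606@3") yields the acronym HUMAN, the ID 9606 and the version "3".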

fdog is not limited to those 78 taxa. If needed, you can manually add further gene sets (in multi-FASTA format) using the provided functions.

Adding a new gene set into fDOG

For adding one gene set, please use the fdog.addTaxon function:

fdog.addTaxon -f newTaxon.fa -i tax_id [-o /output/directory] [-n abbr_tax_name] [-c] [-v protein_version] [-a]

where newTaxon.fa is the gene set to be added and tax_id is its NCBI taxonomy ID; these two arguments are required. /output/directory is where the sub-directories genome_dir, blast_dir and weight_dir can be found; if it is not given, the new taxon will be added to the directory of the pre-calculated data. The remaining arguments are optional:
- -n specifies your own taxon name (if not given, an abbreviated name will be suggested based on the NCBI taxon name of the input tax_id)
- -c calculates the BLAST DB (only needed if you want to include your new taxon in the list of taxa for compiling the core set)
- -v identifies the genome/proteome version (default is 1)
- -a turns off the annotation step (not recommended)

Adding a list of gene sets into fDOG

For adding more than one gene set, please use the fdog.addTaxa script:

fdog.addTaxa -i /path/to/newtaxa/fasta -m mapping_file [-o /output/directory] [-c]

where /path/to/newtaxa/fasta is a folder containing the FASTA files of all new taxa, and mapping_file is a tab-delimited text file in which you provide the taxonomy IDs that correspond to the FASTA files:

#filename	tax_id	abbr_tax_name	version
filename1.fa	12345678
filename2.faa	9606
filename3.fasta	4932	my_fungi
...

The header line (starting with #) is mandatory. The values of the last two columns (abbreviated taxon name and genome version) are, however, optional. If you want to specify a version for a genome, you must also define the abbreviated taxon name, so that the genome version is always in the 4th column of the mapping file.
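The rules above can be checked before calling fdog.addTaxa. The following sketch is not part of fdog; read_mapping is a hypothetical helper, and it assumes that tax_id is always a plain integer.

```python
from typing import Dict, List, Optional


def read_mapping(text: str) -> List[Dict[str, Optional[str]]]:
    """Parse the tab-delimited mapping file content and check the stated rules."""
    lines = text.splitlines()
    if not lines or not lines[0].startswith("#"):
        raise ValueError("the first line must be a header starting with '#'")
    rows = []
    for lineno, line in enumerate(lines[1:], start=2):
        if not line.strip():
            continue  # ignore blank lines
        fields = [field.strip() for field in line.split("\t")]
        if not 2 <= len(fields) <= 4:
            raise ValueError(f"line {lineno}: expected 2 to 4 columns")
        fields += [""] * (4 - len(fields))  # pad the optional columns
        filename, tax_id, abbr_name, version = fields
        if not filename or not tax_id.isdigit():
            raise ValueError(f"line {lineno}: filename and numeric tax_id required")
        if version and not abbr_name:
            # A version sits in the 4th column, so the 3rd must be filled too.
            raise ValueError(f"line {lineno}: version given without taxon name")
        rows.append({
            "filename": filename,
            "tax_id": tax_id,
            "abbr_tax_name": abbr_name or None,
            "version": version or None,
        })
    return rows
```

Reading the file with open(mapping_file).read() and passing the result to read_mapping gives one dictionary per FASTA file, with None for the optional columns that were left out.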

NOTE: After adding new taxa into fdog, you should check for the validity of the new data before running fdog.
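One simple part of such a check is to see whether the new taxon shows up in each of the three data directories. The sketch below only tests for presence: it lists entries named exactly like the taxon, or like the taxon followed by a file extension. It makes no assumption about what a valid entry must contain, and the helper find_taxon_entries is hypothetical. Note that blast_dir may legitimately be empty for a taxon that was added without -c.

```python
from pathlib import Path
from typing import Dict, List

DATA_SUBDIRS = ("genome_dir", "blast_dir", "weight_dir")


def find_taxon_entries(data_root: str, taxon: str) -> Dict[str, List[str]]:
    """List, per data directory, the entries that belong to the given taxon."""
    found: Dict[str, List[str]] = {}
    for sub in DATA_SUBDIRS:
        base = Path(data_root) / sub
        names: List[str] = []
        if base.is_dir():
            for entry in base.iterdir():
                # Match 'TAXON' itself or 'TAXON.<extension>'.
                if entry.name == taxon or entry.name.startswith(taxon + "."):
                    names.append(entry.name)
        found[sub] = sorted(names)
    return found
```

An empty list for genome_dir or weight_dir after adding a taxon is a hint that something went wrong and that the output of fdog.addTaxon should be inspected.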

Bugs

Bug reports, comments and suggestions are highly appreciated. Please open an issue on GitHub or get in touch via email.

How to cite

Ebersberger, I., Strauss, S. & von Haeseler, A. HaMStR: Profile hidden markov model based search for orthologs in ESTs. BMC Evol Biol 9, 157 (2009), doi:10.1186/1471-2148-9-157

Contributors

Contact

For further support or bug reports please contact: ebersberger@bio.uni-frankfurt.de


Release history

0.1.33

Oct 28, 2024

0.1.32

Jun 6, 2024

0.1.31

Jun 3, 2024

0.1.30

Mar 14, 2024

0.1.29

Mar 12, 2024

0.1.28

Feb 28, 2024

0.1.27

Feb 12, 2024

0.1.26

Dec 22, 2023

0.1.25

Dec 20, 2023

0.1.24

Sep 19, 2023

0.1.23

Aug 16, 2023

0.1.22

Aug 16, 2023

0.1.21

Aug 16, 2023

0.1.20

Aug 16, 2023

0.1.19

Aug 15, 2023

0.1.17

Jul 18, 2023

0.1.16

Jun 15, 2023

0.1.15

Jun 12, 2023

0.1.14

May 25, 2023

0.1.13

May 11, 2023

0.1.12

Mar 10, 2023

0.1.11

Feb 22, 2023

0.1.10

Feb 20, 2023

0.1.9

Feb 17, 2023

0.1.8

Feb 14, 2023

0.1.7

Feb 6, 2023

0.1.5

Feb 2, 2023

0.1.3

Jan 27, 2023

0.1.2

Jan 25, 2023

0.1.1

Jan 24, 2023

0.1.0

Jan 24, 2023

0.0.53

Oct 12, 2022

0.0.52

Mar 15, 2022

0.0.51

Mar 10, 2022

0.0.50

Jan 12, 2022

0.0.48

Nov 30, 2021

0.0.47

Oct 25, 2021

0.0.46

Oct 22, 2021

0.0.45

Jun 16, 2021

0.0.44

Jun 11, 2021

0.0.43

Jun 1, 2021

0.0.40

May 31, 2021

0.0.39

Apr 29, 2021

0.0.38

Apr 23, 2021

0.0.37

Apr 21, 2021

0.0.36

Apr 11, 2021

0.0.35

Apr 11, 2021

0.0.34

Apr 1, 2021

0.0.33

Mar 29, 2021

0.0.32

Mar 24, 2021

0.0.31

Mar 24, 2021

0.0.30

Mar 19, 2021

0.0.29

Mar 19, 2021

0.0.27

Mar 19, 2021

0.0.26

Mar 16, 2021

0.0.25

Feb 23, 2021

0.0.24

Feb 22, 2021

0.0.22

Feb 19, 2021

0.0.21

Feb 16, 2021

0.0.20

Feb 5, 2021

0.0.18

Jan 28, 2021

0.0.17

Jan 27, 2021

0.0.16

Jan 21, 2021

0.0.15

Jan 14, 2021

0.0.13

Jan 11, 2021

0.0.12

Dec 2, 2020

0.0.11

Dec 1, 2020

0.0.10

Nov 26, 2020

0.0.9

Nov 15, 2020

0.0.8

Oct 5, 2020

0.0.7

Sep 25, 2020

0.0.6

Sep 25, 2020

This version

0.0.5

Sep 24, 2020

0.0.4

Sep 24, 2020

0.0.3

Sep 24, 2020

0.0.2

Sep 24, 2020

0.0.1

Sep 24, 2020

Download files

Source Distribution

fdog-0.0.5.tar.gz (87.5 kB)

Uploaded Sep 24, 2020 Source

Hashes for fdog-0.0.5.tar.gz
Algorithm	Hash digest
SHA256	`db3852f528293c9113183928efcf676bf69540bc24d5b2a428ef025787c46e16`
MD5	`e1952a66dd9d8a747b1d762ae0e7d670`
BLAKE2b-256	`ccceb9e602ff1a4ab7d41f5067a4c0371929c26b5fbf4f4a73fb1599ea656540`
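
A downloaded archive can be compared against the SHA256 digest listed above using Python's standard hashlib module. This is a minimal sketch; the function names and the local file path are placeholders chosen for illustration.

```python
import hashlib

# SHA256 digest listed for fdog-0.0.5.tar.gz in the table above.
EXPECTED_SHA256 = "db3852f528293c9113183928efcf676bf69540bc24d5b2a428ef025787c46e16"


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: str, expected: str = EXPECTED_SHA256) -> bool:
    """Compare a file's digest with the expected one, ignoring letter case."""
    return sha256_of_file(path) == expected.lower()
```

For example, matches_expected("fdog-0.0.5.tar.gz") should return True for an intact download located in the current directory.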