search and summarize traits in genomes and metagenomes
traits_finder
Introduction
- traits_finder searches and summarizes traits in genomes and metagenomes
- input: reference database and folder of genomes/metagenomes
- requirement: python >= 3.0, blast or hmm
- requirement: for hmm, you need to prepare the hmm database
- optional: diamond, bwa, hs-blastn, usearch, mafft, fasttree
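Before a long run it can help to confirm which of these external programs are actually on your PATH. The snippet below is a minimal sketch, not part of traits_finder; the executable names are assumptions (for example, your FastTree binary may be called `FastTree`), so adjust them to match your installation.

```python
import shutil

# Executable names are assumptions; edit them to match your installation.
SEARCH_TOOLS = ["blastp", "blastn", "hmmscan"]
OPTIONAL_TOOLS = ["diamond", "bwa", "hs-blastn", "usearch", "mafft", "fasttree"]


def find_tools(names):
    """Map each executable name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


for group, names in (("search", SEARCH_TOOLS), ("optional", OPTIONAL_TOOLS)):
    for name, path in find_tools(names).items():
        print(f"{group:8s} {name:10s} {path or 'not found'}")
```

A tool reported as `not found` can still be used if you pass its full path to the matching traits_finder option.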
Install
pip install traits_finder
in preparation: anaconda download caozhichongchong/traits_finder
latest version (unstable though)
git clone https://github.com/caozhichongchong/traits_finder.git
cd traits_finder
python setup.py build
python setup.py install
Availability
https://pypi.org/project/traits_finder
What do you need to prepare
- your reference database (-db your.db), protein sequences (-dbf 1) or dna sequences (-dbf 2)
- a mapping file of functions to each reference sequence (sequence function)
- all genomes/metagenomes in a folder (-i your.input.folder)
- suffix or file extension of your genomes/metagenomes, such as .fasta or .fastq (-fa your.input.genome/metagenome.format)
- programs to run: blast for similarity search (-s 1) or hmm for domain search (-s 2)
- optional programs to speed things up! diamond, hs-blastn (or usearch for 16S extraction), usearch (required for HGT and HGT_sum)
- optional programs to look at sequence variants! bwa, mafft, fasttree
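The checklist above can be verified before starting a search. This is a rough sketch under stated assumptions: the helper names are hypothetical, the input folder is scanned non-recursively, and the mapping file is assumed to be tab-separated with the sequence ID in the first column and the function in the second (the README only says "sequence function", so check your own file's separator).

```python
from pathlib import Path


def list_inputs(folder, suffix):
    """Return the files directly inside `folder` whose names end with `suffix`."""
    return sorted(
        p for p in Path(folder).iterdir() if p.is_file() and p.name.endswith(suffix)
    )


def fasta_ids(path):
    """Collect sequence IDs (first word after '>') from a FASTA file."""
    ids = set()
    with open(path) as handle:
        for line in handle:
            if line.startswith(">"):
                fields = line[1:].split()
                if fields:
                    ids.add(fields[0])
    return ids


def mapping_ids(path, sep="\t"):
    """Collect IDs from the first column of the mapping file (separator is an assumption)."""
    ids = set()
    with open(path) as handle:
        for line in handle:
            line = line.rstrip("\n")
            if line:
                ids.add(line.split(sep)[0])
    return ids


def unmapped(db_fasta, mapping_file, sep="\t"):
    """Return reference sequence IDs that have no row in the mapping file."""
    return sorted(fasta_ids(db_fasta) - mapping_ids(mapping_file, sep))


# Example (paths are placeholders):
# print(len(list_inputs("your.input.folder", ".fasta")), "input files")
# print(unmapped("your.db", "function.mapping.your.db"))
```

An empty list from `unmapped` means every reference sequence has a function assigned under the assumed format.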
How to use it
Search traits: boring and slow...
Databases installed with traits_finder: antibiotic resistance genes (-db ARG), butyrate-producing genes (-db but)
- search protein reference sequences in genomes (traits_finder genome) or mobile genetic elements (traits_finder mge) by similarity search
traits_finder genome -db your.db -i your.input.folder -fa your.input.genome.format --orf your.input.orf.format --r your.output.folder --r16 your.output.folder.for.16s --u diamond --bp blastp -dbf 1 -s 1
- search protein reference sequences in metagenomes by similarity search
traits_finder meta -db your.db -i your.input.folder -fa your.input.metagenomes.format --r your.output.folder --r16 your.output.folder.for.16s --u diamond --bp blastp -dbf 1 -s 1
- search dna reference sequences in genomes by similarity search
traits_finder genome -db your.db -i your.input.folder -fa your.input.genome.format --orf your.input.orf.format --r your.output.folder --r16 your.output.folder.for.16s --u usearch.or.hs-blastn --bp blastn -dbf 2 -s 1
- search dna reference sequences in metagenomes by similarity search
traits_finder meta -db your.db -i your.input.folder -fa your.input.metagenomes.format --r your.output.folder --r16 your.output.folder.for.16s --u usearch.or.hs-blastn --bp blastn -dbf 2 -s 1
- search protein reference sequences in genomes by hmm
traits_finder genome -db your.db -i your.input.folder -fa your.input.genome.format --orf your.input.orf.format --r your.output.folder --r16 your.output.folder.for.16s --hmm hmmscan -dbf 1 -s 2
- search dna reference sequences in genomes by alignment
traits_finder genome -db your.db -i your.input.folder -fa your.input.genome.format --orf your.input.orf.format --r your.output.folder --r16 your.output.folder.for.16s --u usearch.or.hs-blastn --bp blastn --bwa bwa -dbf 2 -s 1
- search dna reference sequences in metagenomes by alignment
traits_finder meta -db your.db -i your.input.folder -fa your.input.metagenomes.format --r your.output.folder --r16 your.output.folder.for.16s --u usearch.or.hs-blastn --bp blastn --bwa bwa -dbf 2 -s 1
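The search commands above all share one shape: a mode (genome, mge or meta), the database and input options, the output folders, the helper programs, and finally -dbf and -s. If you drive traits_finder from a script, that shape can be captured in a small builder. This is only a sketch: `build_search_command` is a hypothetical helper, not part of traits_finder, and it uses only the options shown in the examples above.

```python
import shlex


def build_search_command(mode, db, input_dir, suffix, out_dir, out_16s,
                         dbf, search, orf_suffix=None, tools=()):
    """Assemble a traits_finder search command as an argument list.

    `tools` is a sequence of (option, program) pairs, e.g. [("--u", "diamond")].
    """
    if mode not in ("genome", "mge", "meta"):
        raise ValueError(f"unknown search mode: {mode}")
    cmd = ["traits_finder", mode, "-db", db, "-i", input_dir, "-fa", suffix]
    if orf_suffix is not None:  # --orf appears only in the genome examples
        cmd += ["--orf", orf_suffix]
    cmd += ["--r", out_dir, "--r16", out_16s]
    for option, program in tools:
        cmd += [option, program]
    cmd += ["-dbf", str(dbf), "-s", str(search)]
    return cmd


# Protein references in genomes, similarity search with diamond and blastp.
cmd = build_search_command(
    "genome", "your.db", "your.input.folder", "your.input.genome.format",
    "your.output.folder", "your.output.folder.for.16s",
    dbf=1, search=1, orf_suffix="your.input.orf.format",
    tools=[("--u", "diamond"), ("--bp", "blastp")],
)
print(" ".join(shlex.quote(arg) for arg in cmd))
```

Building the command as a list rather than a single string keeps paths with spaces intact when the list is later passed to `subprocess.run`.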
Summarize results: cool and fast!
- summarize traits in genomes
traits_finder sum_genome -db your.db -m function.mapping.your.db -i your.input.folder -fa your.input.genome.format --orf your.input.orf.format --r your.output.folder --r16 your.output.folder.for.16s
- summarize traits in metagenomes
traits_finder sum_meta -db your.db -m function.mapping.your.db -i your.input.folder -fa your.input.metagenomes.format --r your.output.folder --r16 your.output.folder.for.16s --meta metadata.txt
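Because summarizing reads the output of a previous search, the two steps are naturally run back to back with the same -db, -i, -fa, --r and --r16 values. The sketch below chains them from Python; `run_pipeline` is a hypothetical helper, and it assumes traits_finder exits with a non-zero status when a step fails, which the README does not confirm.

```python
import subprocess

# Options shared by the search and summary steps (placeholders from the examples above).
SHARED = ["-i", "your.input.folder", "-fa", "your.input.genome.format",
          "--orf", "your.input.orf.format",
          "--r", "your.output.folder", "--r16", "your.output.folder.for.16s"]

search_cmd = ["traits_finder", "genome", "-db", "your.db", *SHARED,
              "--u", "diamond", "--bp", "blastp", "-dbf", "1", "-s", "1"]
summary_cmd = ["traits_finder", "sum_genome", "-db", "your.db",
               "-m", "function.mapping.your.db", *SHARED]


def run_pipeline(search, summary, runner=subprocess.run):
    """Run the search, then the summary; stop at the first step that fails."""
    for cmd in (search, summary):
        # check=True raises CalledProcessError on a non-zero exit status.
        runner(cmd, check=True)


# run_pipeline(search_cmd, summary_cmd)  # uncomment to execute both steps
```

The `runner` parameter exists only so the sequence can be dry-run or tested without launching traits_finder.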
HGT finder and summarizing: cool and fast! (still-testing)
Results
Copyright
Copyright: An Ni Zhang, Prof. Eric Alm, Alm Lab in MIT
Citation: Not yet, coming soon!
Contact: anniz44@mit.edu