search and summarize traits in genomes and metagenomes
traits_finder
Introduction
- traits_finder searches and summarizes traits in genomes and metagenomes
- input: reference database and folder of genomes/metagenomes
- requirement: blast or hmm
- requirement: for hmm, you need to prepare the hmm database
- optional: diamond, bwa, hs-blastn, usearch
Install
pip install traits_finder
in preparation: anaconda download caozhichongchong/traits_finder
Availability
https://pypi.org/project/traits_finder
What do you need to prepare
- your reference database (-db your.db), containing either protein sequences (-dbf 1) or dna sequences (-dbf 2)
- a mapping file that assigns a function to each reference sequence (columns: sequence, function); it is passed with -m when summarizing
- all genomes/metagenomes in a folder (-i your.input.folder)
- suffix or file extension of your genomes/metagenomes, such as .fasta or .fastq (-fa your.input.genome/metagenome.format)
- programs to run: blast for similarity search (-s 1) or hmm for domain search (-s 2)
- optional programs to speed up the search: diamond, hs-blastn, usearch
- optional program to look at sequence variants: bwa
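Before starting a search, it can help to confirm which of these programs are actually on your PATH. The following is a minimal Python sketch and not part of traits_finder: the helper name find_programs is hypothetical, and the executable names are assumed to match the ones used in the commands in this README (your installation may name them differently).

```python
import shutil

# Executable names as they appear in this README (assumption: adjust them
# if your installation uses different names).
SEARCH_PROGRAMS = ["blastp", "blastn", "hmmscan"]
OPTIONAL_PROGRAMS = ["diamond", "hs-blastn", "usearch", "bwa"]


def find_programs(names):
    """Map each program name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    found = find_programs(SEARCH_PROGRAMS + OPTIONAL_PROGRAMS)
    for name, path in found.items():
        print(f"{name:10s} {path or 'not found'}")
```

A program reported as "not found" only matters if the search you plan to run uses it; for example, bwa is needed only when looking at sequence variants.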
How to use it
Search traits: boring and slow...
- search protein reference sequences in genomes by similarity search
traits_finder genome -db your.db -i your.input.folder -fa your.input.genome.format --orf your.input.orf.format --r your.output.folder --r16 your.output.folder.for.16s --u diamond --bp blastp -dbf 1 -s 1
- search protein reference sequences in metagenomes by similarity search
traits_finder meta -db your.db -i your.input.folder -fa your.input.metagenomes.format --r your.output.folder --r16 your.output.folder.for.16s --u diamond --bp blastp -dbf 1 -s 1
- search dna reference sequences in genomes by similarity search
traits_finder genome -db your.db -i your.input.folder -fa your.input.genome.format --orf your.input.orf.format --r your.output.folder --r16 your.output.folder.for.16s --u usearch.or.hs-blastn --bp blastn -dbf 2 -s 1
- search dna reference sequences in metagenomes by similarity search
traits_finder meta -db your.db -i your.input.folder -fa your.input.metagenomes.format --r your.output.folder --r16 your.output.folder.for.16s --u usearch.or.hs-blastn --bp blastn -dbf 2 -s 1
- search protein reference sequences in genomes by hmm
traits_finder genome -db your.db -i your.input.folder -fa your.input.genome.format --orf your.input.orf.format --r your.output.folder --r16 your.output.folder.for.16s --hmm hmmscan -dbf 1 -s 2
- search dna reference sequences in genomes by alignment
traits_finder genome -db your.db -i your.input.folder -fa your.input.genome.format --orf your.input.orf.format --r your.output.folder --r16 your.output.folder.for.16s --u usearch.or.hs-blastn --bp blastn --bwa bwa -dbf 2 -s 1
- search dna reference sequences in metagenomes by alignment
traits_finder meta -db your.db -i your.input.folder -fa your.input.metagenomes.format --r your.output.folder --r16 your.output.folder.for.16s --u usearch.or.hs-blastn --bp blastn --bwa bwa -dbf 2 -s 1
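The search commands above differ only in a handful of flags, so they can be assembled programmatically. This is a minimal sketch under stated assumptions: build_search_command is a hypothetical helper (traits_finder does not provide it), it only arranges the flags shown in this README in the same order, and the file suffixes in the usage part are placeholders.

```python
import shlex


def build_search_command(mode, db, input_dir, suffix, out_dir, out_16s,
                         dbf, search, orf_suffix=None, extra=()):
    """Assemble a traits_finder search command as an argument list.

    mode   : "genome" or "meta"
    dbf    : 1 for protein references, 2 for dna references
    search : 1 for similarity search (blast), 2 for domain search (hmm)
    extra  : further flags copied from the README examples,
             e.g. ["--u", "diamond", "--bp", "blastp"]
    """
    if mode not in ("genome", "meta"):
        raise ValueError("mode must be 'genome' or 'meta'")
    cmd = ["traits_finder", mode, "-db", db, "-i", input_dir, "-fa", suffix]
    if orf_suffix is not None:
        # The genome examples pass --orf; the metagenome examples do not.
        cmd += ["--orf", orf_suffix]
    cmd += ["--r", out_dir, "--r16", out_16s]
    cmd += list(extra)
    cmd += ["-dbf", str(dbf), "-s", str(search)]
    return cmd


if __name__ == "__main__":
    # Placeholder values; ".fasta" and ".faa" are assumed suffixes.
    cmd = build_search_command(
        "genome", "your.db", "your.input.folder", ".fasta",
        "your.output.folder", "your.output.folder.for.16s",
        dbf=1, search=1, orf_suffix=".faa",
        extra=["--u", "diamond", "--bp", "blastp"],
    )
    print(shlex.join(cmd))  # inspect the command before running it
```

Once the printed command looks right, the same list can be handed to subprocess.run(cmd, check=True) to execute it.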
Summarize results: cool and fast!
- summarize traits in genomes
traits_finder sum_genome -db your.db -m function.mapping.your.db -i your.input.folder -fa your.input.genome.format --orf your.input.orf.format --r your.output.folder --r16 your.output.folder.for.16s
- summarize traits in metagenomes
traits_finder sum_meta -db your.db -m function.mapping.your.db -i your.input.folder -fa your.input.metagenomes.format --r your.output.folder --r16 your.output.folder.for.16s
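The summarize commands take the same database, input, and output arguments as the search step, plus the function mapping file (-m). The sketch below assembles them in Python; build_sum_command is a hypothetical helper, not something traits_finder ships. It assumes the summary should point at the same --r and --r16 folders that the search wrote to, since the examples above reuse the same placeholders for both steps.

```python
import shlex


def build_sum_command(mode, db, mapping, input_dir, suffix,
                      out_dir, out_16s, orf_suffix=None):
    """Assemble a traits_finder summarize command as an argument list.

    mode    : "sum_genome" or "sum_meta"
    mapping : the function mapping file passed with -m
    """
    if mode not in ("sum_genome", "sum_meta"):
        raise ValueError("mode must be 'sum_genome' or 'sum_meta'")
    cmd = ["traits_finder", mode, "-db", db, "-m", mapping,
           "-i", input_dir, "-fa", suffix]
    if orf_suffix is not None:
        # Only the sum_genome example passes --orf.
        cmd += ["--orf", orf_suffix]
    cmd += ["--r", out_dir, "--r16", out_16s]
    return cmd


if __name__ == "__main__":
    # Placeholder values; ".fastq" is an assumed suffix.
    cmd = build_sum_command(
        "sum_meta", "your.db", "function.mapping.your.db",
        "your.input.folder", ".fastq",
        "your.output.folder", "your.output.folder.for.16s",
    )
    print(shlex.join(cmd))
```

Printing the command first makes it easy to check that the folders match the ones used in the search step before anything is run.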
Results
Copyright
Copyright: An Ni Zhang, Prof. Eric Alm, Alm Lab at MIT
Citation: Not yet, coming soon!
Contact: anniz44@mit.edu