search and summarize traits in genomes and metagenomes

Project description

traits_finder

Introduction

  • traits_finder searches and summarizes traits in genomes and metagenomes
  • input: reference database and folder of genomes/metagenomes
  • requirement: blast or hmm
  • requirement: for hmm, you need to prepare the hmm database
  • optional: diamond, bwa, hs-blastn, usearch

Install

pip install traits_finder
in preparation: anaconda download caozhichongchong/traits_finder

Availability

https://pypi.org/project/traits_finder

What you need to prepare

  1. your reference database (-db your.db), containing either protein sequences (-dbf 1) or DNA sequences (-dbf 2)
  2. a mapping file that assigns a function to each reference sequence (sequence function)
  3. all genomes/metagenomes in one folder (-i your.input.folder)
  4. the suffix or file extension of your genomes/metagenomes, such as .fasta or .fastq (-fa your.input.genome/metagenome.format)
  5. the program to run: blast for similarity search (-s 1) or hmm for domain search (-s 2)
  6. optional programs to speed up the search: diamond, hs-blastn, usearch
  7. optional program to look at sequence variants: bwa
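Before starting a long search, it can help to confirm that the input folder and the external programs are actually in place. The following is a minimal Python sketch of such a check, not part of traits_finder itself: the helper names `find_inputs` and `missing_tools` are hypothetical, and the sketch assumes the programs are installed under their usual executable names (for example `blastp`, `diamond`, `hmmscan`, `bwa`) and are reachable on PATH.

```python
import shutil
from pathlib import Path


def find_inputs(folder, suffix):
    """Return the files in `folder` whose names end with `suffix`, sorted by path."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.name.endswith(suffix)
    )


def missing_tools(tools):
    """Return the names in `tools` that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    # Placeholder folder and suffix; replace with your own values.
    folder, suffix = "your.input.folder", ".fasta"
    if Path(folder).is_dir():
        print(len(find_inputs(folder, suffix)), "input files found")
    else:
        print("input folder not found:", folder)
    # Assumed executable names; adjust to the programs you plan to use.
    print("missing tools:", missing_tools(["blastp", "diamond"]))
```

An empty list from `missing_tools` means every listed program was found; an empty list from `find_inputs` usually means the suffix passed to -fa does not match the files in the folder.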

How to use it

Search traits: boring and slow...

  1. search protein reference sequences in genomes by similarity search
    traits_finder genome -db your.db -i your.input.folder -fa your.input.genome.format --orf your.input.orf.format --r your.output.folder --r16 your.output.folder.for.16s --u diamond --bp blastp -dbf 1 -s 1

  2. search protein reference sequences in metagenomes by similarity search
    traits_finder meta -db your.db -i your.input.folder -fa your.input.metagenomes.format --r your.output.folder --r16 your.output.folder.for.16s --u diamond --bp blastp -dbf 1 -s 1

  3. search dna reference sequences in genomes by similarity search
    traits_finder genome -db your.db -i your.input.folder -fa your.input.genome.format --orf your.input.orf.format --r your.output.folder --r16 your.output.folder.for.16s --u usearch.or.hs-blastn --bp blastn -dbf 2 -s 1

  4. search dna reference sequences in metagenomes by similarity search
    traits_finder meta -db your.db -i your.input.folder -fa your.input.metagenomes.format --r your.output.folder --r16 your.output.folder.for.16s --u usearch.or.hs-blastn --bp blastn -dbf 2 -s 1

  5. search protein reference sequences in genomes by hmm
    traits_finder genome -db your.db -i your.input.folder -fa your.input.genome.format --orf your.input.orf.format --r your.output.folder --r16 your.output.folder.for.16s --hmm hmmscan -dbf 1 -s 2

  6. search dna reference sequences in genomes by alignment
    traits_finder genome -db your.db -i your.input.folder -fa your.input.genome.format --orf your.input.orf.format --r your.output.folder --r16 your.output.folder.for.16s --u usearch.or.hs-blastn --bp blastn --bwa bwa -dbf 2 -s 1

  7. search dna reference sequences in metagenomes by alignment
    traits_finder meta -db your.db -i your.input.folder -fa your.input.metagenomes.format --r your.output.folder --r16 your.output.folder.for.16s --u usearch.or.hs-blastn --bp blastn --bwa bwa -dbf 2 -s 1
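The seven search commands above differ only in a handful of options, so they can be generated from one helper when scripting many runs. The Python sketch below is an illustration under stated assumptions rather than an official interface: `build_search_command` and its parameter names are hypothetical, it uses only flags that appear in the examples above, and it assumes that `--u` and `--bwa` take the name of the corresponding executable.

```python
import shlex


def build_search_command(mode, db, input_folder, suffix, out_dir, out16s_dir,
                         db_is_protein=True, use_hmm=False,
                         orf_suffix=None, accelerator=None, bwa=None):
    """Assemble a traits_finder search command as a list of arguments.

    Only flags shown in the example commands are emitted, in the same order.
    """
    if mode not in ("genome", "meta"):
        raise ValueError("mode must be 'genome' or 'meta'")

    cmd = ["traits_finder", mode, "-db", db, "-i", input_folder, "-fa", suffix]
    # The examples pass --orf only for genomes.
    if mode == "genome" and orf_suffix:
        cmd += ["--orf", orf_suffix]
    cmd += ["--r", out_dir, "--r16", out16s_dir]

    if use_hmm:
        # Domain search (-s 2), as in example 5.
        cmd += ["--hmm", "hmmscan"]
    else:
        # Similarity search (-s 1), optionally with a faster search program.
        if accelerator:
            cmd += ["--u", accelerator]
        cmd += ["--bp", "blastp" if db_is_protein else "blastn"]
        if bwa:
            cmd += ["--bwa", bwa]

    cmd += ["-dbf", "1" if db_is_protein else "2"]
    cmd += ["-s", "2" if use_hmm else "1"]
    return cmd


if __name__ == "__main__":
    # Rebuilds example 1 (protein references, genomes, diamond).
    cmd = build_search_command(
        "genome", "your.db", "your.input.folder", "your.input.genome.format",
        "your.output.folder", "your.output.folder.for.16s",
        orf_suffix="your.input.orf.format", accelerator="diamond",
    )
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

Returning a list instead of a single string keeps paths with spaces intact if the command is later passed to `subprocess.run`.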

Summarize results: cool and fast!

  1. summarize traits in genome
    traits_finder sum_genome -db your.db -m function.mapping.your.db -i your.input.folder -fa your.input.genome.format --orf your.input.orf.format --r your.output.folder --r16 your.output.folder.for.16s

  2. summarize traits in metagenomes
    traits_finder sum_meta -db your.db -m function.mapping.your.db -i your.input.folder -fa your.input.metagenomes.format --r your.output.folder --r16 your.output.folder.for.16s
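When the search and the summary are run from a script, the summary step takes the same database, input folder and output folders as the search, plus the mapping file (-m). The Python sketch below shows one way to assemble the summary command and run a list of steps in order. The names `build_summary_command` and `run_steps` are hypothetical, only flags from the two examples above are used, and `dry_run=True` prints the commands without executing them.

```python
import subprocess


def build_summary_command(mode, db, mapping, input_folder, suffix,
                          out_dir, out16s_dir, orf_suffix=None):
    """Assemble a traits_finder summary command as a list of arguments."""
    if mode not in ("sum_genome", "sum_meta"):
        raise ValueError("mode must be 'sum_genome' or 'sum_meta'")
    cmd = ["traits_finder", mode, "-db", db, "-m", mapping,
           "-i", input_folder, "-fa", suffix]
    # The examples pass --orf only for genomes.
    if mode == "sum_genome" and orf_suffix:
        cmd += ["--orf", orf_suffix]
    cmd += ["--r", out_dir, "--r16", out16s_dir]
    return cmd


def run_steps(steps, dry_run=True):
    """Print each command and, unless dry_run is set, execute it.

    Execution stops at the first failing command because check=True makes
    subprocess.run raise CalledProcessError. Returns the number of steps handled.
    """
    handled = 0
    for cmd in steps:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)
        handled += 1
    return handled


if __name__ == "__main__":
    summary = build_summary_command(
        "sum_meta", "your.db", "function.mapping.your.db", "your.input.folder",
        "your.input.metagenomes.format", "your.output.folder",
        "your.output.folder.for.16s",
    )
    # A search command would normally come first in this list.
    run_steps([summary], dry_run=True)
```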

Results

Copyright

Copyright: An Ni Zhang, Prof. Eric Alm, Alm Lab at MIT
Citation: Not yet, coming soon!
Contact: anniz44@mit.edu

Download files

Files for traits-finder, version 1.4
Filename, size                          File type   Python version
traits_finder-1.4-py3.6.egg (2.2 MB)    Egg         3.6
traits_finder-1.4.tar.gz (1.9 MB)       Source      None