search and summarize traits in genomes and metagenomes

traits_finder

Introduction

  • traits_finder searches and summarizes traits in genomes and metagenomes
  • input: reference database and folder of genomes/metagenomes
  • requirement: python >= 3.0, plus blast (similarity search) or hmm (domain search)
  • requirement: for hmm, you need to prepare the hmm database yourself
  • optional: diamond, bwa, hs-blastn, usearch, mafft, fasttree
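
Before a long run it can help to confirm which of these programs are actually on your PATH. The following is a minimal sketch, not part of traits_finder: the executable names (blastp, blastn, hmmscan, fasttree and so on) are assumptions about how the tools are installed on your system, and the helper names are hypothetical.

```python
import shutil

# Executable names below are assumptions; adjust them to your installation
# (FastTree, for instance, is often installed as "FastTree").
SEARCH_BACKENDS = {"blast": ["blastp", "blastn"], "hmm": ["hmmscan"]}
OPTIONAL_TOOLS = ["diamond", "bwa", "hs-blastn", "usearch", "mafft", "fasttree"]


def find_tools(names, which=shutil.which):
    """Map each executable name to its full path, or None if it is not on PATH."""
    return {name: which(name) for name in names}


def check_environment(which=shutil.which):
    """Return (has_backend, backends, optional) describing what was found."""
    backends = {group: find_tools(names, which) for group, names in SEARCH_BACKENDS.items()}
    # The README asks for blast or hmm, so one located executable is enough here.
    has_backend = any(path for tools in backends.values() for path in tools.values())
    return has_backend, backends, find_tools(OPTIONAL_TOOLS, which)


if __name__ == "__main__":
    ok, backends, optional = check_environment()
    print("search backend available:", ok)
    for name, path in optional.items():
        print(f"{name}: {path or 'not found'}")
```

The `which` parameter exists only so the lookup can be replaced in tests; in normal use the default `shutil.which` is what you want.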

Install

pip install traits_finder
in preparation: anaconda download caozhichongchong/traits_finder

latest version (may be unstable)

git clone https://github.com/caozhichongchong/traits_finder.git
cd traits_finder
python setup.py build
python setup.py install
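
To confirm which version ended up installed, something like the sketch below may be useful. It relies on `importlib.metadata`, which needs Python 3.8 or newer, and it assumes the distribution is registered under the name `traits_finder` as on PyPI; the function name is hypothetical.

```python
from importlib import metadata


def installed_version(package="traits_finder"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    print(version if version else "traits_finder is not installed")
```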

Availability

https://pypi.org/project/traits_finder

What you need to prepare

  1. your reference database (-db your.db), protein sequences (-dbf 1) or dna sequences (-dbf 2)
  2. a mapping file of functions to each reference sequence (sequence function)
  3. all genomes/metagenomes in a folder (-i your.input.folder)
  4. suffix or file extension of your genomes/metagenomes, such as .fasta or .fastq (-fa your.input.genome/metagenome.format)
  5. programs to run: blast for similarity search (-s 1) or hmm for domain search (-s 2)
  6. optional programs to speed things up: diamond, hs-blastn (or usearch for 16S extraction), usearch (required for HGT and HGT_sum)
  7. optional programs to look at sequence variants: bwa, mafft, fasttree
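
Items 3 and 4 above amount to "a folder plus a file suffix". A quick way to preview which files that combination selects is sketched below. This is an illustration only: it assumes a non-recursive match on the end of the file name, which may differ from how traits_finder itself scans the folder, and `list_inputs` is a hypothetical helper.

```python
from pathlib import Path


def list_inputs(folder, suffix):
    """Return the files directly inside `folder` whose names end with `suffix`, sorted."""
    return sorted(
        path
        for path in Path(folder).iterdir()
        if path.is_file() and path.name.endswith(suffix)
    )


if __name__ == "__main__":
    # Placeholder names taken from the list above.
    folder, suffix = "your.input.folder", ".fasta"
    if Path(folder).is_dir():
        for path in list_inputs(folder, suffix):
            print(path)
    else:
        print(f"{folder} does not exist yet")
```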

How to use it

Search traits: boring and slow...

Databases bundled with traits_finder: antibiotic resistance genes (-db ARG), butyrate-producing genes (-db but)

  1. search protein reference sequences in genomes (traits_finder genome) or mobile genetic elements (traits_finder mge) by similarity search
    traits_finder genome -db your.db -i your.input.folder -fa your.input.genome.format --orf your.input.orf.format --r your.output.folder --r16 your.output.folder.for.16s --u diamond --bp blastp -dbf 1 -s 1

  2. search protein reference sequences in metagenomes by similarity search
    traits_finder meta -db your.db -i your.input.folder -fa your.input.metagenomes.format --r your.output.folder --r16 your.output.folder.for.16s --u diamond --bp blastp -dbf 1 -s 1

  3. search dna reference sequences in genomes by similarity search
    traits_finder genome -db your.db -i your.input.folder -fa your.input.genome.format --orf your.input.orf.format --r your.output.folder --r16 your.output.folder.for.16s --u usearch.or.hs-blastn --bp blastn -dbf 2 -s 1

  4. search dna reference sequences in metagenomes by similarity search
    traits_finder meta -db your.db -i your.input.folder -fa your.input.metagenomes.format --r your.output.folder --r16 your.output.folder.for.16s --u usearch.or.hs-blastn --bp blastn -dbf 2 -s 1

  5. search protein reference sequences in genomes by hmm
    traits_finder genome -db your.db -i your.input.folder -fa your.input.genome.format --orf your.input.orf.format --r your.output.folder --r16 your.output.folder.for.16s --hmm hmmscan -dbf 1 -s 2

  6. search dna reference sequences in genomes by alignment
    traits_finder genome -db your.db -i your.input.folder -fa your.input.genome.format --orf your.input.orf.format --r your.output.folder --r16 your.output.folder.for.16s --u usearch.or.hs-blastn --bp blastn --bwa bwa -dbf 2 -s 1

  7. search dna reference sequences in metagenomes by alignment
    traits_finder meta -db your.db -i your.input.folder -fa your.input.metagenomes.format --r your.output.folder --r16 your.output.folder.for.16s --u usearch.or.hs-blastn --bp blastn --bwa bwa -dbf 2 -s 1
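
The seven search commands above differ only in a handful of options, so they can be assembled from parameters instead of being edited by hand. The sketch below is one possible wrapper, not something shipped with traits_finder: `build_search_command` and its parameter names are hypothetical, it emits only the flags that appear in the examples above, and it needs Python 3.8+ for `shlex.join`.

```python
import shlex


def build_search_command(mode, db, input_dir, suffix, out_dir, out_16s,
                         db_format=1, search=1, orf_suffix=None,
                         accelerator=None, blast_program=None,
                         hmm_program=None, bwa=None):
    """Assemble a traits_finder search command as a list of arguments."""
    if mode not in ("genome", "mge", "meta"):
        raise ValueError("mode must be 'genome', 'mge' or 'meta'")
    if db_format not in (1, 2):
        raise ValueError("db_format: 1 = protein sequences, 2 = dna sequences")
    if search not in (1, 2):
        raise ValueError("search: 1 = similarity search (blast), 2 = domain search (hmm)")

    cmd = ["traits_finder", mode, "-db", db, "-i", input_dir, "-fa", suffix]
    if orf_suffix:
        cmd += ["--orf", orf_suffix]
    cmd += ["--r", out_dir, "--r16", out_16s]
    # Optional program flags, in the same order the examples use them.
    for flag, value in (("--u", accelerator), ("--bp", blast_program),
                        ("--hmm", hmm_program), ("--bwa", bwa)):
        if value:
            cmd += [flag, value]
    cmd += ["-dbf", str(db_format), "-s", str(search)]
    return cmd


if __name__ == "__main__":
    # Rebuilds example 2: protein references in metagenomes by similarity search.
    cmd = build_search_command(
        "meta", "your.db", "your.input.folder", "your.input.metagenomes.format",
        "your.output.folder", "your.output.folder.for.16s",
        accelerator="diamond", blast_program="blastp",
    )
    print(shlex.join(cmd))
```

Returning a list instead of a string means the result can be passed directly to `subprocess.run` without going through a shell.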

Summarize results: cool and fast!

  1. summarize traits in genome
    traits_finder sum_genome -db your.db -m function.mapping.your.db -i your.input.folder -fa your.input.genome.format --orf your.input.orf.format --r your.output.folder --r16 your.output.folder.for.16s

  2. summarize traits in metagenomes
    traits_finder sum_meta -db your.db -m function.mapping.your.db -i your.input.folder -fa your.input.metagenomes.format --r your.output.folder --r16 your.output.folder.for.16s --meta metadata.txt
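
Since the summary commands take the same input and output options as the search commands, a search and its summary can be chained in one script. The sketch below is only an illustration: `run_steps` is a hypothetical helper, all values are the placeholders used in this README, and it assumes the summary step reads what the search step wrote to the same --r folder. It defaults to a dry run that prints the commands without executing them.

```python
import shlex
import subprocess


def run_steps(steps, dry_run=True):
    """Run each command in order, stopping at the first failure.

    Returns the commands as shell-quoted strings. With dry_run=True
    nothing is executed.
    """
    issued = []
    for cmd in steps:
        issued.append(shlex.join(cmd))
        if not dry_run:
            # check=True raises CalledProcessError if the command fails.
            subprocess.run(cmd, check=True)
    return issued


# Options shared by the search and the summary step (README placeholders).
COMMON = ["-i", "your.input.folder", "-fa", "your.input.genome.format",
          "--orf", "your.input.orf.format",
          "--r", "your.output.folder", "--r16", "your.output.folder.for.16s"]

SEARCH = ["traits_finder", "genome", "-db", "your.db", *COMMON,
          "--u", "diamond", "--bp", "blastp", "-dbf", "1", "-s", "1"]
SUMMARY = ["traits_finder", "sum_genome", "-db", "your.db",
           "-m", "function.mapping.your.db", *COMMON]

if __name__ == "__main__":
    for line in run_steps([SEARCH, SUMMARY], dry_run=True):
        print(line)
```

Switch to `dry_run=False` only after replacing the placeholders with real paths.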

HGT finder and summarizing: cool and fast! (still-testing)

Results

Copyright

Copyright: An Ni Zhang, Prof. Eric Alm, Alm Lab in MIT
Citation: Not yet, coming soon!
Contact: anniz44@mit.edu
