
bold_retriever queries the BOLD database to identify taxa from COI sequences.

Project Description

This script accepts FASTA files containing COI sequences and queries the BOLD database (http://boldsystems.org/) to identify the taxa the sequences belong to.
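
As a rough illustration of the input format, here is a minimal sketch of reading FASTA records using only the standard library. The function name `read_fasta` and the sample record are made up for this example; this is not how bold_retriever itself reads files (its install steps load a Biopython environment, which suggests it uses Biopython).

```python
from typing import Dict, Iterable


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Return a mapping of sequence id -> sequence from FASTA-formatted lines."""
    records: Dict[str, str] = {}
    current_id = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # Header line: take the first whitespace-delimited token as the id.
            tokens = line[1:].split()
            current_id = tokens[0] if tokens else ""
            records[current_id] = ""
        elif current_id is not None:
            # Sequence lines may be wrapped over several lines; join them.
            records[current_id] += line.upper()
    return records


# Hypothetical record; the sequence is a placeholder, not real COI data.
example = """>OTU_99 placeholder record
acgtacgt
TTGA
""".splitlines()

sequences = read_fasta(example)
print(sequences)
```

Each sequence id found this way corresponds to the `seq_id` column of the results.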

Run this way

  • clone repository:

    cd $USERAPPL
    git clone https://github.com/carlosp420/bold_retriever.git
    
  • install dependencies (Python 2.7):

    cd bold_retriever
    module load biopython-env
    pip install -r requirements.txt
    
  • run software

Choose one of the databases available from BOLD (http://www.boldsystems.org/index.php/resources/api?type=idengine) and pass it with the -db argument:

  • COX1_SPECIES
  • COX1
  • COX1_SPECIES_PUBLIC
  • COX1_L640bp
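
A wrapper script could reject a mistyped database name before any query is sent. The sketch below is a hypothetical front end built with `argparse`; it reuses the `-f` and `-db` flags from the command line in this document, but it is not bold_retriever's own argument parser, and the names `BOLD_DATABASES` and `build_parser` are made up for this example.

```python
import argparse

# Database names copied from the list above.
BOLD_DATABASES = ["COX1_SPECIES", "COX1", "COX1_SPECIES_PUBLIC", "COX1_L640bp"]


def build_parser() -> argparse.ArgumentParser:
    """Build a hypothetical parser that only accepts the listed databases."""
    parser = argparse.ArgumentParser(
        description="Illustrative front end, not bold_retriever's own parser."
    )
    parser.add_argument("-f", dest="fasta_file", required=True,
                        help="FASTA file with COI sequences")
    parser.add_argument("-db", dest="database", required=True,
                        choices=BOLD_DATABASES, help="BOLD database to query")
    return parser


args = build_parser().parse_args(["-f", "ZA2013-0565.fasta", "-db", "COX1_SPECIES"])
print(args.fasta_file, args.database)
```

With `choices` set, an unknown database name makes `argparse` print an error and exit instead of continuing.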

For example:

    python bold_retriever.py -f ZA2013-0565.fasta -db COX1_SPECIES

  • output:

    seq_id  bold_id       similarity  division  class       order       family        species                collection_country
    OTU_99  FBNE064-11    1           animal    Insecta     Neuroptera  Hemerobiidae  Hemerobius pini        Germany
    OTU_99  NEUFI079-11   1           animal    Insecta     Neuroptera  Hemerobiidae  Hemerobius pini        Finland
    OTU_99  FBNE172-13    0.9937      animal    Insecta     Neuroptera  Hemerobiidae  Hemerobius atrifrons   Germany
    OTU_99  FBNE162-13    0.9936      animal    Insecta     Neuroptera  Hemerobiidae  Hemerobius contumax    Austria
    OTU_99  TTSOW138-09   0.9811      animal    Insecta     Neuroptera  Hemerobiidae  Hemerobius ovalis      Canada
    OTU_99  CNPAH380-13   0.9811      animal    Insecta     Neuroptera  Hemerobiidae  Hemerobius             Canada
    OTU_99  CNKOF1602-14  0.9811      animal    Insecta     Neuroptera  Hemerobiidae  Hemerobius pinidumus   Canada
    OTU_99  NRAS173-11    0.9748      animal    Insecta     Neuroptera  Hemerobiidae  Hemerobius conjunctus  Canada
    OTU_99  SSBAE2911-13  0.9748      animal    Collembola  None        None          Collembola             Canada
    OTU_99  CNPAQ117-13   0.9686      animal    Insecta     Neuroptera  Hemerobiidae  Hemerobius humulinus   Canada
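
Results like these can be filtered by similarity in a few lines of Python. This is a minimal sketch that assumes the columns are separated by runs of two or more spaces, as displayed above; the real output file may use a different delimiter. The helper names `parse_table` and `hits_above` are hypothetical.

```python
import re

# Three rows copied from the example output above.
SAMPLE = """\
seq_id  bold_id       similarity  division  class       order       family        species                collection_country
OTU_99  FBNE064-11    1           animal    Insecta     Neuroptera  Hemerobiidae  Hemerobius pini        Germany
OTU_99  FBNE172-13    0.9937      animal    Insecta     Neuroptera  Hemerobiidae  Hemerobius atrifrons   Germany
OTU_99  SSBAE2911-13  0.9748      animal    Collembola  None        None          Collembola             Canada
"""


def parse_table(text):
    """Split each line on runs of 2+ spaces and return a list of row dicts."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    header = re.split(r"\s{2,}", lines[0])
    rows = []
    for line in lines[1:]:
        row = dict(zip(header, re.split(r"\s{2,}", line)))
        row["similarity"] = float(row["similarity"])
        rows.append(row)
    return rows


def hits_above(rows, threshold):
    """Keep only the hits whose similarity is at least `threshold`."""
    return [r for r in rows if r["similarity"] >= threshold]


rows = parse_table(SAMPLE)
for row in hits_above(rows, 0.99):
    print(row["seq_id"], row["bold_id"], row["species"], row["similarity"])
```

Splitting on two or more spaces keeps two-word species names such as "Hemerobius pini" in a single field.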
    

Speed

bold_retriever uses the Twisted library to make asynchronous calls, which reduces the total processing time.
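
To show why asynchronous calls help, here is a minimal sketch of the general idea. It does not use Twisted and is not bold_retriever's code: it uses `asyncio` from the standard library instead, and `fake_query` is a hypothetical stand-in that only sleeps rather than contacting BOLD.

```python
import asyncio
import time


async def fake_query(seq_id: str, delay: float = 0.1) -> str:
    """Stand-in for one network request; it only sleeps, no BOLD call is made."""
    await asyncio.sleep(delay)
    return f"{seq_id}: done"


async def query_all(seq_ids):
    # Start every request at once and wait for all of them together.
    return await asyncio.gather(*(fake_query(s) for s in seq_ids))


seq_ids = [f"OTU_{i}" for i in range(5)]
start = time.perf_counter()
results = asyncio.run(query_all(seq_ids))
elapsed = time.perf_counter() - start
print(results)
print(f"elapsed: {elapsed:.2f} s")
```

Because the five simulated requests wait concurrently, the total time is close to one delay rather than the sum of all five.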

Full documentation

See the full documentation at http://bold-retriever.readthedocs.org

History

  • v1.0.0: Using Twisted for asynchronous calls and increase in speed.
  • v0.2.4: Reorganizing columns in output file. Querying the API for family
    name of taxa.
  • v0.2.2: Killed bug in taxon search.
  • v0.2.1: Killed bug in scraping web Public_BIN for species ID.
  • v0.2.0: Scraping web Public_BIN for species ID.
  • v0.1.9: Added request_id test and option to run function in debug mode.
  • v0.1.8: Fixed bug for exception when BOLD sends empty list of taxon names.
  • v0.1.7: Fixed bug for exception when BOLD sends empty list of taxon names.
  • v0.1.6: Append taxon identification results to file as we get them.
  • v0.1.5: Additional tests, coverage 92%.
  • v0.1.4: Fixed bug in taxon_search function
  • v0.1.3: Coverage 75%
  • v0.1.2: Pep8 and test coverage 69%
  • v0.1.1: Packaged as Python module.
  • v0.1.0: You can specify which BOLD database should be used for BLAST of FASTA sequences.
  • v0.0.7: Catching exception for NULL, list and text returned instead of XML from BOLD.
  • v0.0.6: Catching exception for malformed XML from BOLD.
  • v0.0.5: Catch exception when BOLD sends funny data such as {"481541":[]}.

Release History

  • 1.0.0 (this version)
  • 0.2.4
  • 0.2.3
  • 0.2.2
  • 0.2.1
  • 0.2.0
  • 0.1.9
  • 0.1.8
  • 0.1.7
  • 0.1.6
  • 0.1.5
  • 0.1.4
  • 0.1.3
  • 0.1.2
  • 0.1.1

Download Files

  • bold_retriever-1.0.0.tar.gz (25.6 kB), source distribution, uploaded Nov 6, 2014
