
Queries the BOLD database to identify taxa from COI sequences

Project description


This script accepts FASTA files containing COI sequences and queries the BOLD database (http://boldsystems.org/) to identify the taxa that best match each sequence.
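To illustrate the kind of input the script expects, here is a minimal sketch of reading FASTA records in plain Python. It is not the parser bold_retriever itself uses; the function name `read_fasta` is hypothetical, and the nucleotide string is a made-up placeholder rather than real COI data.

```python
def read_fasta(lines):
    """Yield (seq_id, sequence) pairs from an iterable of FASTA lines."""
    seq_id, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if seq_id is not None:
                yield seq_id, "".join(chunks)
            # Use the first whitespace-delimited token of the header as the ID.
            tokens = line[1:].split()
            seq_id, chunks = (tokens[0] if tokens else ""), []
        elif seq_id is not None:
            # Sequence data may be wrapped over several lines.
            chunks.append(line)
    if seq_id is not None:
        yield seq_id, "".join(chunks)


# Placeholder record: the ID mirrors the README example, the bases are invented.
example = """>TE-14-27_FHYP_av
ACTTTATATTTTATTTTTGG
AGCATGAGCAGGAATAGTAG
""".splitlines()

records = list(read_fasta(example))
for seq_id, sequence in records:
    print(seq_id, len(sequence))
```

Each record pairs the sequence ID, which reappears in the `seq_id` column of the results, with the joined sequence string.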

Run this way

  • clone repository:

    cd $USERAPPL
    git clone https://github.com/carlosp420/bold_retriever.git
  • install dependencies:

    cd bold_retriever
    module load biopython-env
    pip install -r requirements.txt
  • run the software:

Choose one of the databases offered by the BOLD identification engine (http://www.boldsystems.org/index.php/resources/api?type=idengine) and pass it with the -db argument:

  • COX1_SPECIES

  • COX1

  • COX1_SPECIES_PUBLIC

  • COX1_L640bp
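When the script is launched from another program, the database name can be checked against the four options above before anything is sent to BOLD. The following is only a sketch: `build_command` and `BOLD_DATABASES` are hypothetical names, and the sole things taken from this README are the `-f` and `-db` flags and the database names.

```python
import sys

# Database names as listed in this README.
BOLD_DATABASES = ("COX1_SPECIES", "COX1", "COX1_SPECIES_PUBLIC", "COX1_L640bp")


def build_command(fasta_path, database, script="bold_retriever.py"):
    """Return the argument list for one bold_retriever run."""
    if database not in BOLD_DATABASES:
        raise ValueError(
            "unknown database %r, expected one of: %s"
            % (database, ", ".join(BOLD_DATABASES))
        )
    # sys.executable is the Python interpreter currently running this code.
    return [sys.executable, script, "-f", fasta_path, "-db", database]


cmd = build_command("ZA2013-0565.fasta", "COX1_SPECIES")
print(" ".join(cmd[1:]))
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` from inside the cloned repository directory.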

For example:

    python bold_retriever.py -f ZA2013-0565.fasta -db COX1_SPECIES
  • output:

    bold_id        seq_id            similarity  collection_country  division  taxon                        class    order    family
    FIDIP558-11    TE-14-27_FHYP_av  0.9884      Finland             animal    Diptera                      Insecta  Diptera  None
    GBDP6413-09    TE-14-27_FHYP_av  0.9242      None                animal    Ornithomya anchineura        Insecta  Diptera  Hippoboscidae
    GBDP2916-07    TE-14-27_FHYP_av  0.922       None                animal    Stenepteryx hirundinis       Insecta  Diptera  Hippoboscidae
    GBDP2919-07    TE-14-27_FHYP_av  0.9149      None                animal    Ornithomya biloba            Insecta  Diptera  Hippoboscidae
    GBDP2908-07    TE-14-27_FHYP_av  0.9078      None                animal    Ornithoctona sp. P-20        Insecta  Diptera  Hippoboscidae
    GBDP2918-07    TE-14-27_FHYP_av  0.9076      None                animal    Ornithomya chloropus         Insecta  Diptera  Hippoboscidae
    GBDP2935-07    TE-14-27_FHYP_av  0.8936      None                animal    Crataerina pallida           Insecta  Diptera  Hippoboscidae
    GBMIN26225-13  TE-14-27_FHYP_av  0.8889      None                animal    Lucilia sericata             Insecta  Diptera  Calliphoridae
    GBDP5820-09    TE-14-27_FHYP_av  0.8833      None                animal    Coenosia tigrina             Insecta  Diptera  Muscidae
    GBMIN26204-13  TE-14-27_FHYP_av  0.883       None                animal    Lucilia cuprina              Insecta  Diptera  Calliphoridae
    GBMIN18768-13  TE-14-27_FHYP_av  0.8823      Brazil              animal    Ornithoctona erythrocephala  Insecta  Diptera  Hippoboscidae
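A table like the one above can be post-processed to pick the closest match or to filter by similarity. The sketch below assumes the layout shown here, with columns separated by two or more spaces. The file the script actually writes may use a different delimiter, so check it first. The `parse_rows` helper and the 0.92 cut-off are illustrative choices, not part of bold_retriever.

```python
import re


def parse_rows(text):
    """Parse a whitespace-aligned results table into a list of dicts."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    # Columns are separated by runs of 2+ spaces; taxon names contain single spaces.
    header = re.split(r"\s{2,}", lines[0])
    rows = []
    for ln in lines[1:]:
        row = dict(zip(header, re.split(r"\s{2,}", ln)))
        row["similarity"] = float(row["similarity"])
        rows.append(row)
    return rows


# Three rows copied from the example output in this README.
sample = """\
bold_id        seq_id            similarity  collection_country  division  taxon                        class    order    family
FIDIP558-11    TE-14-27_FHYP_av  0.9884      Finland             animal    Diptera                      Insecta  Diptera  None
GBDP6413-09    TE-14-27_FHYP_av  0.9242      None                animal    Ornithomya anchineura        Insecta  Diptera  Hippoboscidae
GBMIN18768-13  TE-14-27_FHYP_av  0.8823      Brazil              animal    Ornithoctona erythrocephala  Insecta  Diptera  Hippoboscidae
"""

rows = parse_rows(sample)
best = max(rows, key=lambda r: r["similarity"])
close = [r["taxon"] for r in rows if r["similarity"] >= 0.92]  # arbitrary threshold
print(best["bold_id"], best["taxon"], best["similarity"])
print(close)
```

Note that missing values appear as the literal string `None` in this layout, so they should be handled explicitly before any further analysis.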

Full documentation

See the full documentation at http://bold-retriever.readthedocs.org

History

  • v0.2.2: Killed bug in taxon search.

  • v0.2.1: Killed bug in scraping web Public_BIN for species ID.

  • v0.2.0: Scraping web Public_BIN for species ID.

  • v0.1.9: Added request_id test and option to run function in debug mode.

  • v0.1.8: Fixed bug for exception when BOLD sends empty list of taxon names.

  • v0.1.7: Fixed bug for exception when BOLD sends empty list of taxon names.

  • v0.1.6: Append taxon identification results to file as we get them.

  • v0.1.5: Additional tests, coverage 92%

  • v0.1.4: Fixed bug in taxon_search function

  • v0.1.3: Coverage 75%

  • v0.1.2: Pep8 and test coverage 69%

  • v0.1.1: Packaged as Python module.

  • v0.1.0: You can specify which BOLD database should be used for BLAST of FASTA sequences.

  • v0.0.7: Catching exception for NULL, list and text returned instead of XML from BOLD.

  • v0.0.6: Catching exception for malformed XML from BOLD.

  • v0.0.5: Catch exception when BOLD sends funny data such as {"481541":[]}.

