
Get taxon information for sequences from the BOLD system


bold_identification

1 Introduction

See https://github.com/linzhi2013/bold_identification.

This is a Python 3 package that retrieves the taxonomy information of sequences from BOLD (http://www.boldsystems.org/index.php).

Currently, bold_identification runs only on macOS, 64-bit Windows, and Linux.

2 Installation

Install with bioconda, or with pip:

$ pip install bold_identification

This creates a bold_identification command in the same directory as your pip command.
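
To confirm that the command is reachable from your PATH after installation, a quick check along the following lines may help. This is a minimal sketch using only the Python standard library; the helper name `find_command` is hypothetical and not part of bold_identification.

```python
import shutil


def find_command(name="bold_identification"):
    """Return the full path of `name` if it is on PATH, otherwise None."""
    return shutil.which(name)


if __name__ == "__main__":
    path = find_command()
    if path is None:
        # The command lives next to pip, so that directory may be missing from PATH.
        print("bold_identification was not found on PATH")
    else:
        print(f"bold_identification found at {path}")
```

If the command is not found, adding the directory that contains your pip command to PATH is the usual remedy.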

3 Usage

Run bold_identification:

$ bold_identification
usage: bold_identification [-h] -i <str> [-f <str>] -o <str>
                          [-d {COX1,COX1_SPECIES,COX1_SPECIES_PUBLIC,COX1_L640bp,ITS,MATK_RBCL}]
                          [-n <int>] [-r <int>] [-c] [-D] [--version]

To identify taxa of given sequences from BOLD (http://www.boldsystems.org/).
Some sequences can fail to get taxon information, which can be caused by
TimeoutException if your network to the BOLD server is bad.
Those sequences will be output in the file '*.TimeoutException.fasta'.

You can:
1) run another searching with the same command directly (but add -c option);
2) lengthen the time to wait for each query (-t option);
3) increase submission times (-r option) for a sequence.

Also, the sequences without BOLD matches will be output in the
file '*.NoBoldMatchError.fasta'

By mengguanliang AT genomics DOT cn.
See https://github.com/linzhi2013/bold_identification.

version: 0.0.22

optional arguments:
  -h, --help            show this help message and exit
  -i <str>              input file name
  -f <str>              input file format [fasta]
  -o <str>              outfile
  -d {COX1,COX1_SPECIES,COX1_SPECIES_PUBLIC,COX1_L640bp,ITS,MATK_RBCL}
                        database to search [COX1]
  -n <int>              how many first top hits will be output. [1]
  -r <int>              Maximum submission time for a sequence, useful for
                        handling TimeOutException. [4]
  -c                    continuous mode, jump over the ones already in "-o"
                        file, will resubmit all the remained. use "-cc" to
                        also jump over the ones in "*.NoBoldMatchError.fasta"
                        file. [False]
  -D                    debug mode output [False]
  --version             show program's version number and exit
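
The rerun advice in the help text can be scripted. The following is a minimal sketch, not part of the package: it assembles an argument list from the flags shown above, then reruns the command in continuous mode (-c) while sequences remain in a '*.TimeoutException.fasta' file. The function names, the `max_rounds` parameter, and the example file names are hypothetical. The sketch also assumes that the timeout file is written to the working directory and refreshed on each run, which the help text does not state, and it ignores exit codes because they are not documented there.

```python
import subprocess
from pathlib import Path


def build_command(infile, outfile, database="COX1", top_hits=1,
                  max_submissions=4, continuous=False):
    """Assemble a bold_identification argument list from the documented flags."""
    cmd = [
        "bold_identification",
        "-i", str(infile),
        "-o", str(outfile),
        "-d", database,
        "-n", str(top_hits),
        "-r", str(max_submissions),
    ]
    if continuous:
        cmd.append("-c")  # skip sequences already present in the -o file
    return cmd


def count_fasta_records(path):
    """Count FASTA records by counting header lines that start with '>'."""
    with open(path) as handle:
        return sum(1 for line in handle if line.startswith(">"))


def pending_timeouts(directory="."):
    """Sum the records over every '*.TimeoutException.fasta' file in `directory`."""
    files = Path(directory).glob("*.TimeoutException.fasta")
    return sum(count_fasta_records(f) for f in files)


def run_with_reruns(infile, outfile, max_rounds=3, **options):
    """Run once, then rerun in continuous mode while timed-out sequences remain.

    Assumption: the timeout file lands in the current directory and is
    rewritten on each run; adjust `pending_timeouts` if that is not the case.
    """
    subprocess.run(build_command(infile, outfile, **options))
    for _ in range(max_rounds):
        if pending_timeouts() == 0:
            break
        subprocess.run(build_command(infile, outfile, continuous=True, **options))


if __name__ == "__main__":
    # Only print the command here; call run_with_reruns(...) to actually query BOLD.
    print(" ".join(build_command("seqs.fasta", "seqs.bold.txt", max_submissions=6)))
```

Keeping the command construction separate from the execution makes it easy to inspect the exact arguments before any query is sent to the BOLD server.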

4 Citation

When you use bold_identification in your study, please cite:

Yang, Chentao, Shangjin Tan, Guanliang Meng, David G. Bourne, Paul A. O'Brien, Junqiang Xu, Sha Liao, Ao Chen, Xiaowei Chen, and Shanlin Liu. "Access COI barcode efficiently using high throughput Single End 400 bp sequencing." bioRxiv (2018): 498618. DOI: 10.1101/498618
