Retrieves isolation sources from NCBI for a set of sequences with specified accession numbers. Both nucleotide and protein accessions are accepted.
Project description
**GetIsolationSources** is a small command-line utility that, given FASTA files containing GenBank IDs in sequence descriptions, generates a per-sequence list of isolation sources and their distribution (i.e., the number of sequences per isolation source).
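The relationship between the two outputs can be sketched as follows. This is a minimal illustration, not the utility's own code: the function name `source_distribution` and the sample records are hypothetical, and the per-sequence list is assumed to be a sequence of `(sequence_id, isolation_source)` pairs.

```python
from collections import Counter


def source_distribution(per_sequence):
    """Count sequences per isolation source from (sequence_id, source) pairs."""
    return Counter(source for _seq_id, source in per_sequence)


# Made-up placeholder records, for illustration only.
records = [("seq1", "soil"), ("seq2", "hot spring"), ("seq3", "soil")]
distribution = source_distribution(records)

# Print the sources, most frequent first.
for source, count in distribution.most_common():
    print(f"{source}\t{count}")
```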
It searches for IDs using regular expressions in accordance with [NCBI specifications](http://www.ncbi.nlm.nih.gov/Sequin/acc.html), so the format of description strings does not matter.
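As a rough sketch of how such a search can work, the pattern below covers a few common accession layouts (one letter plus five digits, two letters plus six digits, three letters plus five digits, WGS-style and RefSeq-style IDs, each with an optional version suffix). It is a simplified assumption for illustration and not the pattern the utility actually uses; the names `ACCESSION_RE` and `extract_accessions` are hypothetical.

```python
import re

# Simplified accession pattern (an assumption, not the utility's own regex).
ACCESSION_RE = re.compile(
    r"\b(?:"
    r"[A-Z]{2}_\d{6,9}"    # RefSeq style, e.g. NC_000913
    r"|[A-Z]{4}\d{8,10}"   # WGS style: 4 letters + 8 to 10 digits
    r"|[A-Z]{3}\d{5}"      # protein: 3 letters + 5 digits
    r"|[A-Z]{2}\d{6}"      # nucleotide: 2 letters + 6 digits
    r"|[A-Z]\d{5}"         # nucleotide: 1 letter + 5 digits
    r")(?:\.\d+)?\b"       # optional version suffix such as ".1"
)


def extract_accessions(description):
    """Return every accession-like token found in a FASTA description line."""
    return [match.group(0) for match in ACCESSION_RE.finditer(description)]


# The ID is found regardless of where it sits in the description.
headers = [
    ">U49845.1 Saccharomyces cerevisiae TCP1-beta gene",
    ">sample_7 | AAA98665.1 | hypothetical protein",
    ">NC_000913.3 Escherichia coli",
]
for header in headers:
    print(extract_accessions(header))
```

A pattern this loose can also match unrelated tokens that happen to look like accessions (for example, locus names made of three letters and five digits), so a real implementation would likely need stricter rules.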
To obtain the needed information, it uses automated Entrez queries, so a working Internet connection is required. Queries are made in accordance with NCBI load-balancing regulations, so processing several thousand records may take several minutes or even longer.
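The reason large inputs take minutes can be seen in a minimal throttling sketch. It assumes IDs are sent in batches with a pause between requests; the names `batched` and `fetch_throttled`, the batch size, and the 0.34-second interval (roughly three requests per second) are illustrative assumptions, not the utility's actual settings. The `fetch` argument stands in for a real Entrez call, for example a wrapper around BioPython's `Bio.Entrez.efetch`.

```python
import time


def batched(ids, size):
    """Yield consecutive slices of `ids` with at most `size` items each."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def fetch_throttled(ids, fetch, batch_size=100, min_interval=0.34,
                    sleep=time.sleep, clock=time.monotonic):
    """Call `fetch` once per batch, at least `min_interval` seconds apart.

    `fetch` is a caller-supplied function (hypothetical here) that takes a
    list of IDs and returns a list of results.
    """
    results = []
    last_request = None
    for batch in batched(list(ids), batch_size):
        if last_request is not None:
            wait = min_interval - (clock() - last_request)
            if wait > 0:
                sleep(wait)  # pause so requests stay under the rate limit
        last_request = clock()
        results.extend(fetch(batch))
    return results


def fake_fetch(batch):
    """Stand-in for a network query; returns placeholder values."""
    return [(seq_id, "unknown") for seq_id in batch]


print(fetch_throttled(["id1", "id2", "id3"], fake_fetch, batch_size=2))
```

With these assumed settings, each request carries up to 100 IDs, so the total run time is driven mostly by how long NCBI takes to answer each request rather than by the pause itself.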
It is distributed as source code that supports Python setuptools.
**GetIsolationSources uses [BioPython](http://biopython.org/wiki/Main_Page).** If you are using the source distribution, the latest version of [BioPython](http://biopython.org/wiki/Main_Page) should be installed.
[**Downloads**](https://github.com/allista/GetIsolationSource/releases)
***
**GetIsolationSources** by [**Allis Tauri**](https://github.com/allista) is licensed under the [MIT](https://github.com/allista/GetIsolationSources/blob/master/LICENSE) license.
Hashes for GetIsolationSources-1.5.1.2.tar.gz
Algorithm | Hash digest
---|---
SHA256 | 52b06e0f5264722f77d0eeb5e8222b2e295efbf61ab1a64919e12ad595b5f6e8
MD5 | 75681ce77e81c5898273bd01a6f09760
BLAKE2b-256 | ff47d394f16e74decedb2400b8635933230e8bb49832971fa781607c83a60718