Retrieves isolation sources from NCBI for a set of sequences with specified accession numbers. Both nucleotide and protein accessions are accepted.
Project description
**GetIsolationSources** is a small command-line utility that, given FASTA files containing GenBank IDs in sequence descriptions, generates a per-sequence list of isolation sources and their distribution (i.e., the number of sequences per isolation source).
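As a rough illustration of the two outputs described above, the per-sequence list can be thought of as a mapping from accession to isolation source, and the distribution as a count over its values. This is a minimal sketch, not the utility's actual code; the function name `source_distribution` and the sample accessions and sources are hypothetical.

```python
from collections import Counter


def source_distribution(sources_by_sequence):
    """Count how many sequences share each isolation source."""
    return Counter(sources_by_sequence.values())


# Hypothetical per-sequence results (accession -> isolation source).
per_sequence = {
    "AB000001": "hot spring",
    "AB000002": "marine sediment",
    "AB000003": "hot spring",
}

distribution = source_distribution(per_sequence)

# Print the distribution, most frequent source first.
for source, count in distribution.most_common():
    print(f"{source}\t{count}")
```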
It searches for IDs using regular expressions in accordance with [NCBI specifications](http://www.ncbi.nlm.nih.gov/Sequin/acc.html), so the format of description strings does not matter.
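The idea of format-independent ID extraction can be sketched with a single regular expression. The pattern below is a simplified assumption covering only a few common accession shapes (RefSeq with an underscore, WGS-style, three-letter protein, and one- or two-letter nucleotide prefixes, each with an optional version suffix); it is not the expression the utility itself uses, and `find_accessions` is a hypothetical helper name.

```python
import re

# Simplified, illustrative pattern; the real NCBI specification has more cases.
ACCESSION_RE = re.compile(
    r"\b(?:"
    r"[A-Z]{2}_\d{6,9}"            # RefSeq style, e.g. NC_000913
    r"|[A-Z]{4}\d{8,10}"           # WGS style: 4 letters + digits
    r"|[A-Z]{3}\d{5}(?:\d{2})?"    # protein: 3 letters + 5 or 7 digits
    r"|[A-Z]{2}\d{6}(?:\d{2})?"    # nucleotide: 2 letters + 6 or 8 digits
    r"|[A-Z]\d{5}"                 # nucleotide: 1 letter + 5 digits
    r")(?:\.\d+)?\b"               # optional version suffix such as ".1"
)


def find_accessions(description):
    """Return all accession-like tokens found in a FASTA description line."""
    return ACCESSION_RE.findall(description)


# The surrounding layout of the description line does not matter.
print(find_accessions(">gi|12345|gb|AB123456.1| Some organism"))
print(find_accessions(">NC_000913.3 Escherichia coli"))
```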
To obtain the needed information it uses automated Entrez queries, so you need a working Internet connection to perform the analysis. Queries are made in accordance with NCBI load-balancing regulations, so processing several thousand records may take several minutes or even longer.
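To show why rate limits stretch the running time, here is a minimal sketch of batched, throttled querying. It assumes a limit of roughly three requests per second (NCBI's documented default for clients without an API key, as far as I know) and takes the actual network call as a hypothetical `fetch_batch` callable, which in practice might wrap something like `Bio.Entrez.efetch`. The names `batched` and `fetch_all`, the batch size, and the interval are illustrative assumptions, not the utility's implementation.

```python
import time


def batched(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def fetch_all(accessions, fetch_batch, batch_size=100, min_interval=0.34,
              sleep=time.sleep, clock=time.monotonic):
    """Call `fetch_batch` once per batch, keeping requests `min_interval` apart.

    `fetch_batch` is a hypothetical callable that takes a list of accessions
    and returns a list of records. `sleep` and `clock` are injectable so the
    throttling logic can be exercised without real waiting.
    """
    results = []
    last_request = None
    for batch in batched(list(accessions), batch_size):
        if last_request is not None:
            # Wait out whatever remains of the minimum interval.
            wait = min_interval - (clock() - last_request)
            if wait > 0:
                sleep(wait)
        last_request = clock()
        results.extend(fetch_batch(batch))
    return results
```

With these assumed defaults, several thousand accessions become a few dozen requests; the throttle itself adds only seconds, so most of the waiting comes from the server responding to each request.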
It is distributed as source code with Python setuptools support.
**GetIsolationSources uses [BioPython](http://biopython.org/wiki/Main_Page).** If you're using the source code distribution, the latest version of [BioPython](http://biopython.org/wiki/Main_Page) should be installed.
[**Downloads**](https://github.com/allista/GetIsolationSource/releases)
***
**GetIsolationSources** by [**Allis Tauri**](https://github.com/allista) is licensed under the [MIT](https://github.com/allista/GetIsolationSources/blob/master/LICENSE) license.
Source Distribution
Hashes for GetIsolationSources-1.5.2.tar.gz

Algorithm | Hash digest
---|---
SHA256 | 71f781178896a74d95d68f9ce4e314cc5bebee9e6b6aa966391ac94ea0d5a33b
MD5 | bdfb55f0d3c2fe6f52ade0d0312b722b
BLAKE2b-256 | 51667a6adcb29a2f39619317a37e792845e2fc840576cf13a2261087c909dc6e