Retrieves isolation sources from NCBI for a set of sequences identified by their accession numbers. Both nucleotide and protein accessions are accepted.
**GetIsolationSources** is a small command-line utility that, given FASTA files containing GenBank IDs in the sequence descriptions, generates a per-sequence list of isolation sources and their distribution (i.e. the number of sequences per isolation source).
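As a rough illustration of what "distribution" means here, the sketch below counts sequences per isolation source. The function name `source_distribution`, the `"unknown"` label, and the sample accessions are hypothetical and chosen only for this example; the tool's actual output format is not shown here.

```python
from collections import Counter


def source_distribution(sources_by_id):
    """Count how many sequences share each isolation source.

    sources_by_id maps an accession string to its isolation source,
    or to None when the record has no isolation source.
    """
    # Records without an isolation source are grouped under "unknown"
    # (a label assumed for this sketch).
    return Counter(src if src else "unknown" for src in sources_by_id.values())


# Made-up data for illustration only.
example = {
    "AB000001": "soil",
    "AB000002": "hot spring",
    "AB000003": "soil",
    "AB000004": None,
}

dist = source_distribution(example)
for source, count in dist.most_common():
    print(f"{source}\t{count}")
```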
It searches for IDs using regular expressions in accordance with [NCBI specifications](http://www.ncbi.nlm.nih.gov/Sequin/acc.html), so the format of description strings does not matter.
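The idea of format-independent ID detection can be sketched with a single regular expression. The pattern below is a simplified assumption covering a few common accession shapes (one letter plus five digits, two letters plus six digits, three letters plus five digits, and RefSeq-style prefixes with an underscore); it is not the pattern the tool itself uses and does not cover every format in the NCBI specification.

```python
import re

# Simplified accession pattern (an assumption for illustration):
#   NC_000913   RefSeq-style: two letters, underscore, 6-9 digits
#   AAA98665    protein: three letters + five digits
#   AB123456    nucleotide: two letters + six digits
#   U49845      nucleotide: one letter + five digits
# An optional ".N" version suffix is allowed.
ACCESSION_RE = re.compile(
    r"\b(?:[A-Z]{2}_\d{6,9}|[A-Z]{3}\d{5}|[A-Z]{2}\d{6}|[A-Z]\d{5})(?:\.\d+)?\b"
)


def find_accessions(description):
    """Return all accession-like tokens found in a FASTA description line."""
    return ACCESSION_RE.findall(description)


# The same ID is found regardless of how the description is laid out.
print(find_accessions(">gi|12345|gb|AB123456.1| Some organism 16S rRNA gene"))
print(find_accessions(">U49845 Saccharomyces cerevisiae"))
```

Because the pattern is anchored on word boundaries rather than on field separators, the layout of the description line does not affect the match.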
To obtain the needed information it uses automated Entrez queries, so you need a working Internet connection to perform the analysis. Queries are made in accordance with NCBI load-balancing regulations, so processing several thousand records may take several minutes or even longer.
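A minimal sketch of such a throttled lookup is shown below, assuming Biopython's `Bio.Entrez.efetch` and GenBank flat-file output. The helper names, the one-record-per-request strategy, and the 0.4 second delay are assumptions made for this example, not the tool's actual implementation; protein accessions would typically need `db="protein"` with `rettype="gp"`.

```python
import re
import time


def extract_isolation_source(genbank_text):
    """Return the /isolation_source value from GenBank flat-file text, or None."""
    match = re.search(r'/isolation_source="([^"]*)"', genbank_text)
    if match is None:
        return None
    # Long qualifier values wrap across lines; collapse the whitespace.
    return " ".join(match.group(1).split())


def fetch_isolation_sources(accessions, email, db="nucleotide", delay=0.4):
    """Fetch records one by one from NCBI and map accession -> isolation source.

    Requires Biopython and network access. The pause between requests is
    an assumed value meant to stay under NCBI's request-rate limits.
    """
    from Bio import Entrez  # imported lazily so the parser works offline

    Entrez.email = email  # NCBI asks callers to identify themselves
    result = {}
    for acc in accessions:
        handle = Entrez.efetch(db=db, id=acc, rettype="gb", retmode="text")
        try:
            result[acc] = extract_isolation_source(handle.read())
        finally:
            handle.close()
        time.sleep(delay)
    return result
```

Fetching one record per request keeps the sketch simple but is slow for large inputs, which is consistent with the processing times mentioned above.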
It is distributed as source code and supports Python setuptools.
**GetIsolationSources uses [BioPython](http://biopython.org/wiki/Main_Page).** If you are using the source distribution, the latest version of [BioPython](http://biopython.org/wiki/Main_Page) should be installed.
[**Downloads**](https://github.com/allista/GetIsolationSource/releases)
***
**GetIsolationSources** by [**Allis Tauri**](https://github.com/allista) is licensed under the [MIT](https://github.com/allista/GetIsolationSources/blob/master/LICENSE) license.
Source Distribution
File details for GetIsolationSources-1.5.2.tar.gz
File metadata
- Download URL: GetIsolationSources-1.5.2.tar.gz
- Upload date:
- Size: 7.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 71f781178896a74d95d68f9ce4e314cc5bebee9e6b6aa966391ac94ea0d5a33b |
| MD5 | bdfb55f0d3c2fe6f52ade0d0312b722b |
| BLAKE2b-256 | 51667a6adcb29a2f39619317a37e792845e2fc840576cf13a2261087c909dc6e |