
CLI and utilities for Genetic analysis and database interface


geneutils


This tool was originally inspired by earlier work on dealing with nucleotide sequences, and more tools will be added as needed. The toolbox requires Python 3 and can be installed with pip, with conda, or from source.


Installation

This assumes that you have Python 3 and pip installed on your system. You can test this by opening a terminal (or the Windows command prompt) and running

python and then pip list

If you get no errors and you have Python 3 or higher, you should be good to go. You can install geneutils in either of two ways

pip install geneutils

or you can also try

pip install geneutils --user

To upgrade from an older version you can try

pip install geneutils --upgrade

or

pip install geneutils --user --upgrade

You can also install from source

git clone https://github.com/samapriya/geneutils.git
cd geneutils
python setup.py install

There is also a conda installer for geneutils on Linux/macOS

conda install -c samapriya geneutils

The recommended way is to work inside a conda environment/terminal and then run pip install geneutils

Installation is an optional step; the application can also be run directly by executing the geneutils.py script. The advantage of installing it is being able to execute geneutils like any other command line tool. I recommend installing within a virtual environment. If you don't want to install, browse into the geneutils folder and try python geneutils.py -h to get the same result.

geneutils cli tools

This is a command line tool and it is designed to simply call the tools you need.


init

It turns out there are benefits to registering for an NCBI account and using your email address and API key. Apart from giving NCBI a way of contacting you, the API key raises the rate limit imposed on your queries. This is a recommended step that you should not skip, though it is possible to use the blasthit tool without the API key. The NCBI account page provides this information

E-utils users are allowed 3 requests/second without an API key. Create an API key to increase your e-utils limit to 10 requests/second......Only one API Key per user. Replacing or deleting will inactivate the current key. Use this key by passing it with api_key=API_KEY parameter.
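To make the quoted limits concrete, here is a minimal sketch of a throttle that spaces out requests at 3 per second without an API key and 10 per second with one. The class name RequestThrottle and its parameters are hypothetical and are not part of geneutils; this only illustrates the arithmetic behind the limits.

```python
import time


class RequestThrottle:
    """Space out calls so they stay under a requests-per-second cap."""

    def __init__(self, api_key=None, clock=time.monotonic, sleep=time.sleep):
        # Limits quoted by NCBI: 3 requests/second without a key, 10 with one.
        self.max_per_second = 10 if api_key else 3
        self.min_interval = 1.0 / self.max_per_second
        # clock and sleep are injectable so the class can be tested
        # without real waiting (an illustrative design choice).
        self._clock = clock
        self._sleep = sleep
        self._last_call = None

    def wait(self):
        """Block until enough time has passed since the previous call."""
        now = self._clock()
        if self._last_call is not None:
            remaining = self.min_interval - (now - self._last_call)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last_call = now
```

Calling throttle.wait() before each request keeps consecutive requests at least 1/3 s apart without a key, or 1/10 s apart with one.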

Generate your API Key

The init tool saves your email address and API key on your local machine, so you do not have to type them in over and over again. The API key prompt is hidden: for safety, you cannot see the key as you type or paste it.
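As a rough illustration of how such a credential step could work, the sketch below prompts for the email, reads the API key with getpass so it is not echoed, and stores both in a JSON file. The file name .geneutils_credentials.json and the function names are assumptions made for this example; the actual storage location and format used by geneutils init may differ.

```python
import getpass
import json
from pathlib import Path

# Hypothetical location; the real tool may store credentials elsewhere.
DEFAULT_CRED_FILE = Path.home() / ".geneutils_credentials.json"


def save_credentials(email, api_key, cred_file=DEFAULT_CRED_FILE):
    """Write the email and API key to a JSON file and return its path."""
    cred_file = Path(cred_file)
    cred_file.write_text(json.dumps({"email": email, "api_key": api_key}))
    return cred_file


def load_credentials(cred_file=DEFAULT_CRED_FILE):
    """Return the saved credentials as a dict, or None if none are saved."""
    cred_file = Path(cred_file)
    if not cred_file.exists():
        return None
    return json.loads(cred_file.read_text())


def prompt_and_save(cred_file=DEFAULT_CRED_FILE):
    """Ask for the email and API key interactively, then save them."""
    email = input("NCBI email: ").strip()
    # getpass does not echo typed or pasted characters to the terminal.
    api_key = getpass.getpass("NCBI API key (input hidden): ").strip()
    return save_credentials(email, api_key, cred_file)
```

Later commands could then call load_credentials() and fall back to asking for an email only when it returns None.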

blasthit

This script is intended for taxonomic annotation of BLAST results (blastn, tblastn, blastp or blastx) saved in Hit Table CSV format, where GenBank accession numbers are in the 4th column. It uses the efetch function from the Bio.Entrez package to get information about accessions from the GenBank Nucleotide (Nuccore) or Protein databases. The output is an annotated CSV file "*_annotated.csv" with the following columns added:

  • Record name
  • Species name
  • Full taxonomy
  • Reference
  • Date of update
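The workflow described above can be sketched roughly as follows. This is not the blasthit source code; it is a minimal illustration that assumes the Hit Table has no header row, that Biopython is installed, and that the GenBank XML field names used below (GBSeq_definition, GBSeq_organism, and so on) match what efetch returns. The function names read_accessions and fetch_annotations are hypothetical.

```python
import csv


def read_accessions(hit_table_path, column=3):
    """Return unique accessions from the 4th column (index 3), in file order."""
    seen = set()
    accessions = []
    with open(hit_table_path, newline="") as handle:
        for row in csv.reader(handle):
            # Skip empty rows and rows too short to hold an accession.
            if len(row) <= column or not row[column].strip():
                continue
            accession = row[column].strip()
            if accession not in seen:
                seen.add(accession)
                accessions.append(accession)
    return accessions


def fetch_annotations(accessions, db_flag, email, api_key=None):
    """Fetch GenBank records; db_flag is 'n' (nuccore) or 'p' (protein)."""
    from Bio import Entrez  # requires Biopython; imported only when needed

    Entrez.email = email
    if api_key:
        Entrez.api_key = api_key
    database = {"n": "nuccore", "p": "protein"}[db_flag]
    handle = Entrez.efetch(
        db=database, id=",".join(accessions), rettype="gb", retmode="xml"
    )
    try:
        records = Entrez.read(handle)
    finally:
        handle.close()

    annotations = []
    for rec in records:
        # Field names are assumed from the GenBank XML (GBSeq) layout.
        refs = rec.get("GBSeq_references") or []
        annotations.append({
            "accession": rec.get("GBSeq_primary-accession"),
            "record_name": rec.get("GBSeq_definition"),
            "species": rec.get("GBSeq_organism"),
            "taxonomy": rec.get("GBSeq_taxonomy"),
            "reference": refs[0].get("GBReference_title") if refs else None,
            "updated": rec.get("GBSeq_update-date"),
        })
    return annotations
```

For large Hit Tables it would be reasonable to pass the accessions to fetch_annotations in smaller batches rather than in one request; the batch size is left to the caller here.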


Arguments:

  • path: Path to the CSV-formatted Hit Table file with BLAST results. Positional argument.
  • db: For the output of nucleotide blast or tblastn, use n. For the output of protein blast or blastx, use p. Positional argument.
  • email: Your NCBI email. Optional argument.
  • -h, --help: Show help message and exit. Optional argument.
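For readers who want to see how arguments like these map onto a parser, here is a minimal argparse sketch. It is an assumption-laden illustration, not the actual geneutils parser: in particular, the --email flag spelling and the build_parser function name are hypothetical, since the source only says that email is optional.

```python
import argparse


def build_parser():
    """Build a parser with a blasthit subcommand mirroring the listed arguments."""
    parser = argparse.ArgumentParser(prog="geneutils")
    subparsers = parser.add_subparsers(dest="command")

    blasthit = subparsers.add_parser(
        "blasthit", help="Annotate a BLAST Hit Table CSV"
    )
    # Positional arguments, in the order listed above.
    blasthit.add_argument("path", help="Path to the CSV-formatted Hit Table file")
    blasthit.add_argument(
        "db",
        choices=["n", "p"],
        help="n for nucleotide blast or tblastn, p for protein blast or blastx",
    )
    # Assumed flag spelling for the optional email argument.
    blasthit.add_argument("--email", default=None, help="Your NCBI email")
    return parser
```

argparse adds -h/--help automatically, so build_parser().parse_args(["blasthit", "-h"]) prints the help text and exits.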

Changelog

0.0.6

  • Dedupes accession IDs and removes empty rows
  • Cleaner subset handling and better error handling with try/except blocks
  • General code improvements to style and functionality
  • Performance improvements to overall runtime

0.0.5

  • Fixed issue with empty rows
  • Includes version check
  • general code cleanup

0.0.4

  • Fixed issue with path basename

0.0.3

  • Fixed issue with repeating email id for accession blocks
  • Updated ReadMe to include geneutils init as first step.
  • Fixed CSV write issue outside loop

0.0.2

  • Added credential tool to save NCBI email and API Key
  • Minor fixes to overall functionality.

