
Calculates SEGUID, cSEGUID & lSEGUID checksums for biological sequences


seguid_calculator


Seguid_calculator is a GUI for calculating the uSEGUID, lSEGUID and cSEGUID checksums for a biological sequence (DNA, RNA or protein).

Installation

The quickest way to use seguid_calculator is to download one of the apps, which require no installation at all. They are available here. Pick the correct file for your system:

OS       File
Windows  seguid_calculator.exe
macOS    seguid_calculator_for_mac.zip
Linux    seguid_calculator

No DEB or RPM packages are available yet; please let me know if they are needed. These apps and packages are built automatically using GitHub Actions. There is also an online version (see the links at the end of this page).

Source installation

Installation from PyPI:

pip install seguid_calculator

What does it do?

The SEGUID checksum is defined as the base64 encoded SHA-1 cryptographic hash of a primary biological sequence in uppercase. SEGUID was suggested by Babnigg and Giometti as a way to provide stable identifiers of protein sequences in databases for cross referencing.
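The definition above can be sketched in a few lines of Python using only the standard library. This is a minimal illustration, not the code used by seguid_calculator; the function name seguid is chosen here for convenience, and stripping the trailing "=" padding is an assumption that follows the convention used by the Biopython implementation.

```python
import base64
import hashlib


def seguid(seq: str) -> str:
    """Return the base64-encoded SHA-1 digest of the uppercased sequence."""
    digest = hashlib.sha1(seq.upper().encode("ascii")).digest()
    # Assumption: the trailing "=" padding is dropped, as Biopython does.
    return base64.b64encode(digest).decode("ascii").rstrip("=")


if __name__ == "__main__":
    # Case is ignored, so both spellings give the same checksum.
    print(seguid("acgt") == seguid("ACGT"))
    # A 20-byte SHA-1 digest always encodes to 27 characters without padding.
    print(len(seguid("ACGT")))
```

Because the sequence is uppercased before hashing, the checksum does not depend on how the sequence was typed or stored.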

There are several implementations of SEGUID calculation available, such as the one in Biopython's Bio.SeqUtils.CheckSum module. See slides and the Biopython wiki.

See also this blog post on the subject.

uSEGUID

uSEGUID is a base64url encoded version of SHA-1, where the plus and forward slash characters ("+", "/") of standard base64 are replaced by "-" and "_", respectively. This makes it possible to use the checksum as part of a URL.
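As a rough sketch, the URL-safe variant can be obtained with base64.urlsafe_b64encode from the Python standard library, which maps "+" to "-" and "/" to "_". The function name useguid and the stripping of "=" padding are assumptions made for this example, not necessarily what seguid_calculator does internally.

```python
import base64
import hashlib


def useguid(seq: str) -> str:
    """Return the base64url-encoded SHA-1 digest of the uppercased sequence."""
    digest = hashlib.sha1(seq.upper().encode("ascii")).digest()
    # Assumption: the trailing "=" padding is dropped.
    return base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")


def seguid_standard(seq: str) -> str:
    """Same digest in standard base64, shown only for comparison."""
    digest = hashlib.sha1(seq.upper().encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii").rstrip("=")


if __name__ == "__main__":
    seq = "ACGTACGTACGT"
    # The two encodings differ only in the "+" and "/" characters.
    translated = seguid_standard(seq).translate(str.maketrans("+/", "-_"))
    print(useguid(seq) == translated)
```

The result contains only letters, digits, "-" and "_", so it can be placed in a URL path or query string without escaping.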

cSEGUID

Circular uSEGUID, or cSEGUID, is the uSEGUID checksum for circular (DNA) sequences. A circular sequence of length n can be written as up to n different rotations, and as many again for its reverse complement, so applying uSEGUID directly would give many checksums for the same molecule. The cSEGUID is defined as the uSEGUID of the lexicographically minimal string rotation of a sequence or of its reverse complement (whichever is lexicographically smaller). The cSEGUID provides a unique and stable identifier for circular sequences, such as plasmids.
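The idea can be sketched as follows. This is an illustrative example under stated assumptions, not the seguid_calculator implementation: the helper names are hypothetical, only the unambiguous DNA letters A, C, G and T are complemented, and the minimal rotation is found by a naive quadratic search (a linear-time method such as Booth's algorithm would be preferable for long plasmids).

```python
import base64
import hashlib

# Assumption: unambiguous DNA only; other letters are left unchanged.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def useguid(seq: str) -> str:
    """base64url SHA-1 of the uppercased sequence, padding stripped."""
    digest = hashlib.sha1(seq.upper().encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")


def reverse_complement(seq: str) -> str:
    """Complement each base, then reverse the string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def min_rotation(seq: str) -> str:
    """Lexicographically smallest rotation (naive O(n^2) search)."""
    seq = seq.upper()
    if not seq:
        return seq
    return min(seq[i:] + seq[:i] for i in range(len(seq)))


def cseguid(seq: str) -> str:
    """uSEGUID of the smaller of the two canonical rotations."""
    canonical = min(min_rotation(seq), min_rotation(reverse_complement(seq)))
    return useguid(canonical)


if __name__ == "__main__":
    plasmid = "AAGCTTGG"
    rotated = plasmid[3:] + plasmid[:3]  # same circle, different start
    print(cseguid(plasmid) == cseguid(rotated))
    print(cseguid(plasmid) == cseguid(reverse_complement(plasmid)))
```

Any rotation of the sequence, on either strand, is reduced to the same canonical string before hashing, which is why all of them share one checksum.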

Example

The cSEGUID checksum can be useful to quickly determine whether two sequences refer to the same plasmid vector. The sequence of the plasmid pFA6a-GFPS65T-kanMX6 is available from GenBank and from other sources such as the Forsburg lab, sequence here, a copy of which was saved here.

Both sequences are the same size and claim to describe the same vector. Analysing both in seguid_calculator shows that they are in fact representations of the same circular sequence, since the GenBank and Forsburg versions yield identical cSEGUIDs.

lSEGUID

The lSEGUID is the uSEGUID of the lexicographically smaller of the sense and anti-sense strands of a blunt double stranded DNA sequence. This means that a sequence and its reverse complement have the same lSEGUID. This can be useful to identify double stranded DNA sequences regardless of which strand is presented.
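A minimal sketch of this definition is given below. The helper names are hypothetical and only the unambiguous DNA letters A, C, G and T are handled; it is meant to illustrate the idea rather than reproduce the seguid_calculator code.

```python
import base64
import hashlib

# Assumption: unambiguous DNA only; other letters are left unchanged.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def useguid(seq: str) -> str:
    """base64url SHA-1 of the uppercased sequence, padding stripped."""
    digest = hashlib.sha1(seq.upper().encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")


def reverse_complement(seq: str) -> str:
    """Complement each base, then reverse the string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def lseguid(seq: str) -> str:
    """uSEGUID of whichever strand sorts first lexicographically."""
    sense = seq.upper()
    antisense = reverse_complement(sense)
    return useguid(min(sense, antisense))


if __name__ == "__main__":
    fragment = "AAGCTTGG"
    # Both strands of the same blunt fragment give one checksum.
    print(lseguid(fragment) == lseguid(reverse_complement(fragment)))
```

Unlike the cSEGUID, no rotation is applied here, so a linear fragment that starts at a different position is treated as a different sequence.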

Implementation

Seguid_calculator is written in Python 3 with wxPython 4, which is its only dependency. Development happens on GitHub.

Online version

There is also an online version built with Flask and hosted on PythonAnywhere.


How to install on PythonAnywhere:

16:33 ~ $ mkvirtualenv --python=python3.9 MyVirtualenv
created virtual environment CPython3.9.5.final.0-64 in 13108ms
  creator CPython3Posix(dest=/home/seguidcalculator/.virtualenvs/MyVirtualenv, clear=False, no_vcs_ignore=False, global=False)
  seeder FromAppData(download=False, pip=bundle, setuptools=bundle, wheel=bundle, via=copy, app_data_dir=/home/seguidcalculator/.local/share/virtualenv)
    added seed packages: pip==21.3, setuptools==58.2.0, wheel==0.37.0
  activators BashActivator,CShellActivator,FishActivator,PowerShellActivator,PythonActivator,XonshActivator
virtualenvwrapper.user_scripts creating /home/seguidcalculator/.virtualenvs/MyVirtualenv/bin/predeactivate             
virtualenvwrapper.user_scripts creating /home/seguidcalculator/.virtualenvs/MyVirtualenv/bin/postdeactivate            
virtualenvwrapper.user_scripts creating /home/seguidcalculator/.virtualenvs/MyVirtualenv/bin/preactivate               
virtualenvwrapper.user_scripts creating /home/seguidcalculator/.virtualenvs/MyVirtualenv/bin/postactivate              
virtualenvwrapper.user_scripts creating /home/seguidcalculator/.virtualenvs/MyVirtualenv/bin/get_env_details           
(MyVirtualenv) 16:36 ~ $ pip install flask flask-wtf wtforms                                                           
Looking in links: /usr/share/pip-wheels                                                                                
Collecting flask                                                                                                       
  Downloading Flask-2.2.2-py3-none-any.whl (101 kB)                                                                    
     |████████████████████████████████| 101 kB 2.1 MB/s                                                                
Collecting flask-wtf                                                                                                   
(MyVirtualenv) 16:37 ~ $                                                                                               
(MyVirtualenv) 16:40 ~ $ git checkout https://github.com/BjornFJohansson/seguid_calculator.git                         
fatal: not a git repository (or any parent up to mount point /home)                                                    
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).                                             
(MyVirtualenv) 16:43 ~ $ git clone https://github.com/BjornFJohansson/seguid_calculator.git                            
Cloning into 'seguid_calculator'...                                                                                    
remote: Enumerating objects: 1555, done.
remote: Counting objects: 100% (441/441), done.
remote: Compressing objects: 100% (159/159), done.
remote: Total 1555 (delta 236), reused 437 (delta 232), pack-reused 1114
Receiving objects: 100% (1555/1555), 76.46 MiB | 53.41 MiB/s, done.                                                    
Resolving deltas: 100% (879/879), done.                                                                                
Updating files: 100% (48/48), done.                                                                                    
(MyVirtualenv) 16:44 ~ $ ls                                                                                            
README.txt  seguid_calculator                                                                                          
(MyVirtualenv) 16:44 ~ $
