A program that calculates the skew of two selectable nucleotides for a genome sequence in FASTA or GenBank format.
Project description
GenSkew is an application for computing and plotting nucleotide skew data.
For a more detailed description and the online version of GenSkew, see the website: genskew.csb.univie.ac.at
GenSkew calculates the incremental and the cumulative skew of two selectable nucleotides for a given sequence according to this formula, where nucleotide1 and nucleotide2 are the counts of the two selected nucleotides: Skew = (nucleotide1 - nucleotide2) / (nucleotide1 + nucleotide2)
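As a minimal sketch of the formula above, the following function computes the skew for a single stretch of sequence. The function name skew and the choice to return 0.0 when neither nucleotide occurs are assumptions made for illustration; they are not taken from the GenSkew source.

```python
def skew(sequence: str, nuc_1: str = "G", nuc_2: str = "C") -> float:
    """Return (n1 - n2) / (n1 + n2) for one stretch of sequence."""
    seq = sequence.upper()
    n1 = seq.count(nuc_1)
    n2 = seq.count(nuc_2)
    if n1 + n2 == 0:
        # Assumption: report 0.0 when neither nucleotide is present,
        # to avoid dividing by zero.
        return 0.0
    return (n1 - n2) / (n1 + n2)


# "GGGC" contains three G and one C: (3 - 1) / (3 + 1) = 0.5
print(skew("GGGC", "G", "C"))
```

A sequence with equally many G and C, such as "GATCCTAGATTAAGC", gives a skew of 0.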
The results are provided as a data table and as a graphical plot. The global minimum and maximum are displayed in the cumulative graph. The minimum and maximum of a GC-skew can be used to predict the origin of replication (minimum) and the terminus location (maximum) in prokaryotic genomes.
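The idea of locating the global minimum and maximum of the cumulative skew can be sketched as follows. This is an illustration only: the helper name cumulative_extremes and the sample numbers are made up, and the cumulative skew is assumed to be the running sum of the per-window skew values.

```python
from itertools import accumulate


def cumulative_extremes(x, skew_values):
    """Return the cumulative skew and the x positions of its min and max."""
    cumulative = list(accumulate(skew_values))  # running sum of the skew
    min_index = cumulative.index(min(cumulative))
    max_index = cumulative.index(max(cumulative))
    return cumulative, x[min_index], x[max_index]


# Made-up window positions and skew values, for illustration only.
x = [0, 100, 200, 300, 400, 500]
skew_values = [-0.25, -0.125, 0.5, 0.25, -0.125, -0.5]

cumulative, min_position, max_position = cumulative_extremes(x, skew_values)
print(min_position)  # candidate origin of replication for a GC-skew
print(max_position)  # candidate terminus location for a GC-skew
```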
There are three versions of this program: Genskew_univiecube (the Python library described below), Genskew_cc (a command-line client) and GUIskew (the graphical version of GenSkew).
Installing the program with the command you can copy above will install all three of them. The graphical interface can be started with the command python -m GUIskew. Calling python3 -m genskew -h shows a detailed description of how to use the command-line interface, which can analyze multiple sequences in one command.
To use the library, you first have to specify the sequence as an object:
import genskew_univiecube as gs
sequence = "GATCCTAGATTAAGC"
# stepsize and windowsize stand for integers chosen by the caller; both are optional
name = gs.Object(sequence, "G", "C", stepsize, windowsize)
In this example the sequence is a string, the first nucleotide is G and the second is C. stepsize and windowsize do not have to be specified; if they are omitted, they are calculated automatically to best fit the graph. This is useful when multiple sequences are processed one after another.
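To make the roles of stepsize and windowsize concrete, here is a rough sketch of a sliding-window skew calculation. It is not the library's implementation: the function name windowed_skew, the use of the window start as the x value, and the handling of windows without either nucleotide are all assumptions.

```python
def windowed_skew(sequence, nuc_1, nuc_2, stepsize, windowsize):
    """Compute the skew for windows of length windowsize, moved by stepsize."""
    seq = sequence.upper()
    x, y = [], []
    for start in range(0, len(seq) - windowsize + 1, stepsize):
        window = seq[start:start + windowsize]
        n1, n2 = window.count(nuc_1), window.count(nuc_2)
        # Assumption: a window without either nucleotide gets a skew of 0.0.
        y.append((n1 - n2) / (n1 + n2) if n1 + n2 else 0.0)
        x.append(start)  # assumption: x is the start position of the window
    return x, y


x, y = windowed_skew("GGGCCCGGCC", "G", "C", stepsize=2, windowsize=4)
print(x)  # [0, 2, 4, 6]
print(y)  # [0.5, -0.5, 0.0, 0.0]
```

A smaller stepsize yields more data points, while a larger windowsize smooths the curve.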
After the object is defined, we need to generate the results:
import genskew_univiecube as gs
sequence = gs.gen_sequence(filelocation)
name = gs.Object(sequence, "G", "C", stepsize, windowsize)
result = gs.Object.gen_results(name)
In this example the sequence is generated by calling gen_sequence, which takes a FASTA or GenBank file and returns the sequence as a string. The results can be retrieved as follows:
import genskew_univiecube as gs
sequence = gs.gen_sequence(filelocation)
name = gs.Object(sequence, "G", "C", stepsize, windowsize)
result = gs.Object.gen_results(name)
print(result.skew)
gs.plot_sequence(result, filelocation, outputfolder, output_filetype, dpi, skewi)
The result object provides the following attributes:
- .skew: the skew as a list of y values
- .x: the corresponding x values
- .cumulative: the cumulative skew as y values
- .max_cm_position and .min_cm_position: the x value of the maximum / minimum of the cumulative skew
- .stepsize and .windowsize: the step size and window size as integers
- .nuc_1 and .nuc_2: the first and second nucleotide as strings
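As an illustration of how these attributes fit together, the sketch below writes them out as a tab-separated table. It uses a stand-in object with made-up values instead of a real GenSkew result, and it assumes that .x, .skew and .cumulative have the same length; the helper name result_to_tsv is hypothetical.

```python
import csv
import io
from types import SimpleNamespace

# Stand-in for a GenSkew result object; the values are made up.
result = SimpleNamespace(
    x=[0, 100, 200],
    skew=[0.5, -0.25, 0.25],
    cumulative=[0.5, 0.25, 0.5],
    nuc_1="G",
    nuc_2="C",
)


def result_to_tsv(result) -> str:
    """Render x, skew and cumulative skew as a tab-separated table."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    header = ["position", f"{result.nuc_1}{result.nuc_2}_skew", "cumulative"]
    writer.writerow(header)
    # Assumption: the three lists line up element by element.
    writer.writerows(zip(result.x, result.skew, result.cumulative))
    return buffer.getvalue()


table = result_to_tsv(result)
print(table)
```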
plot_sequence plots and saves a graph of the skew. The arguments dpi, out_filetype and outputfolder are optional; the default output file type is png, and the output folder defaults to the folder that contains the sequence file (filelocation). The dpi is calculated according to the size of the graph.
The function input_files(file_locations) checks the given paths for fasta, gb or .gz files and returns everything it finds in a list. file_locations has to be a list and can contain direct file paths or folders.
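The behaviour described for input_files can be approximated with the standard library as shown below. This is a sketch, not the library's code: the name collect_sequence_files is hypothetical, the suffix set is an assumption based on the file types named above, and folders are searched without recursing into subfolders.

```python
from pathlib import Path

# Assumption: files are recognised by these suffixes.
SEQUENCE_SUFFIXES = {".fasta", ".gb", ".gz"}


def collect_sequence_files(file_locations):
    """Expand a list of files and folders into a flat list of sequence files."""
    found = []
    for location in file_locations:
        path = Path(location)
        if path.is_dir():
            # Look only at files directly inside the folder (no recursion).
            candidates = sorted(p for p in path.iterdir() if p.is_file())
        else:
            candidates = [path]
        found.extend(
            str(p) for p in candidates if p.suffix.lower() in SEQUENCE_SUFFIXES
        )
    return found
```

Calling collect_sequence_files(["genomes/", "extra/sample.gb"]) with hypothetical paths would return the matching files in the folder followed by the directly named file.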
As of 0.1.0 the SkewIT¹ Index has been added. It can be activated by setting the "skewi" parameter to "true".
Note that in this example only one sequence can be analyzed at a time.
Update log Genskew-univiecube
Version 0.1.3
General bug fix
Version 0.1.2
Fixed the SkewIT graph sometimes not displaying the right nucleotides on the axis label
Version 0.1.1
General bug fix
Updated the Readme
Version 0.1.0
Added the SkewIT¹ algorithm to GenSkew
Additional parameter for plot_sequence
Version 0.0.9
First finished release of GenSkew
REFERENCES:
1: Jennifer Lu, Steven L. Salzberg, SkewIT: The Skew Index Test for large scale GC Skew analysis of bacterial genomes, https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1008439 (06.04.2022)
File details
Details for the file Genskew-univiecube-0.1.3.tar.gz.
File metadata
- Download URL: Genskew-univiecube-0.1.3.tar.gz
- Upload date:
- Size: 5.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.1 importlib_metadata/4.11.3 pkginfo/1.7.0 requests/2.22.0 requests-toolbelt/0.9.1 tqdm/4.60.0 CPython/3.8.10
File hashes
Algorithm | Hash digest
---|---
SHA256 | d61b1229fbd9de5d43351ada299a7f2db7dde70c293fe59d160a290b7be4eff7
MD5 | c3d488a3074e5a53e0c91e420d06eb57
BLAKE2b-256 | 57c36b676c822d0214291bbc3e27f8fb3d781dd219b37d9089efedbe9eaf71a8
File details
Details for the file Genskew_univiecube-0.1.3-py3-none-any.whl.
File metadata
- Download URL: Genskew_univiecube-0.1.3-py3-none-any.whl
- Upload date:
- Size: 5.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.1 importlib_metadata/4.11.3 pkginfo/1.7.0 requests/2.22.0 requests-toolbelt/0.9.1 tqdm/4.60.0 CPython/3.8.10
File hashes
Algorithm | Hash digest
---|---
SHA256 | 917350d4491955c62fc7095308663ee82cf1c78ceab8adf695e92fd1247aefe5
MD5 | d9948be56f8d0ff04cd5900ec29d8004
BLAKE2b-256 | a2c2adc1c8abea00d41ed8b284c4ca011db690713fbcc423316f6f4fea50ed5e