ZSeeker is a CLI tool that estimates the propensity of B-DNA to form Z-DNA structures.
ZSeeker
==============
Installation
pip install ZSeeker
CLI Usage
ZSeeker --fasta ./test_GCA_f.fasta --n_jobs 1
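When the CLI is driven from a script, it can help to assemble the argument list programmatically. The sketch below is a hypothetical helper (`build_zseeker_command` is not part of ZSeeker). It uses only flags listed in the command-line help and assumes the `ZSeeker` executable is on the `PATH`.

```python
import subprocess
from typing import List, Optional


def build_zseeker_command(
    fasta: str,
    n_jobs: int = 1,
    threshold: Optional[float] = None,
    output_dir: Optional[str] = None,
) -> List[str]:
    """Hypothetical helper: build an argument list for the ZSeeker CLI."""
    cmd = ["ZSeeker", "--fasta", fasta, "--n_jobs", str(n_jobs)]
    # Optional flags are added only when a value is supplied,
    # so the tool's own defaults apply otherwise.
    if threshold is not None:
        cmd += ["--threshold", str(threshold)]
    if output_dir is not None:
        cmd += ["--output_dir", output_dir]
    return cmd


if __name__ == "__main__":
    cmd = build_zseeker_command("./test_GCA_f.fasta", n_jobs=1, threshold=50)
    print(" ".join(cmd))
    # Uncomment to actually run the tool (requires ZSeeker to be installed):
    # subprocess.run(cmd, check=True)
```

Passing the command as a list rather than a single string avoids shell quoting problems with paths that contain spaces.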
Example: In-code usage
from zseeker.zdna_calculator import ZDNACalculatorSeq, Params
# Define parameters
params = Params(
    GC_weight=5.0,
    AT_weight=0.5,
    GT_weight=1.1,
    AC_weight=1.3,
    mismatch_penalty_starting_value=4,
    mismatch_penalty_linear_delta=2,
    mismatch_penalty_type='linear',
    threshold=10,
    consecutive_AT_scoring=[1, 2, 2],
    display_sequence_score=1,
    drop_threshold=50,
    total_sequence_scoring=False,
)
# Create a ZDNACalculatorSeq instance with the input sequence
zdna_calculator = ZDNACalculatorSeq(data="ACGTACGTACGT", params=params)
# Calculate subarrays above threshold
subarrays = zdna_calculator.subarrays_above_threshold()
# Print results
print(subarrays)
Command-line Help
usage: ZSeeker [-h] [--fasta FASTA] [--GC_weight GC_WEIGHT]
[--AT_weight AT_WEIGHT] [--GT_weight GT_WEIGHT]
[--AC_weight AC_WEIGHT]
[--mismatch_penalty_starting_value MISMATCH_PENALTY_STARTING_VALUE]
[--mismatch_penalty_linear_delta MISMATCH_PENALTY_LINEAR_DELTA]
[--mismatch_penalty_type {linear,exponential}]
[--n_jobs N_JOBS] [--threshold THRESHOLD]
[--consecutive_AT_scoring CONSECUTIVE_AT_SCORING]
[--display_sequence_score {0,1}]
[--output_dir OUTPUT_DIR]
[--gff_file GFF_FILE]
[--drop_threshold DROP_THRESHOLD]
[--total_sequence_scoring]
Given a fasta file and the corresponding parameters it calculates the
ZDNA for each sequence present.
options:
-h, --help show this help message and exit
--fasta FASTA Path to file analyzed
--GC_weight GC_WEIGHT
Weight given to GC and CG transitions.
Default = 7.0
--AT_weight AT_WEIGHT
Weight given to AT and TA transitions.
Default = 0.5
--GT_weight GT_WEIGHT
Weight given to GT and TG transitions.
Default = 1.25
--AC_weight AC_WEIGHT
Weight given to AC and CA transitions.
Default = 1.25
--mismatch_penalty_starting_value MISMATCH_PENALTY_STARTING_VALUE
Penalty applied to the first non
purine/pyrimidine transition encountered.
Default = 3
--mismatch_penalty_linear_delta MISMATCH_PENALTY_LINEAR_DELTA
Only applies if penalty type is set to
linear. Determines the rate of increase of
the penalty for every subsequent non
purine/pyrimidine transition. Default = 3
--mismatch_penalty_type {linear,exponential}
Method of scaling the penalty for contiguous
non purine/pyrimidine transition. Default =
linear
--n_jobs N_JOBS Number of threads to use. Defaults to -1,
which uses the maximum available threads on
CPU
--threshold THRESHOLD
Scoring threshold for a sequence to be
considered potentially Z-DNA forming and
returned by the program. This parameter is
also used for determining how big the scoring
drop within a sequence should be, before it
is split into two separate Z-DNA candidate
sequences. Default=50
--consecutive_AT_scoring CONSECUTIVE_AT_SCORING
Consecutive AT repeats form a hairpin
structure instead of Z-DNA. In order to
reflect that, a penalty array is defined,
which provides the score adjustment for the
first and the subsequent TA appearances. The
last element will be applied to every
subsequent TA appearance. For more
information see documentation. Default =
(0.5, 0.5, 0.5, 0.5, 0.0, 0.0, -5.0, -100.0)
--display_sequence_score {0,1}
--output_dir OUTPUT_DIR
--gff_file GFF_FILE Optional GFF file for gene annotation. Only 'gene' features are used.
--drop_threshold DROP_THRESHOLD
Drop threshold used within subarrays
detection logic. Default = 50.
--total_sequence_scoring
If set, compute only a single
transitions-based total score per
sequence (one row each). Skips subarray
detection altogether.
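The `--consecutive_AT_scoring` description can be read as a lookup with a clamped index: the k-th consecutive TA appearance uses the k-th array element, and any appearance past the end of the array reuses the last element. The following is a minimal sketch of that reading only. The function name is hypothetical, and how ZSeeker combines this adjustment with `AT_weight` is not modeled here.

```python
from typing import Sequence

# Default array quoted in the help text above.
DEFAULT_CONSECUTIVE_AT_SCORING = (0.5, 0.5, 0.5, 0.5, 0.0, 0.0, -5.0, -100.0)


def consecutive_at_adjustment(
    k: int,
    scoring: Sequence[float] = DEFAULT_CONSECUTIVE_AT_SCORING,
) -> float:
    """Score adjustment for the k-th consecutive TA appearance (k starts at 0).

    Hypothetical helper: indices beyond the array reuse the last element.
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    return scoring[min(k, len(scoring) - 1)]


if __name__ == "__main__":
    # Ten consecutive appearances against an eight-element array:
    # the ninth and tenth fall back to the last element (-100.0).
    print([consecutive_at_adjustment(k) for k in range(10)])
```

With the default array, the first four appearances are mildly rewarded, the next two are neutral, and longer runs are penalized heavily, which matches the stated intent of discouraging hairpin-prone AT repeats.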
Example output file
| Chromosome | Start | End | Z-DNA Score | Sequence |
|---|---|---|---|---|
| Z1 | 0.0 | 15.0 | 87.0 | TGCGTGCGCGCGCGCG |
| Z2 | 0.0 | 15.0 | 87.0 | GCGCCCGCGCGCGCGC |
| Z3 | 0.0 | 11.0 | 71.0 | GCGCGCGCGCGT |
| Z4 | 0.0 | 11.0 | 65.0 | GCGCGTGCGCGC |
| Z5 | 0.0 | 10.0 | 70.0 | CGCGCGCGCGC |
| Z6 | 0.0 | 15.0 | 63.0 | GCACGCACACGCGCGT |
| Z7 | 0.0 | 10.0 | 70.0 | GCGCGCGCGCG |
| Z8 | 0.0 | 13.0 | 61.0 | CGCACGCGCACGCA |
| Z9 | 0.0 | 11.0 | 59.0 | CGCGCGCGCACA |
Example output file with annotations
| Chromosome | Start | End | Z-DNA Score | Sequence | gene_start | gene_end | gene_id | gene_biotype | strand | distance | distance_from_TSS | distance_from_TES |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| AE004438.1 | 364 | 391 | 63.0 | ACGGTGCCGCAGCGGCCGTGTCGCCAGC | 362 | 812 | gene-VNG_6001H | protein_coding | - | 0 | 420 | 2 |
| AE004438.1 | 2317 | 2335 | 51.5 | GCGGCGAGTCGCCGTCGCG | 1904 | 3719 | gene-VNG_6007H | protein_coding | - | 0 | 1383 | 413 |
| AE004438.1 | 3528 | 3538 | 52.75 | ACGTGCGCGCG | 1904 | 3719 | gene-VNG_6007H | protein_coding | - | 0 | 180 | 1624 |
| AE004438.1 | 12771 | 12814 | 109.25 | GCTGTCGCTGTCGGCGGCGGCTGCCGCCGACGCGACAGCGTCGC | 12846 | 13380 | gene-VNG_6015H | protein_coding | - | 32 | 565 | 32 |
| AE004438.1 | 13178 | 13195 | 56.0 | ACGGCGCGTCAGCGGCGT | 12846 | 13380 | gene-VNG_6015H | protein_coding | - | 0 | 184 | 332 |
| AE004438.1 | 13533 | 13552 | 52.75 | ACGGCGCACCGCCAGCGTGT | 12846 | 13380 | gene-VNG_6015H | protein_coding | - | 153 | 154 | 687 |
| AE004438.1 | 13853 | 13872 | 70.0 | CGTCGGCGCACGCGCCGACG | 14307 | 15582 | gene-VNG_6016H | protein_coding | + | 435 | 435 | 1709 |
| AE004438.1 | 14960 | 14971 | 51.25 | GCGCGGTCGCGC | 14307 | 15582 | gene-VNG_6016H | protein_coding | + | 0 | 653 | 610 |
| AE004438.1 | 15105 | 15126 | 61.0 | CGCGTCGTCGGCGTCCGCGACG | 14307 | 15582 | gene-VNG_6016H | protein_coding | + | 0 | 798 | 455 |
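To make the scores in the annotated table easier to interpret, here is a minimal sketch that sums transition weights over a whole sequence using the defaults from the command-line help (GC 7.0, AT 0.5, GT 1.25, AC 1.25, starting penalty 3, linear delta 3). It is an interpretation of the help text, not ZSeeker's implementation. The function name is hypothetical, contiguous mismatches are assumed to cost `start + n * delta` for the n-th mismatch in a run (counting from 0), and subarray detection, thresholds, the exponential penalty type, and consecutive-AT adjustments are not modeled.

```python
# Default transition weights quoted in the command-line help.
# frozenset makes the lookup order-insensitive (GC and CG share a weight).
WEIGHTS = {
    frozenset("GC"): 7.0,
    frozenset("AT"): 0.5,
    frozenset("GT"): 1.25,
    frozenset("AC"): 1.25,
}


def whole_sequence_score(
    seq: str,
    start_penalty: float = 3.0,
    linear_delta: float = 3.0,
) -> float:
    """Hypothetical helper: transition-based score of an entire sequence."""
    score = 0.0
    run = 0  # number of mismatches already seen in the current contiguous run
    for a, b in zip(seq, seq[1:]):
        weight = WEIGHTS.get(frozenset((a, b)))
        if weight is None:
            # Not a purine/pyrimidine alternation (e.g. GG, TC, GA):
            # assumed linear penalty that grows within a contiguous run.
            score -= start_penalty + run * linear_delta
            run += 1
        else:
            score += weight
            run = 0
    return score


if __name__ == "__main__":
    # Sequences taken from rows of the annotated example output.
    print(whole_sequence_score("ACGTGCGCGCG"))           # 52.75
    print(whole_sequence_score("GCGCGGTCGCGC"))          # 51.25
    print(whole_sequence_score("GCGGCGAGTCGCCGTCGCG"))   # 51.5
    print(whole_sequence_score("CGTCGGCGCACGCGCCGACG"))  # 70.0
```

Under these assumptions the sketch agrees with the Z-DNA Score column for the four rows shown. Sequences containing AT or TA steps are outside what it models.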
ZSeeker web application
The web version of ZSeeker can be found at: ZSeeker web application
And a dockerized version of it can be found at this repository for local deployments: ZSeeker web application dockerized