delfies is a tool for the detection of DNA Elimination breakpoints
Delfies
delfies is a tool for the detection of DNA breakpoints with de-novo telomere addition.
It identifies genomic locations where double-strand breaks have occurred followed by telomere addition. It was initially designed and validated for studying the process of Programmed DNA Elimination in nematodes, but should work for other clades and applications too.
Getting started
delfies takes as input a genome fasta (gzipped is supported) and an indexed SAM/BAM of sequencing reads aligned to the genome.
delfies --help
samtools index <aligned_reads>.bam
delfies <genome>.fa.gz <aligned_reads>.bam <output_dir>
cat <output_dir>/breakpoint_locations.bed
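Once a run has finished, the BED output can also be inspected programmatically. The following is a minimal Python sketch, assuming only the standard BED layout (tab-separated, with chrom, start and end required, and name, score and strand optional). The names `BedRecord` and `parse_bed_lines` are hypothetical helpers, not part of delfies, and the example lines are made up; see the detailed docs for what delfies actually writes in each column.

```python
from typing import Iterable, Iterator, NamedTuple, Optional


class BedRecord(NamedTuple):
    """One BED line (hypothetical helper type, not part of delfies)."""
    chrom: str
    start: int  # 0-based, inclusive (BED convention)
    end: int    # 0-based, exclusive (BED convention)
    name: Optional[str]
    score: Optional[str]
    strand: Optional[str]


def parse_bed_lines(lines: Iterable[str]) -> Iterator[BedRecord]:
    """Yield a BedRecord for each data line, skipping blanks and headers."""
    for line in lines:
        line = line.rstrip("\r\n")
        # BED files may contain comment, 'track' and 'browser' lines.
        if not line.strip() or line.startswith(("#", "track ", "browser ")):
            continue
        fields = line.split("\t")
        if len(fields) < 3:
            raise ValueError(f"BED line has fewer than 3 columns: {line!r}")
        # Columns 4-6 are optional in BED; pad missing ones with None.
        name, score, strand = (fields[3:6] + [None, None, None])[:3]
        yield BedRecord(fields[0], int(fields[1]), int(fields[2]),
                        name, score, strand)


# Made-up example lines, for illustration only.
example = [
    "# illustrative content, not real delfies output\n",
    "chr1\t1000\t1001\tbreakpoint_a\t12\t+\n",
    "chr2\t52\t53\n",
]
for record in parse_bed_lines(example):
    print(record.chrom, record.start, record.end, record.strand)
```

To read a real result, pass an open file handle instead of the example list, e.g. the handle returned by `open()` on `<output_dir>/breakpoint_locations.bed`.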
User Manual
Installation
Using pip (or equivalent - poetry, etc.):
# Download and install a specific release
DELFIES_VERSION=0.6.0
wget -O "delfies-${DELFIES_VERSION}.tar.gz" "https://github.com/bricoletc/delfies/archive/refs/tags/${DELFIES_VERSION}.tar.gz"
tar -xf "delfies-${DELFIES_VERSION}.tar.gz"
pip install ./delfies-"${DELFIES_VERSION}"/
# OR clone and install tip of main
git clone https://github.com/bricoletc/delfies/
pip install ./delfies
CLI options
delfies --help
- Do use the --threads option if you have multiple cores/CPUs available.
- [Breakpoints]
  - There are two types of breakpoints: see detailed docs.
  - Nearby breakpoints can be clustered together to account for variability in breakpoint location (--clustering_threshold).
- [Region selection]: You can select a specific region to focus on, specified as a string or as a BED file.
- [Telomeres]
  - Specify the telomere sequence for your organism using --telo_forward_seq. If you're unsure, I recommend the tool telomeric-identifier for finding out.
- [Aligned reads]
  - To analyse confidently-aligned reads only, you can filter reads by MAPQ (--min_mapq) and by bitwise flag (--read_filter_flag).
  - You can tolerate more or fewer mutations in the telomere sequences (and in the reads) using --telo_max_edit_distance and --telo_array_size.
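To give an intuition for what clustering nearby breakpoints means, here is a minimal Python sketch of one common approach: sort positions per contig and start a new cluster whenever the gap to the previous position exceeds a threshold. This is an illustration under stated assumptions, not delfies' actual implementation; the function name `cluster_breakpoints`, the inclusive comparison, and the example coordinates are all hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

Breakpoint = Tuple[str, int]  # (contig name, position)


def cluster_breakpoints(
    breakpoints: Iterable[Breakpoint], threshold: int
) -> List[Tuple[str, List[int]]]:
    """Group positions on the same contig whose neighbour gap is <= threshold.

    Assumption: gap-based clustering with an inclusive threshold; delfies may
    define its clusters differently.
    """
    by_contig: Dict[str, List[int]] = {}
    for contig, position in breakpoints:
        by_contig.setdefault(contig, []).append(position)

    clusters: List[Tuple[str, List[int]]] = []
    for contig in sorted(by_contig):
        positions = sorted(by_contig[contig])
        current = [positions[0]]
        for position in positions[1:]:
            if position - current[-1] <= threshold:
                # Close enough to the previous breakpoint: same cluster.
                current.append(position)
            else:
                # Gap too large: close the cluster and start a new one.
                clusters.append((contig, current))
                current = [position]
        clusters.append((contig, current))
    return clusters


# Made-up coordinates, for illustration only.
observed = [("chr1", 100), ("chr1", 104), ("chr1", 500), ("chr2", 102)]
for contig, positions in cluster_breakpoints(observed, threshold=10):
    print(contig, positions)
```

A larger threshold merges more breakpoints into each cluster, and a smaller one keeps closely spaced breakpoints separate.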
Outputs
The two main outputs of delfies are:
- breakpoint_locations.bed: a BED-formatted file containing the location of identified elimination breakpoints.
- breakpoint_sequences.fasta: a FASTA-formatted file containing the sequences of identified elimination breakpoints.
For more details on outputs, see detailed docs.
Applications
- The fasta output enables looking for sequence motifs that occur at breakpoints, e.g. using MEME.
- The BED output enables classifying a genome into retained and eliminated regions. The 'strand' of breakpoints is especially useful for this: see detailed docs.
- The BED output also enables assembling past somatic telomeres: for how to do this, see detailed docs.
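As a quick first look at the fasta output before running a dedicated motif finder, the sequences can be read and scanned for recurring k-mers. The sketch below is a rough illustration, not a substitute for MEME; the helper names `parse_fasta` and `count_kmers` are hypothetical, the choice of k is arbitrary, and the example records are made up.

```python
from collections import Counter
from typing import Iterable, Iterator, Tuple


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            # Sequence lines may be wrapped; collect and join them.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def count_kmers(sequences: Iterable[str], k: int) -> Counter:
    """Count every overlapping k-mer across all sequences (case-insensitive)."""
    counts: Counter = Counter()
    for sequence in sequences:
        sequence = sequence.upper()
        for i in range(len(sequence) - k + 1):
            counts[sequence[i:i + k]] += 1
    return counts


# Made-up records, for illustration only.
example = [">seq_a\n", "ACGTAC\n", "GT\n", ">seq_b\n", "TTACGT\n"]
records = list(parse_fasta(example))
kmer_counts = count_kmers((seq for _, seq in records), k=4)
for kmer, count in kmer_counts.most_common(3):
    print(kmer, count)
```

To apply this to a real run, pass an open handle on `<output_dir>/breakpoint_sequences.fasta` to `parse_fasta` in place of the example list.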
Visualising your results
I highly recommend visualising your results!
E.g., by loading your input fasta and BAM, together with delfies' output breakpoint_locations.bed, in IGV.