Chunk and scatter the regions in a bed file or sequence dictionary file
chunked_scatter
This tool takes a bed file or sequence dictionary as input and divides the contigs/chromosomes into overlapping chunks of a given size. These chunks are then written to new bed files, one chromosome per file. Small chromosomes are grouped together in the same file to avoid creating thousands of files.
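The chunking step described above can be sketched as follows. This is a minimal illustration and not the tool's actual implementation: the function name `chunk_region` is hypothetical, and the behaviour (every chunk after the first reaches back by the overlap, the last chunk is clipped to the region end) is inferred from the examples further down, where the overlap appears to be 150 bases.

```python
from typing import Iterator, Tuple

# (contig, start, end) with 0-based, half-open coordinates as in bed files
Region = Tuple[str, int, int]


def chunk_region(contig: str, start: int, end: int,
                 chunk_size: int, overlap: int) -> Iterator[Region]:
    """Yield overlapping chunks that together cover [start, end).

    Hypothetical helper: the real tool may treat the last chunk differently.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    position = start
    while position < end:
        # Every chunk except the first starts `overlap` bases early.
        chunk_start = position if position == start else position - overlap
        # The final chunk is clipped to the end of the region.
        chunk_end = min(position + chunk_size, end)
        yield contig, chunk_start, chunk_end
        position += chunk_size


if __name__ == "__main__":
    for chunk in chunk_region("chr1", 2000, 16000, chunk_size=5000, overlap=150):
        print(*chunk, sep="\t")
```

With a chunk size of 5000 and an overlap of 150, the region chr1 2000 16000 yields the same three chunks as the bed file example below.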
Installation
Install from github:
- Clone the repository:
git clone https://github.com/biowdl/chunked-scatter.git
- Enter the repository:
cd chunked-scatter
- Install using pip:
pip install .
Usage
chunked-scatter -p output_prefix -i input.bed
The input file name is expected to end in .bed or .dict!
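How the two input types could be turned into regions is sketched below. The function names are hypothetical and the parsing is an assumption, not the tool's own code: bed lines are read as contig, start and end, and each @SQ line of a sequence dictionary is treated as one region spanning the whole contig, from 0 to the value of LN.

```python
from typing import Iterable, Iterator, List, Tuple

Region = Tuple[str, int, int]  # (contig, start, end)


def regions_from_bed(lines: Iterable[str]) -> Iterator[Region]:
    """Read the first three columns of each bed line."""
    for line in lines:
        fields = line.split()
        # Skip blank lines, comments and track/browser header lines.
        if len(fields) < 3 or line.startswith(("#", "track", "browser")):
            continue
        yield fields[0], int(fields[1]), int(fields[2])


def regions_from_dict(lines: Iterable[str]) -> Iterator[Region]:
    """Turn each @SQ line into a region covering the whole contig."""
    for line in lines:
        fields = line.split()
        if not fields or fields[0] != "@SQ":
            continue
        # Tags look like SN:chr1 or LN:3000000; split on the first colon only.
        tags = dict(f.split(":", 1) for f in fields[1:] if ":" in f)
        yield tags["SN"], 0, int(tags["LN"])


def read_regions(filename: str, lines: Iterable[str]) -> List[Region]:
    """Pick a parser based on the file extension (hypothetical helper)."""
    if filename.endswith(".bed"):
        return list(regions_from_bed(lines))
    if filename.endswith(".dict"):
        return list(regions_from_dict(lines))
    raise ValueError("input is expected to end in .bed or .dict")
```

Choosing the parser by extension mirrors the requirement that the input file name ends in .bed or .dict.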
option | arguments | definition |
---|---|---|
-c | a number | The size of the chunks. |
-o | a number | The size of the overlap. |
-m | a number | The minimum number of bases to be put in a single output file, before a new scatter will be made. |
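The effect of -m can be illustrated with a short sketch. It assumes, based on the examples below, that a new output file is only started when the contig changes and the current file already holds at least the minimum number of bases. The name `group_chunks` is hypothetical, and whether the real tool counts overlapping bases toward the minimum is not stated here.

```python
from typing import Iterable, List, Tuple

Region = Tuple[str, int, int]  # (contig, start, end)


def group_chunks(chunks: Iterable[Region],
                 minimum_bases: int) -> List[List[Region]]:
    """Group chunks into per-file lists (hypothetical helper).

    A contig is never split across files. Consecutive contigs share a file
    until that file holds at least `minimum_bases` bases.
    """
    groups: List[List[Region]] = []
    current: List[Region] = []
    current_bases = 0
    for contig, start, end in chunks:
        contig_changed = bool(current) and current[-1][0] != contig
        if contig_changed and current_bases >= minimum_bases:
            # The current file is large enough: start a new scatter.
            groups.append(current)
            current, current_bases = [], 0
        current.append((contig, start, end))
        current_bases += end - start
    if current:
        groups.append(current)
    return groups
```

With a minimum of 1000 bases, the chunks of the bed example end up in two groups (chr1 and chr2). With a much larger minimum, both contigs of the dict example stay in a single group.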
Examples
bed file
Given a bed file located at /data/regions.bed:
chr1 100 1000
chr1 2000 16000
chr2 5000 10000
The command:
chunked-scatter -p /data/scatter_ -i /data/regions.bed -m 1000 -c 5000
Will produce the following two output files:
/data/scatter_0.bed:
chr1 100 1000
chr1 2000 7000
chr1 6850 12000
chr1 11850 16000
/data/scatter_1.bed:
chr2 5000 10000
dict file
Given a dict file located at /data/ref.dict:
@SQ SN:chr1 LN:3000000
@SQ SN:chr2 LN:500000
The command:
chunked-scatter -p /data/scatter_ -i /data/ref.dict
Will produce the following output file at /data/scatter_0.bed:
chr1 0 1000000
chr1 999850 2000000
chr1 1999850 3000000
chr2 0 500000
Built Distribution
Hashes for chunked_scatter-0.1.0-py3-none-any.whl
Algorithm | Hash digest |
---|---|
SHA256 | 7e5f3af742d7e0fb97b29b9f3cd29342581e6f892c72d0bdfef80708220fd7f6 |
MD5 | f60a2faf7ae95438452025f5d8afc72c |
BLAKE2b-256 | b7bed660f4764eff0bf54cc399b734044fb6d87559cd691ec68c6c792bbc04d8 |