Sorts indels into mutational classes
- Sorts indels into three classes, defined as follows:
  - homopolymer run (HR): the mutation is in a region where there are 6 or more copies of the nucleotide being inserted or deleted
  - change in copy count (CCC): the allele being inserted or deleted has 1 or more repeats in the mutation region
  - no change in copy count (non-CCC): the allele being inserted or deleted is not repeated in the mutation region
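The class definitions above can be illustrated with a small Python sketch. This is not the package's implementation: the function names, the `flank` argument (assumed here to be the reference sequence immediately downstream of the indel, excluding the indel allele itself), and the choice to count tandem copies in one direction only are all assumptions made for illustration.

```python
def count_flanking_copies(flank: str, unit: str) -> int:
    """Count back-to-back copies of `unit` at the start of `flank`."""
    copies = 0
    while unit and flank.startswith(unit, copies * len(unit)):
        copies += 1
    return copies


def classify_indel(flank: str, allele: str, hr_min_copies: int = 6) -> str:
    """Return "HR", "CCC", or "non-CCC" for an inserted or deleted allele.

    Hypothetical helper: `flank` is assumed to be the reference sequence
    right after the indel position; the real tool may define the
    "mutation region" differently.
    """
    if not allele:
        raise ValueError("allele must contain at least one nucleotide")
    flank, allele = flank.upper(), allele.upper()

    # HR: the allele is a single nucleotide type and the region holds
    # at least `hr_min_copies` copies of that nucleotide.
    is_single_base = len(set(allele)) == 1
    if is_single_base and count_flanking_copies(flank, allele[0]) >= hr_min_copies:
        return "HR"

    # CCC: the allele is repeated at least once in the region.
    if count_flanking_copies(flank, allele) >= 1:
        return "CCC"

    # non-CCC: the allele is not repeated in the region.
    return "non-CCC"
```

Under these assumptions, `classify_indel("ACACGT", "AC")` falls into CCC because the flank starts with two copies of `AC`, while a short run such as `classify_indel("AAAGT", "A")` is also CCC rather than HR because it has fewer than 6 copies.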
- In order to use sorting_hat, you must ensure the following are installed:
To install, use pip:
pip install sorting_hat
Example run
sorting_hat --bed test.bed \
--fasta test.fasta \
--repeat repeat_masker.txt
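If the example run needs to be launched from a Python script rather than a shell, one possible approach is to assemble the same command as an argument list. The helper name `build_sorting_hat_command` is hypothetical; only the `--bed`, `--fasta`, `--repeat`, and `--output` options documented in this README are used.

```python
import shlex


def build_sorting_hat_command(bed, fasta, repeat, output=None):
    """Build the sorting_hat argument list (hypothetical helper)."""
    command = [
        "sorting_hat",
        "--bed", str(bed),
        "--fasta", str(fasta),
        "--repeat", str(repeat),
    ]
    # --output is optional; without it the tool prints to stdout.
    if output is not None:
        command += ["--output", str(output)]
    return command


command = build_sorting_hat_command("test.bed", "test.fasta", "repeat_masker.txt")
print(shlex.join(command))

# To actually run it (requires sorting_hat to be installed and the
# input files to exist):
#   import subprocess
#   subprocess.run(command, check=True)
```

Passing a list instead of a single string avoids shell quoting problems when file paths contain spaces.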
Usage
sorting_hat [-h] -b BED -f FASTA -r REPEAT [-o OUTPUT]
Sorts indels into mutational classes
- -b BED, --bed BED
Location of BED file with all variants. Must be formatted as Chrom/Start/End/Ref/Alt/PatientID.
- -f FASTA, --fasta FASTA
Location of reference fasta file.
- -r REPEAT, --repeat REPEAT
Location of RepeatMasker file downloaded from UCSC Genome Browser. Refer to docs to see how to download RepeatMasker.
- -o OUTPUT, --output OUTPUT
Name of the output file. If omitted, results are printed to stdout.
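Before running the tool, it can help to check that the BED file follows the Chrom/Start/End/Ref/Alt/PatientID layout described above. The sketch below assumes tab-separated columns, no header row, and integer coordinates; none of these details are confirmed by this README, and `IndelRecord` and `read_indel_bed` are hypothetical names. The sample row is made up for illustration.

```python
import csv
import io
from typing import Iterator, NamedTuple


class IndelRecord(NamedTuple):
    chrom: str
    start: int
    end: int
    ref: str
    alt: str
    patient_id: str


def read_indel_bed(handle) -> Iterator[IndelRecord]:
    """Yield one record per line, assuming six tab-separated columns."""
    for line_number, row in enumerate(csv.reader(handle, delimiter="\t"), start=1):
        # Skip blank lines and comment lines.
        if not row or row[0].startswith("#"):
            continue
        if len(row) != 6:
            raise ValueError(
                f"line {line_number}: expected 6 columns, found {len(row)}"
            )
        chrom, start, end, ref, alt, patient_id = row
        yield IndelRecord(chrom, int(start), int(end), ref, alt, patient_id)


# Made-up example row, not taken from the package's test data.
example = io.StringIO("chr1\t100\t101\tA\tAT\tpatient_01\n")
records = list(read_indel_bed(example))
```

To check a real file, open it with `open(path, newline="")` and pass the handle to `read_indel_bed`.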
To download the RepeatMasker file from the UCSC Genome Browser, see the photos in the ‘data’ folder on GitHub: https://github.com/allisonseiden/sorting_hat
Allison Seiden <ahseiden@gmail.com>