Sorts indels into mutational classes
- Sorts indels into classes defined as follows:
  - homopolymer run (HR): the mutation is in a region where there are 6 or more copies of the nucleotide being inserted or deleted
  - change in copy count (CCC): the allele being inserted or deleted has 1 or more repeats in the mutation region
  - no change in copy count (non-CCC): the allele being inserted or deleted is not repeated in the mutation region
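The three class definitions can be illustrated with a minimal Python sketch. This is not the sorting_hat implementation: the function names `count_tandem_copies` and `classify_indel`, the `flank` argument, and the choice to look only at the reference sequence immediately following the indel (not counting the allele itself) are all assumptions made for illustration.

```python
def count_tandem_copies(allele: str, flank: str) -> int:
    """Count back-to-back copies of `allele` at the start of `flank`."""
    copies = 0
    step = len(allele)
    # Walk along the flank one allele-length at a time while it keeps matching.
    while allele and flank.startswith(allele, copies * step):
        copies += 1
    return copies


def classify_indel(allele: str, flank: str, hr_min_copies: int = 6) -> str:
    """Return "HR", "CCC", or "non-CCC" for one indel (hypothetical helper).

    allele: the inserted or deleted sequence.
    flank:  reference sequence right after the indel (an assumed input).
    """
    allele, flank = allele.upper(), flank.upper()
    # HR: the allele is made of a single nucleotide and the flank starts
    # with a run of at least `hr_min_copies` of that nucleotide.
    if len(set(allele)) == 1:
        if count_tandem_copies(allele[0], flank) >= hr_min_copies:
            return "HR"
    # CCC: at least one further copy of the allele follows the indel.
    if count_tandem_copies(allele, flank) >= 1:
        return "CCC"
    # non-CCC: the allele is not repeated next to the indel.
    return "non-CCC"
```

Under these assumptions, `classify_indel("A", "AAAAAAGT")` falls into HR, `classify_indel("AT", "ATATGC")` into CCC, and `classify_indel("AG", "TCGA")` into non-CCC. How sorting_hat itself delimits the "mutation region" may differ.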
- In order to use sorting_hat, you must ensure the following are installed:
To install, use pip:
pip install sorting_hat
Example run
sorting_hat --bed test.bed \
--fasta test.fasta \
--repeat repeat_masker.txt
Usage
sorting_hat [-h] -b BED -f FASTA -r REPEAT [-o OUTPUT]
Sorts indels into mutational classes
- -b BED, --bed BED
Location of BED file with all variants. Must be formatted as Chrom/Start/End/Ref/Alt/PatientID.
- -f FASTA, --fasta FASTA
Location of reference FASTA file.
- -r REPEAT, --repeat REPEAT
Location of RepeatMasker file downloaded from UCSC Genome Browser. Refer to docs to see how to download RepeatMasker.
- -o OUTPUT, --output OUTPUT
Name of output file. If omitted, results are printed to stdout.
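The options above, together with the BED column order Chrom/Start/End/Ref/Alt/PatientID, can be mirrored in a short Python sketch. It is an illustration only, not the package's own code: `IndelRecord`, `build_parser`, and `parse_bed_line` are hypothetical names, and the sketch assumes the BED file is tab-separated with no header line.

```python
import argparse
from typing import NamedTuple


class IndelRecord(NamedTuple):
    """One BED row in the documented Chrom/Start/End/Ref/Alt/PatientID order."""
    chrom: str
    start: int
    end: int
    ref: str
    alt: str
    patient_id: str


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the same flags as the usage line (hypothetical)."""
    parser = argparse.ArgumentParser(
        prog="sorting_hat",
        description="Sorts indels into mutational classes",
    )
    parser.add_argument("-b", "--bed", required=True,
                        help="BED file with all variants")
    parser.add_argument("-f", "--fasta", required=True,
                        help="reference FASTA file")
    parser.add_argument("-r", "--repeat", required=True,
                        help="RepeatMasker file from UCSC Genome Browser")
    # Optional: None here stands for "print to stdout".
    parser.add_argument("-o", "--output", default=None,
                        help="output file name")
    return parser


def parse_bed_line(line: str) -> IndelRecord:
    """Split one tab-separated BED row (assumed layout) into typed fields."""
    chrom, start, end, ref, alt, patient_id = line.rstrip("\n").split("\t")
    return IndelRecord(chrom, int(start), int(end), ref, alt, patient_id)


# Same arguments as the example run, passed as a list instead of sys.argv.
args = build_parser().parse_args(
    ["--bed", "test.bed", "--fasta", "test.fasta", "--repeat", "repeat_masker.txt"]
)
```

Because `--output` is not given here, `args.output` stays `None`, which corresponds to the documented fallback of printing to stdout.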
To download the RepeatMasker file from UCSC Genome Browser, see the photos in the ‘data’ folder on GitHub: https://github.com/allisonseiden/sorting_hat
Allison Seiden <ahseiden@gmail.com>