Sorts indels into mutational classes
Project description
- Sorts indels into classes defined as follows:
  - homopolymer run (HR): the mutation is in a region with 6 or more copies of the nucleotide being inserted or deleted
  - change in copy count (CCC): the allele being inserted or deleted has 1 or more repeats in the mutation region
  - no change in copy count (non-CCC): the allele being inserted or deleted is not repeated in the mutation region
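The three definitions above can be read as a small decision rule. The sketch below is an illustration only, not the package's actual implementation. It assumes that the "mutation region" means the reference sequence immediately flanking the indel, that copies are counted as back-to-back (tandem) repeats touching the indel site, and that HR applies only to single-nucleotide alleles. The function names `count_tandem_copies` and `classify_indel` are hypothetical.

```python
def count_tandem_copies(allele: str, left_flank: str, right_flank: str) -> int:
    """Count back-to-back copies of `allele` touching the indel site.

    Assumption: the flanks are reference sequence on either side of the
    indel and do not include the inserted or deleted allele itself.
    """
    if not allele:
        raise ValueError("allele must be a non-empty string")
    n = len(allele)
    copies = 0

    # Walk to the right of the indel site, one allele length at a time.
    pos = 0
    while right_flank[pos:pos + n] == allele:
        copies += 1
        pos += n

    # Walk to the left of the indel site, one allele length at a time.
    end = len(left_flank)
    while end >= n and left_flank[end - n:end] == allele:
        copies += 1
        end -= n

    return copies


def classify_indel(allele: str, left_flank: str, right_flank: str,
                   hr_min_copies: int = 6) -> str:
    """Return "HR", "CCC" or "non-CCC" under the assumptions stated above."""
    allele = allele.upper()
    copies = count_tandem_copies(allele, left_flank.upper(), right_flank.upper())

    # Assumed reading of HR: a single nucleotide with 6 or more copies nearby.
    if len(allele) == 1 and copies >= hr_min_copies:
        return "HR"
    # CCC: the allele is repeated at least once in the flanking region.
    if copies >= 1:
        return "CCC"
    # non-CCC: the allele does not occur as a repeat next to the site.
    return "non-CCC"
```

For example, under these assumptions `classify_indel("CA", "TTG", "CACAGG")` finds two adjacent copies of `CA` and reports `CCC`. The real tool may define the region or the repeat count differently.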
- In order to use sorting_hat, you must ensure the following are installed:
To install, use pip:
pip install sorting_hat
Example run
sorting_hat --bed test.bed \
--fasta test.fasta \
--repeat repeat_masker.txt
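The same run can be launched from a Python script, which may be convenient in a pipeline. This is a minimal sketch that uses only the command-line flags documented here; the helper names `build_command` and `run_sorting_hat` are hypothetical, and the sketch assumes the `sorting_hat` executable is on `PATH` after installation.

```python
import shutil
import subprocess
from typing import List, Optional


def build_command(bed: str, fasta: str, repeat: str,
                  output: Optional[str] = None) -> List[str]:
    """Assemble the sorting_hat command line as an argument list."""
    cmd = ["sorting_hat", "--bed", bed, "--fasta", fasta, "--repeat", repeat]
    if output is not None:
        # Without --output, sorting_hat prints its results to stdout.
        cmd += ["--output", output]
    return cmd


def run_sorting_hat(bed: str, fasta: str, repeat: str,
                    output: Optional[str] = None) -> subprocess.CompletedProcess:
    """Run sorting_hat and raise CalledProcessError if it exits non-zero."""
    cmd = build_command(bed, fasta, repeat, output)
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError("sorting_hat was not found on PATH")
    return subprocess.run(cmd, check=True)
```

Calling `run_sorting_hat("test.bed", "test.fasta", "repeat_masker.txt")` would correspond to the example run above.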
Usage
sorting_hat [-h] -b BED -f FASTA -r REPEAT [-o OUTPUT]
Sorts indels into mutational classes
- -b BED, --bed BED
Location of BED file with all variants. Must be formatted as Chrom/Start/End/Ref/Alt/PatientID.
- -f FASTA, --fasta FASTA
Location of reference fasta file.
- -r REPEAT, --repeat REPEAT
Location of RepeatMasker file downloaded from UCSC Genome Browser. Refer to docs to see how to download RepeatMasker.
- -o OUTPUT, --output OUTPUT
Name of the output file. If omitted, results are printed to stdout.
To download the RepeatMasker file from the UCSC Genome Browser, see the photos in the ‘data’ folder on GitHub: https://github.com/allisonseiden/sorting_hat
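Because `--bed` expects the column order Chrom/Start/End/Ref/Alt/PatientID, it can help to check an input file before running the tool. The sketch below assumes tab-separated columns with no header row, as is conventional for BED files; the README does not state the delimiter. `Variant` and `parse_bed_line` are hypothetical names, and the sample values in the usage note are made up for illustration.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    """One row of the BED input, in the column order given above."""
    chrom: str
    start: int
    end: int
    ref: str
    alt: str
    patient_id: str


def parse_bed_line(line: str) -> Variant:
    """Parse one line, assuming tab-separated columns and no header."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 6:
        raise ValueError(
            f"expected 6 columns (Chrom/Start/End/Ref/Alt/PatientID), got {len(fields)}"
        )
    chrom, start, end, ref, alt, patient_id = fields
    # int() raises ValueError if Start or End is not a whole number.
    return Variant(chrom, int(start), int(end), ref, alt, patient_id)
```

For instance, `parse_bed_line("chr1\t100\t101\tA\tAT\tP001\n")` yields a `Variant` with `start=100` and `patient_id="P001"`, while a line with a missing column raises `ValueError`.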
Allison Seiden <ahseiden@gmail.com>
Built Distribution
Hashes for sorting_hat-0.0.4-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 96a256d10675dc8ce6fcdfe39e3acd07f72a8547a5166bff78645c73327169b6
MD5 | 987f6ea5a83b28bbbb32424faa20b2bd
BLAKE2b-256 | 9ef73c988c8445981a0a86fa9aec07ffca0dd5f8cd43d66f67cd59e01f66737f