Sorts indels into mutational classes
- Sorts indels into classes defined as follows:
  - homopolymer run (HR): the mutation lies in a region with 6 or more copies of the nucleotide being inserted or deleted
  - change in copy count (CCC): the allele being inserted or deleted is repeated 1 or more times in the mutation region
  - no change in copy count (non-CCC): the allele being inserted or deleted is not repeated in the mutation region
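The three definitions above can be read as a small decision rule. The following is a minimal sketch of that rule, not the sorting_hat implementation. It assumes the "mutation region" is the reference sequence immediately following the indel site, that copies are counted as back-to-back repeats in that sequence, and that HR applies only to single-nucleotide alleles. The names `count_tandem_copies`, `classify_indel`, `flank` and `hr_min_copies` are hypothetical.

```python
def count_tandem_copies(allele: str, flank: str) -> int:
    """Count back-to-back copies of `allele` at the start of `flank`."""
    copies = 0
    step = len(allele)
    # An empty allele (step == 0) never matches, so the loop is skipped.
    while step and flank.startswith(allele, copies * step):
        copies += 1
    return copies


def classify_indel(allele: str, flank: str, hr_min_copies: int = 6) -> str:
    """Return "HR", "CCC" or "non-CCC" for an inserted or deleted allele.

    Assumption: `flank` is the reference sequence right after the indel site.
    """
    allele, flank = allele.upper(), flank.upper()
    copies = count_tandem_copies(allele, flank)
    # Assumption: a homopolymer run is a single nucleotide repeated
    # at least `hr_min_copies` times in the flank.
    if len(allele) == 1 and copies >= hr_min_copies:
        return "HR"
    if copies >= 1:
        return "CCC"
    return "non-CCC"
```

For example, `classify_indel("CA", "CACAGT")` finds two copies of `CA` in the flank and returns `"CCC"`, while `classify_indel("CA", "GTTTAC")` finds none and returns `"non-CCC"`.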
- In order to use sorting_hat, you must ensure the following are installed:
To install, use pip:
pip install sorting_hat
Example run
sorting_hat --bed test.bed \
--fasta test.fasta \
--repeat repeat_masker.txt
Usage
sorting_hat [-h] -b BED -f FASTA -r REPEAT [-o OUTPUT]
Sorts indels into mutational classes
- -b BED, --bed BED
Location of the BED file containing all variants. Columns must be Chrom/Start/End/Ref/Alt/PatientID.
- -f FASTA, --fasta FASTA
Location of reference fasta file.
- -r REPEAT, --repeat REPEAT
Location of the RepeatMasker file downloaded from the UCSC Genome Browser. Refer to the docs for download instructions.
- -o OUTPUT, --output OUTPUT
Name of the output file. If omitted, results are printed to stdout.
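To check that a BED file matches the Chrom/Start/End/Ref/Alt/PatientID layout before running sorting_hat, a row can be parsed as in the sketch below. It assumes tab-delimited columns with integer Start and End values, which is the usual BED convention but is not stated here. `Variant` and `parse_bed_line` are hypothetical helpers, not part of sorting_hat.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    chrom: str
    start: int
    end: int
    ref: str
    alt: str
    patient_id: str


def parse_bed_line(line: str) -> Variant:
    """Parse one six-column BED row (assumed to be tab-delimited)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 6:
        raise ValueError(f"expected 6 columns, found {len(fields)}")
    chrom, start, end, ref, alt, patient_id = fields
    # int() raises ValueError if Start or End is not numeric.
    return Variant(chrom, int(start), int(end), ref, alt, patient_id)
```

A row such as `chr1<TAB>100<TAB>101<TAB>A<TAB>-<TAB>P001` parses into a `Variant` with `start == 100`. The `-` placeholder for an empty allele is only an illustration; check how your variant caller encodes indels.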
To download RepeatMasker from UCSC Genome Browser:
Allison Seiden <ahseiden@gmail.com>