Mutational signatures attribution and decomposition tool
SigProfilerAssignment
SigProfilerAssignment is a new mutational attribution and decomposition tool that performs the following functions:
- Attributing a known set of mutational signatures to an individual sample or multiple samples.
- Decomposing de novo signatures into signatures from the COSMIC signature database.
- Attributing COSMIC database or a custom signature database to given samples.
The tool identifies the activity of each signature in the sample and assigns, for each mutation type in the sample, the probability that each signature caused it. The tool makes use of SigProfilerMatrixGenerator, SigProfilerExtractor and SigProfilerPlotting.
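To make the idea of per-mutation-type probabilities concrete, here is a minimal sketch. It assumes the probability that a signature caused a mutation type is proportional to the signature's activity multiplied by the signature's proportion for that mutation type. The function name `signature_probabilities` and the toy data are hypothetical and are not part of the SigProfilerAssignment API; the tool's own calculation may differ.

```python
def signature_probabilities(signatures, activities):
    """Return {mutation_type: {signature: probability}} for one sample.

    signatures: {signature: {mutation_type: proportion}} (hypothetical layout)
    activities: {signature: number of mutations attributed to it in the sample}
    Assumption: probability is proportional to activity * proportion.
    """
    mutation_types = sorted({m for profile in signatures.values() for m in profile})
    result = {}
    for m in mutation_types:
        # Expected number of mutations of type m contributed by each signature.
        weights = {s: signatures[s].get(m, 0.0) * activities.get(s, 0.0)
                   for s in signatures}
        total = sum(weights.values())
        result[m] = {s: (w / total if total > 0 else 0.0)
                     for s, w in weights.items()}
    return result


# Toy data with two signatures and two mutation types (illustrative only).
signatures = {
    "SigA": {"C>A": 0.5, "C>T": 0.5},
    "SigB": {"C>A": 0.1, "C>T": 0.9},
}
activities = {"SigA": 100, "SigB": 300}
probs = signature_probabilities(signatures, activities)
print(probs["C>A"])  # {'SigA': 0.625, 'SigB': 0.375}
```

For each mutation type the probabilities sum to 1 whenever at least one active signature can produce that type.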
Installation
To install from PyPI in a new conda environment:
$ pip install SigProfilerAssignment
To install from source: git clone this repo or download the zip file. Unzip the contents of SigProfilerAssignment-master.zip or the zip file of the corresponding branch.
$ cd SigProfilerAssignment-master
$ pip install .
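After either installation route, a quick way to confirm that the package is visible to the active Python environment is to look it up without importing it. This is a small sketch using only the standard library; the helper name `is_installed` is hypothetical.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found in this environment."""
    return importlib.util.find_spec(module_name) is not None


print("SigProfilerAssignment importable:", is_installed("SigProfilerAssignment"))
```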
Signature Subtypes
signature_subgroups = ['remove_MMR_deficiency_signatures',
'remove_POL_deficiency_signatures',
'remove_HR_deficiency_signatures',
'remove_BER_deficiency_signatures',
'remove_Chemotherapy_signatures',
'remove_Immunosuppressants_signatures',
'remove_Treatment_signatures',
'remove_APOBEC_signatures',
'remove_Tobacco_signatures',
'remove_UV_signatures',
'remove_AA_signatures',
'remove_Colibactin_signatures',
'remove_Artifact_signatures',
'remove_Lymphoid_signatures']
Signature Subgroup | SBS Signatures that are excluded |
---|---|
MMR_deficiency_signatures | 6, 14, 15, 20, 21, 26, 44 |
POL_deficiency_signatures | 10a, 10b, 10c, 10d, 28 |
HR_deficiency_signatures | 3 |
BER_deficiency_signatures | 30, 36 |
Chemotherapy_signatures | 11, 25, 31, 35, 86, 87, 90 |
Immunosuppressants_signatures | 32 |
Treatment_signatures | 11, 25, 31, 32, 35, 86, 87, 90 |
APOBEC_signatures | 2, 13 |
Tobacco_signatures | 4, 29, 92 |
UV_signatures | 7a, 7b, 7c, 7d, 38 |
AA_signatures | 22 |
Colibactin_signatures | 88 |
Artifact_signatures | 27, 43, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60 |
Lymphoid_signatures | 9, 84, 85 |
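The table above can be expressed as a lookup to preview which signatures a given `signature_subgroups` list would exclude. This is only an illustrative sketch: the helper `excluded_signatures`, the `EXCLUDED_SBS` dictionary, and the `SBS` name prefix are assumptions made here, and how the tool applies the exclusion internally is not shown in this document.

```python
# Subgroup -> excluded SBS signature numbers, copied from the table above.
EXCLUDED_SBS = {
    "MMR_deficiency_signatures": ["6", "14", "15", "20", "21", "26", "44"],
    "POL_deficiency_signatures": ["10a", "10b", "10c", "10d", "28"],
    "HR_deficiency_signatures": ["3"],
    "BER_deficiency_signatures": ["30", "36"],
    "Chemotherapy_signatures": ["11", "25", "31", "35", "86", "87", "90"],
    "Immunosuppressants_signatures": ["32"],
    "Treatment_signatures": ["11", "25", "31", "32", "35", "86", "87", "90"],
    "APOBEC_signatures": ["2", "13"],
    "Tobacco_signatures": ["4", "29", "92"],
    "UV_signatures": ["7a", "7b", "7c", "7d", "38"],
    "AA_signatures": ["22"],
    "Colibactin_signatures": ["88"],
    "Artifact_signatures": ["27"] + [str(n) for n in range(43, 61)],
    "Lymphoid_signatures": ["9", "84", "85"],
}


def excluded_signatures(signature_subgroups):
    """Return the de-duplicated list of SBS names removed by the given flags."""
    excluded = []
    for flag in signature_subgroups or []:
        # Flags look like 'remove_<subgroup>'; strip the prefix to get the table key.
        key = flag[len("remove_"):] if flag.startswith("remove_") else flag
        if key not in EXCLUDED_SBS:
            raise ValueError(f"Unknown signature subgroup: {flag}")
        for number in EXCLUDED_SBS[key]:
            name = "SBS" + number  # assumed naming convention
            if name not in excluded:
                excluded.append(name)
    return excluded


print(excluded_signatures(["remove_APOBEC_signatures", "remove_UV_signatures"]))
```

Because Treatment_signatures is the union of the chemotherapy and immunosuppressant groups, passing both `remove_Chemotherapy_signatures` and `remove_Treatment_signatures` excludes the same eight signatures as the treatment flag alone.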
Decompose Fit
Decomposes de novo signatures into COSMIC signatures and assigns the COSMIC signatures to samples.
from SigProfilerAssignment import Analyzer as Analyze
Analyze.decompose_fit(samples,
output,
signatures=signatures,
signature_database=sigs,
genome_build="GRCh37",
verbose=False,
new_signature_thresh_hold=0.8,
signature_subgroups=signature_subgroups,
exome=False)
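The `new_signature_thresh_hold` argument is expressed in terms of cosine similarity. As a rough sketch of that measure, the code below compares a de novo profile with a reconstruction built from known signatures. The decision rule (flagging a signature as new when the similarity falls below the threshold) is an assumption made for illustration, and both function names are hypothetical rather than part of the tool.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    if len(a) != len(b):
        raise ValueError("Vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_a * norm_b)


def looks_like_new_signature(denovo, reconstruction, threshold=0.8):
    """Assumed rule: 'new' if the reconstruction is not similar enough."""
    return cosine_similarity(denovo, reconstruction) < threshold


# Toy four-channel profiles (real SBS96 profiles have 96 channels).
denovo = [0.4, 0.3, 0.2, 0.1]
good_fit = [0.38, 0.32, 0.2, 0.1]
poor_fit = [0.0, 0.1, 0.2, 0.7]
print(looks_like_new_signature(denovo, good_fit))
print(looks_like_new_signature(denovo, poor_fit))
```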
De Novo Fit
Attributes the mutations of the given samples to the input de novo signatures.
from SigProfilerAssignment import Analyzer as Analyze
Analyze.denovo_fit( samples,
output,
signatures=signatures,
signature_database=sigs,
genome_build="GRCh37",
verbose=False)
COSMIC Fit
Attributes the mutations of the given samples to the input COSMIC signatures. Note that the penalties associated with de novo fits and COSMIC fits are different.
from SigProfilerAssignment import Analyzer as Analyze
Analyze.cosmic_fit( samples,
output,
signatures=None,
signature_database=sigs,
genome_build="GRCh37",
verbose=False,
collapse_to_SBS96=False,
make_plots=True,
signature_subgroups=signature_subgroups,
exome=False
)
Main Parameters
Parameter | Variable Type | Parameter Description |
---|---|---|
samples | String | Path to a tab-delimited file that contains the samples table, where the rows are mutation types and the columns are sample IDs. |
output | String | Path to the output folder. |
signatures | String | Path to a tab-delimited file that contains the signature table, where the rows are mutation types and the columns are signature IDs. |
genome_build | String | The reference genome build. List of supported genomes: "GRCh37", "GRCh38", "mm9", "mm10" and "rn6". The default value is "GRCh37". If the selected genome is not in the supported list, the default genome will be used. |
cosmic_version | Float | Defines the version of the COSMIC reference signatures. Takes one of 1, 2, 3, 3.1, 3.2 and 3.3. The default value is 3.3. |
new_signature_thresh_hold | Float | Cosine similarity threshold used to declare a new signature. Applicable for decompose_fit only. The default value is 0.8. |
make_plots | Boolean | Toggles the creation and saving of all plots. The default value is True. |
signature_subgroups | List | Removes the signatures corresponding to specific subtypes for better fitting. The usage is given above. Default value is None. |
exome | Boolean | Defines if the exome renormalized signatures will be used. The default value is False. |
verbose | Boolean | Prints statements. Default value is False. |
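The `samples` and `signatures` inputs are both tab-delimited tables with mutation types as rows and IDs as columns. The sketch below writes and re-reads a tiny table in that layout using only the standard library. The corner header `MutationType`, the three mutation-type labels, and the helper names are assumptions for illustration; check the example files shipped with the package for the exact header the tool expects.

```python
import csv
import os
import tempfile


def write_matrix(path, row_labels, column_ids, counts, corner="MutationType"):
    """Write a tab-delimited table: one row per mutation type, one column per ID."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t")
        writer.writerow([corner] + list(column_ids))
        for label, row in zip(row_labels, counts):
            writer.writerow([label] + list(row))


def read_matrix(path):
    """Read the table back as (column_ids, row_labels, integer counts)."""
    with open(path, newline="") as handle:
        rows = list(csv.reader(handle, delimiter="\t"))
    header, body = rows[0], rows[1:]
    column_ids = header[1:]
    row_labels = [row[0] for row in body]
    values = [[int(v) for v in row[1:]] for row in body]
    return column_ids, row_labels, values


# Toy data: three mutation types and two samples (a real SBS96 table has 96 rows).
mutation_types = ["A[C>A]A", "A[C>A]C", "A[C>A]G"]
sample_ids = ["Sample1", "Sample2"]
counts = [[12, 3], [7, 0], [1, 5]]

path = os.path.join(tempfile.mkdtemp(), "Samples.txt")
write_matrix(path, mutation_types, sample_ids, counts)
columns, rows, values = read_matrix(path)
print(columns, rows, values)
```

Reading a file back this way is a cheap check that rows and columns are not transposed before the path is passed as `samples`.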
SPA analysis Example
#import modules
import SigProfilerAssignment as spa
from SigProfilerAssignment import Analyzer as Analyze
#set directories and paths to signatures and samples
dir_inp = spa.__path__[0]+'/data/Examples/'
samples = dir_inp+"Input_scenario_8/Samples.txt"
output = "output_example/"
signatures = dir_inp+"Results_scenario_8/SBS96/All_Solutions/SBS96_3_Signatures/Signatures/SBS96_S3_Signatures.txt"
sigs = "COSMIC_v3_SBS_GRCh37_noSBS84-85.txt" #Custom Signature Database
#Analysis of SP Assignment
Analyze.cosmic_fit( samples,
output,
signatures=None,
signature_database=sigs,
genome_build="GRCh37",
cosmic_version=3.3,
verbose=False,
collapse_to_SBS96=False,
make_plots=True,
signature_subgroups=None,
exome=False)
Copyright
This software and its documentation are copyright 2022 as a part of the SigProfiler project. The SigProfilerAssignment framework is free software and is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
Contact Information
Please address any queries or bug reports to Raviteja Vangara at rvangara@health.ucsd.edu
Hashes for SigProfilerAssignment-0.0.10.tar.gz
Algorithm | Hash digest |
---|---|
SHA256 | 1bebcfe15af21bc5018b73e3a33c8ed9d8d16337dc8ea47c30f5e429b48d3d3f |
MD5 | f36c36b098b8f8b9a11ff75b3964b983 |
BLAKE2b-256 | efe2fb950814a244e3783cd12bba4337ead95882b7adbf8b6100faae9281d8e8 |
Hashes for SigProfilerAssignment-0.0.10-py3-none-any.whl
Algorithm | Hash digest |
---|---|
SHA256 | 70ea2d01d6cd6c855e852cc98a94fe561ec06f256b54f32edc9ecc8f531f51fb |
MD5 | 87fed11b460a31ef5f8091b1fc13da01 |
BLAKE2b-256 | 377a96090afea439df630b92520cef7d519d8acda4727d2d36f54cd4e8f734b6 |