Mutational signatures attribution and decomposition tool
SigProfilerAssignment
SigProfilerAssignment is a new mutational attribution and decomposition tool that performs the following functions:
- Attributing a known set of mutational signatures to an individual sample or multiple samples.
- Decomposing de novo signatures into signatures from the COSMIC signature database.
- Attributing signatures from the COSMIC database or a custom signature database to given samples.
The tool identifies the activity of each signature in the sample and assigns the probability that each signature caused a specific mutation type in the sample. The tool makes use of SigProfilerMatrixGenerator, SigProfilerExtractor and SigProfilerPlotting.
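To make the idea of a per-mutation-type probability concrete, here is a minimal sketch in plain Python. It assumes the usual normalisation, where each signature's contribution to a mutation type is its profile value times its activity, divided by the total over all signatures. The function name `signature_probabilities` and the toy numbers are hypothetical, and this is not necessarily the exact routine the tool runs internally.

```python
def signature_probabilities(signatures, activities):
    """Return, for one sample, the probability that each signature caused
    a mutation of each type.

    signatures: list of rows, one per mutation type; each row holds the
                probability of that mutation type under each signature.
    activities: number of mutations attributed to each signature.
    """
    probabilities = []
    for row in signatures:
        # Expected number of mutations of this type contributed by each signature.
        contributions = [p * a for p, a in zip(row, activities)]
        total = sum(contributions)
        # Normalise so the values for one mutation type sum to 1.
        probabilities.append(
            [c / total if total > 0 else 0.0 for c in contributions]
        )
    return probabilities


# Toy data (hypothetical): 3 mutation types (rows) x 2 signatures (columns).
toy_signatures = [
    [0.7, 0.1],
    [0.2, 0.3],
    [0.1, 0.6],
]
toy_activities = [100, 50]  # mutations attributed to each signature

probs = signature_probabilities(toy_signatures, toy_activities)
for mutation_type, row in zip(["type_1", "type_2", "type_3"], probs):
    print(mutation_type, [round(p, 3) for p in row])
```

Each printed row sums to 1, so it can be read as "given a mutation of this type in this sample, how likely is each signature to be its source".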
Installation
To install from PyPI in a new conda environment:
$ pip install SigProfilerAssignment
To install from source, git clone this repo or download the zip file, then unzip the contents of SigProfilerAssignment-master.zip (or the zip file of the corresponding branch):
$ cd SigProfilerAssignment-master
$ pip install .
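After either installation route, a quick check from Python can confirm that the package is visible in the active environment. The sketch below uses only the standard library; the helper name `installed_version` is hypothetical, and the distribution name is assumed to match the one used in the `pip install` command above.

```python
from importlib import metadata


def installed_version(package="SigProfilerAssignment"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
if version is None:
    print("SigProfilerAssignment is not installed in this environment")
else:
    print(f"SigProfilerAssignment {version} is installed")
```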
Signature Subtypes
signature_subgroups = ['remove_MMR_deficiency_signatures',
                       'remove_POL_deficiency_signatures',
                       'remove_HR_deficiency_signatures',
                       'remove_BER_deficiency_signatures',
                       'remove_Chemotherapy_signatures',
                       'remove_Immunosuppressants_signatures',
                       'remove_Iatrogenic_signatures',
                       'remove_APOBEC_signatures',
                       'remove_Tobacco_signatures',
                       'remove_UV_signatures',
                       'remove_AA_signatures',
                       'remove_Colibactin_signatures',
                       'remove_Artifact_signatures',
                       'remove_Lymphoid_signatures']
Signature Subgroup | SBS Signatures that are excluded |
---|---|
MMR_deficiency_signatures | 6, 14, 15, 20, 21, 26, 44 |
POL_deficiency_signatures | 10a, 10b, 10c, 10d, 28 |
HR_deficiency_signatures | 3 |
BER_deficiency_signatures | 30, 36 |
Chemotherapy_signatures | 11, 25, 31, 35, 86, 87, 90 |
Immunosuppressants_signatures | 32 |
Iatrogenic_signatures | 11, 25, 31, 32, 35, 86, 87, 90 |
APOBEC_signatures | 2, 13 |
Tobacco_signatures | 4, 29, 92 |
UV_signatures | 7a, 7b, 7c, 7d, 38 |
AA_signatures | 22 |
Colibactin_signatures | 88 |
Artifact_signatures | 27, 43, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60 |
Lymphoid_signatures | 9, 84, 85 |
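The table above can be read as a lookup from each `remove_..._signatures` flag to the SBS signatures it excludes. The following sketch encodes the table as a dictionary and collects the excluded signatures for a given list of flags. The names `SUBGROUP_SIGNATURES` and `excluded_signatures` are hypothetical helpers for illustration; they are not part of the SigProfilerAssignment API.

```python
# Subgroup -> excluded SBS signatures, copied from the table above.
SUBGROUP_SIGNATURES = {
    "MMR_deficiency_signatures": ["6", "14", "15", "20", "21", "26", "44"],
    "POL_deficiency_signatures": ["10a", "10b", "10c", "10d", "28"],
    "HR_deficiency_signatures": ["3"],
    "BER_deficiency_signatures": ["30", "36"],
    "Chemotherapy_signatures": ["11", "25", "31", "35", "86", "87", "90"],
    "Immunosuppressants_signatures": ["32"],
    "Iatrogenic_signatures": ["11", "25", "31", "32", "35", "86", "87", "90"],
    "APOBEC_signatures": ["2", "13"],
    "Tobacco_signatures": ["4", "29", "92"],
    "UV_signatures": ["7a", "7b", "7c", "7d", "38"],
    "AA_signatures": ["22"],
    "Colibactin_signatures": ["88"],
    "Artifact_signatures": ["27", "43"] + [str(n) for n in range(45, 61)],
    "Lymphoid_signatures": ["9", "84", "85"],
}


def excluded_signatures(signature_subgroups):
    """Return the set of SBS signature IDs excluded by the given flags."""
    excluded = set()
    for flag in signature_subgroups or []:
        # Flags look like 'remove_APOBEC_signatures'; strip the 'remove_' prefix.
        subgroup = flag[len("remove_"):] if flag.startswith("remove_") else flag
        if subgroup not in SUBGROUP_SIGNATURES:
            raise ValueError(f"Unknown signature subgroup: {flag}")
        excluded.update(SUBGROUP_SIGNATURES[subgroup])
    return excluded


flags = ["remove_APOBEC_signatures", "remove_Tobacco_signatures"]
print(excluded_signatures(flags))
```

According to the table, the Iatrogenic subgroup is the union of the Chemotherapy and Immunosuppressants subgroups, so passing all three flags excludes the same signatures as passing `remove_Iatrogenic_signatures` alone.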
Decompose Fit
Decomposes the de novo signatures into COSMIC signatures and assigns the COSMIC signatures to samples.
from SigProfilerAssignment import Analyzer as Analyze
Analyze.decompose_fit(samples,
                      output,
                      signatures=signatures,
                      signature_database=sigs,
                      genome_build="GRCh37",
                      verbose=False,
                      new_signature_thresh_hold=0.8,
                      signature_subgroups=signature_subgroups)
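The `new_signature_thresh_hold` argument is described in the parameter table as a cosine similarity threshold for declaring a new signature. The sketch below shows how cosine similarity between two signature profiles is computed. The decision rule in `is_new_signature` is an assumption made for illustration (a de novo signature counts as new when its similarity to its best reconstruction falls below the threshold); both function names and the toy profiles are hypothetical and not taken from the library.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length, non-negative profiles."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # similarity is undefined for an all-zero profile
    return dot / (norm_u * norm_v)


def is_new_signature(de_novo, reconstruction, threshold=0.8):
    # Assumed rule: a poor reconstruction (similarity below the threshold)
    # means the de novo signature is not explained by known signatures.
    return cosine_similarity(de_novo, reconstruction) < threshold


# Toy three-channel profiles (real SBS96 profiles have 96 channels).
de_novo = [0.5, 0.3, 0.2]
close_reconstruction = [0.45, 0.35, 0.2]
poor_reconstruction = [0.0, 0.1, 0.9]

print(is_new_signature(de_novo, close_reconstruction))
print(is_new_signature(de_novo, poor_reconstruction))
```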
De Novo Fit
Attributes mutations of the given samples to the input de novo signatures.
from SigProfilerAssignment import Analyzer as Analyze
Analyze.denovo_fit(samples,
                   output,
                   signatures=signatures,
                   signature_database=sigs,
                   genome_build="GRCh37",
                   verbose=False)
COSMIC Fit
Attributes mutations of the given samples to the input COSMIC signatures. Note that the penalties associated with the de novo fit and the COSMIC fit are different.
from SigProfilerAssignment import Analyzer as Analyze
Analyze.cosmic_fit(samples,
                   output,
                   signatures=None,
                   signature_database=sigs,
                   genome_build="GRCh37",
                   verbose=False,
                   collapse_to_SBS96=False,
                   signature_subgroups=signature_subgroups,
                   make_plots=True)
Main Parameters
Parameter | Variable Type | Parameter Description |
---|---|---|
signatures | String | Path to a tab-delimited file that contains the signature table, where the rows are mutation types and the columns are signature IDs. |
activities | String | Path to a tab-delimited file that contains the activity table, where the rows are sample IDs and the columns are signature IDs. |
samples | String | Path to a tab-delimited file that contains the samples table, where the rows are mutation types and the columns are sample IDs. |
output | String | Path to the output folder. |
genome_build | String | The genome build. Examples: "GRCh37", "GRCh38", "mm9", "mm10". The default value is "GRCh37". |
new_signature_thresh_hold | Float | Cosine similarity threshold used to declare a new signature. Applicable to decompose fit only. The default value is 0.8. |
make_plots | Boolean | Toggles making and saving all plots on or off. Default value is True. |
signature_subgroups | List | Removes the signatures corresponding to specific subgroups for better fitting. The usage is given above. Default value is None. |
verbose | Boolean | Prints status messages while running. Default value is False. |
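The `samples` file described in the table is a tab-delimited matrix with mutation types as rows and sample IDs as columns. The sketch below writes and reads back a tiny file of that shape using only the standard library. The header label `MutationType`, the sample IDs, the counts, and the helper names are assumptions chosen for illustration; a real SBS96 input has 96 mutation-type rows.

```python
import csv
import os
import tempfile


def write_matrix(path, row_label, column_ids, rows):
    """Write a tab-delimited matrix: a header line, then one line per row ID."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t")
        writer.writerow([row_label] + list(column_ids))
        for row_id, values in rows.items():
            writer.writerow([row_id] + list(values))


def read_matrix(path):
    """Read the matrix back as {row_id: {column_id: integer count}}."""
    with open(path, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        column_ids = next(reader)[1:]
        return {
            row[0]: {col: int(value) for col, value in zip(column_ids, row[1:])}
            for row in reader
        }


# Hypothetical mutation counts for two samples and three mutation types.
sample_ids = ["Sample_A", "Sample_B"]
counts = {
    "A[C>A]A": [12, 3],
    "A[C>A]C": [7, 0],
    "A[C>A]G": [1, 5],
}

samples_path = os.path.join(tempfile.mkdtemp(), "Samples.txt")
write_matrix(samples_path, "MutationType", sample_ids, counts)
print(read_matrix(samples_path))
```

The signature and activity tables can be laid out the same way, with signature IDs as columns and, for activities, sample IDs as rows.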
SPA analysis Example
# import modules
import SigProfilerAssignment as spa
from SigProfilerAssignment import Analyzer as Analyze
# set directories and paths to signatures and samples
dir_inp = spa.__path__[0]+'/data/Examples/'
signatures = dir_inp+"Results_scenario_8/SBS96/All_Solutions/SBS96_3_Signatures/Signatures/SBS96_S3_Signatures.txt"
activities = dir_inp+"Results_scenario_8/SBS96/All_Solutions/SBS96_3_Signatures/Activities/SBS96_S3_NMF_Activities.txt"
samples = dir_inp+"Input_scenario_8/Samples.txt"
output = "output_example/"
sigs = "COSMIC_v3_SBS_GRCh37_noSBS84-85.txt"  # custom signature database
# None is the default; alternatively use a list of 'remove_..._signatures'
# flags as shown under Signature Subtypes
signature_subgroups = None
# run the COSMIC fit
Analyze.cosmic_fit(samples,
                   output,
                   signatures=None,
                   signature_database=sigs,
                   genome_build="GRCh37",
                   verbose=False,
                   collapse_to_SBS96=False,
                   signature_subgroups=signature_subgroups,
                   make_plots=True)
Copyright
This software and its documentation are copyright 2022 as a part of the SigProfiler project. The SigProfilerAssignment framework is free software and is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
Contact Information
Please address any queries or bug reports to Raviteja Vangara at rvangara@health.ucsd.edu
Hashes for SigProfilerAssignment-0.0.8.tar.gz
Algorithm | Hash digest |
---|---|
SHA256 | cdc6962e00396fedb153e494af491afb7d3e2412f499c23cc27299648eb7f96f |
MD5 | 7b17165091094731f596be56c07efcae |
BLAKE2b-256 | dd2fb1f51b32baa819c71f8f872cf621fa6a3b60c7bb151f19b2afee6acf89a2 |
Hashes for SigProfilerAssignment-0.0.8-py3-none-any.whl
Algorithm | Hash digest |
---|---|
SHA256 | 8925b5647e80a32fdef877b6065d929a97ab348b7cbaaec502262ee2a5b2c8aa |
MD5 | 57823162f8fef9670cbf4dc8e893d800 |
BLAKE2b-256 | b5e6bf7281f0f954a6108054031ca43038d77c11074c97615710dfe0d6153edb |