
Package to manipulate mutational processes.

SigNet

SigNet is a package to study genetic mutational processes. Check out our theoretical background page for further information on this topic. As of now, it contains 3 solutions: SigNet Refitter, SigNet Detector and SigNet Generator.

This is the official code implementation of the paper: Mutational signature decomposition with deep neural networks reveals origins of clock-like processes and hypoxia dependencies. By Claudia Serrano, Oleguer Canal, et al.

Readme contents

You can use SigNet in 3 different ways depending on your workflow:

  1. Python Package

    1. Python Package Installation
    2. Python Package Usage
  2. Command Line Interface (CLI)

  3. Source Code

    1. Downloading Source Code
    2. Code-Basics

Python Package

Recommended if you want to integrate SigNet into your Python workflow, or if you intend to re-train models on custom data with limited ANN architectural changes. You can install the Python package by running:

pip install signaturesnet

Once installed, you can run SigNet Refitter like so:

import pandas as pd
from signaturesnet.modules.signet_module import SigNet

# Read your mutational data
mutations = pd.read_csv("your_input", header=0, index_col=0)

# Load & Run signet
signet = SigNet(opportunities_name_or_path="your_normalization_file")
results = signet(mutation_dataset=mutations)

# Extract results
w, u, l, c, _ = results.get_output()

# Store results
results.save(path='Output', name="this_experiment_filename")

# Plot figures
results.plot_results(save=True)

For more usage examples, check out the examples folder.

NOTE: It is recommended that you work in a dedicated Python virtual environment to avoid package version mismatches.
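
The `pd.read_csv(..., header=0, index_col=0)` call above expects a table with a header row and the sample identifiers in the first column. As a rough sketch, assuming the counts follow the standard 96 trinucleotide mutation categories (6 pyrimidine-centred substitutions, each with 4 possible upstream and 4 possible downstream bases), a toy input file could be generated as follows. The `A[C>A]A` label style, the column order, the sample names and the helper names are assumptions made for illustration and are not confirmed by SigNet; please refer to Mutations Input for the exact layout.

```python
import csv
import io
import itertools
import random

BASES = "ACGT"
SUBSTITUTIONS = ["C>A", "C>G", "C>T", "T>A", "T>C", "T>G"]


def mutation_categories():
    """Return the 96 labels, e.g. 'A[C>A]A' (label style is an assumption)."""
    return [
        f"{left}[{sub}]{right}"
        for sub, left, right in itertools.product(SUBSTITUTIONS, BASES, BASES)
    ]


def toy_counts_csv(sample_names, seed=0):
    """Build CSV text: one row per sample, one column per mutation category."""
    rng = random.Random(seed)
    categories = mutation_categories()
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["sample"] + categories)  # header row (header=0)
    for name in sample_names:
        # Random placeholder counts; real data comes from your mutation calls.
        writer.writerow([name] + [rng.randint(0, 20) for _ in categories])
    return buffer.getvalue()


csv_text = toy_counts_csv(["sample_1", "sample_2"])
# To save it: open("toy_counts.csv", "w").write(csv_text)
```

A file written this way has the sample names as its first column, so it can be read back with the same `pd.read_csv` arguments used above.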

Command Line Interface

Recommended if you are only interested in running the SigNet modules independently and do not intend to retrain models or change the source code.
NOTE: This option is only tested on Debian-based Linux distributions. Steps:

  1. Download the signaturesnet executable
  2. Change directory to wherever you downloaded it: cd <wherever/you/downloaded/the/executable/>
  3. Make it executable by your user: sudo chmod u+x signaturesnet

Refitter:

The following example shows how to use SigNet Refitter.

cd <wherever/you/downloaded/the/executable/>
./signaturesnet refitter  [--input_format {counts, bed, vcf}]
                   [--input_data INPUTFILE]
                   [--reference_genome REFGENOME]
                   [--normalization {None, exome, genome, PATH_TO_ABUNDANCES}] 
                   [--only_nnls ONLYNNLS]
                   [--cutoff CUTOFF]
                   [--output OUTPUT]
                   [--plot_figs False]
  • --input_format: Name of the format of the input. The default is 'counts'. Please refer to Mutations Input for further details.

  • --input_data: Path to the file containing the mutational counts. Please refer to Mutations Input for further details.

  • --reference_genome: Name or path to the reference genome. Needed when input_format is bed or vcf.

  • --normalization: Since the INPUTFILE contains raw counts, they need to be normalized according to the abundance of each trinucleotide in the genomic region where the mutations are counted.

    • Choose None (default): If you don't want any normalization.
    • Choose exome: If the data that is being input comes from Whole Exome Sequencing. This will normalize the counts according to the trinucleotide abundances in the exome.
    • Choose genome: If the data comes from Whole Genome Sequencing.
    • Set a PATH_TO_ABUNDANCES to use a custom normalization file. Please refer to Normalization Input for further details on the input format.
  • --only_nnls: Whether to use NNLS mode only (the finetuner is not run). Default: False.

  • --cutoff: Cutoff to be applied to the final weights. Default: 0.01.

  • --output: Path to the folder where all the output files (weight guesses and figures) will be stored. By default, this folder is called "Output" and is created in the current directory. Please refer to SigNet Refitter Output for further details on the output format.

  • --plot_figs: Whether to generate output plots or not. Possible options are True or False.
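
To make the --normalization and --cutoff options more concrete, here is a minimal pure-Python sketch of the general idea: each count is divided by the relative abundance of its trinucleotide context and the result is rescaled to sum to one, and weights below the cutoff are set to zero. This only illustrates the concept. The exact formulas SigNet applies (for instance, whether the weights are rescaled after the cutoff) are not described here, and the function names are hypothetical.

```python
def normalize_counts(counts, abundances):
    """Divide each count by its context's relative abundance, then rescale to sum to 1.

    counts and abundances are equal-length lists, one entry per mutation
    category. All abundances are assumed to be strictly positive.
    """
    if len(counts) != len(abundances):
        raise ValueError("counts and abundances must have the same length")
    total_abundance = sum(abundances)
    corrected = [c / (a / total_abundance) for c, a in zip(counts, abundances)]
    total = sum(corrected)
    if total == 0:
        return [0.0] * len(counts)
    return [value / total for value in corrected]


def apply_cutoff(weights, cutoff=0.01):
    """Zero out weights below the cutoff (0.01 mirrors the CLI default)."""
    return [w if w >= cutoff else 0.0 for w in weights]


# Toy example with 3 categories instead of 96.
profile = normalize_counts([10, 10, 20], [1.0, 2.0, 1.0])
weights = apply_cutoff([0.6, 0.395, 0.005])
```

In the toy example, the second category is twice as abundant as the others, so its 10 mutations contribute less to the normalized profile than the 10 mutations of the first category.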

Detector:

cd <wherever/you/downloaded/the/executable/>
./signaturesnet detector  [--input_data INPUTFILE]
                   [--normalization {None, exome, genome, PATH_TO_ABUNDANCES}] 
                   [--output OUTPUT]

(The arguments have the same meaning as for the Refitter.)

Generator:

cd <wherever/you/downloaded/the/executable/>
./signaturesnet generator  [--n_datapoints INT]
                    [--output OUTPUT]
  • --n_datapoints: Number of signature weight combinations to generate.
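
If you prefer to launch the executable from a Python script, the calls documented above can be assembled as an argument list for `subprocess.run`. The following is a sketch under stated assumptions: `build_command` is a hypothetical helper, the executable location is a placeholder, and the helper does not check that a flag is valid for the chosen module, so only pass flags listed in this README.

```python
import subprocess

MODULES = {"refitter", "detector", "generator"}


def build_command(module, executable="./signaturesnet", **flags):
    """Assemble a command line such as './signaturesnet generator --n_datapoints 100'."""
    if module not in MODULES:
        raise ValueError(f"unknown module: {module}")
    command = [executable, module]
    for name, value in flags.items():
        # Each keyword argument becomes '--name value', in the order given.
        command += [f"--{name}", str(value)]
    return command


command = build_command("generator", n_datapoints=100, output="Output")
print(" ".join(command))
# To actually run it (requires the downloaded executable):
# subprocess.run(command, check=True)
```

Passing the arguments as a list rather than a single string avoids shell quoting issues with paths that contain spaces.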

Source Code

This is the option that gives the most flexibility. Recommended if you want to experiment with the code, re-train custom models or contribute to the project.

Downloading Source Code

Clone the repo and install it as an editable pip package like so:

git clone git@github.com:weghornlab/SigNet.git
cd SigNet
pip install -e .

Refer here for the project code organization.
