mEdit

What is mEdit?

Program Structure

Features

  • Reference Human Genome
    • mEdit uses the RefSeq human genome reference GRCh38.p14
    • Alternatively, the user can provide a custom human assembly. [See db_set for details]
  • Alternative Genomes
    • mEdit can work with alternative genomes which are compared to the reference assembly
    • Pangenomes made public by the HPRC are built into mEdit and can be included in the analysis in 'standard' mode
  • Flexible editing tool selection
    • Several endonucleases and base-editors are built into mEdit and can be requested in any combination. [See options in guide_prediction].
    • Custom editing tools can also be ingested by mEdit. [See how to format custom editors in guide_prediction]

Getting Started

Prerequisites

PIP

  • Make sure gcc is installed
    sudo apt install gcc
    
  • Also make sure your pip is up to date
    python -m pip install --upgrade pip
    
    • or:
    apt install python3-pip
    

Anaconda

  • mEdit utilizes Anaconda to build its own environments under the hood.
  • Install Miniconda:
    bash Miniconda3-latest-<your-OS>.sh
    
  • Set up and update conda:
    conda update --all
    conda config --set channel_priority strict
    

Mamba

  • The officially supported way of installing Mamba is through Miniforge.
  • The Miniforge repository holds the minimal installers for Conda and Mamba specific to conda-forge.

Installation

  • mEdit is compatible with UNIX-based systems running on Intel processors and is available via PyPI:
pip install meditability

Running Tests

  • As a Snakemake-based application, mEdit supports dry runs.
  • A dry run evaluates the presence of supporting data, and I/O necessary for each process
  • All mEdit programs can be called with the --dry option

Usage

  • To obtain information on how to run mEdit and view its programs, execute it with the --help flag:
 medit --help
  • There are four programs available in the current version
    • db_set: Set up the necessary background data to run mEdit. This downloads ~7GB of data.
    • list: Prints the current set of editors available on mEdit.
    • guide_prediction: This program scans for potential guides for variants specified on the input by searching a diverse set of editors.
    • offtarget: Predicts off-target effects for the guides found.
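
Since every program accepts --dry, a small helper can assemble the dry-run command before anything is executed. This is a minimal sketch: the executable name "medit" is taken from the help example above, and the helper name dry_run_command and the file name queries.txt are hypothetical.

```python
import shlex

# Program names as listed above.
PROGRAMS = ("db_set", "list", "guide_prediction", "offtarget")


def dry_run_command(program, *extra_args):
    """Return the argv list for a dry run of one mEdit program."""
    if program not in PROGRAMS:
        raise ValueError(f"unknown mEdit program: {program!r}")
    # "--dry" is appended last so the caller's arguments stay in order.
    return ["medit", program, *extra_args, "--dry"]


# "queries.txt" is a placeholder input file.
cmd = dry_run_command("guide_prediction", "-i", "queries.txt", "-m", "fast")
print(shlex.join(cmd))
```

The returned list can be passed to subprocess.run(cmd, check=True) once mEdit is installed.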

1. Database Setup

$ mEdit db_set [-h] [-d DB_PATH] [-l] [-c CUSTOM_REFERENCE] [-t THREADS]
  • Database Setup retrieves the information and datasets required to run mEdit. The contents include the reference human genome, HPRC pangenome VCF files, RefSeq, MANE, ClinVar and more. See the database structure below.
Args | Description
-d DB_PATH | Provide the path where the "mEdit_database" directory will be created ahead of the analysis. Requires ~7GB of disk storage. [default: ./mEdit_database]
-l | Request the latest human genome reference as part of mEdit database unpacking. This is especially recommended when running predictions on private genome assemblies. [default: False]
-c CUSTOM_REFERENCE | Provide the path to a custom human reference genome in FASTA format. Chromosome annotation must follow a ">chrN" format (case sensitive).
-t THREADS | Provide the number of cores for parallel decompression of mEdit databases.
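
Before passing a custom reference with -c, it can help to check that every FASTA header follows the ">chrN" convention. The sketch below is not part of mEdit. It assumes "N" means 1-22, X, Y or M, which the text above does not spell out, and the function name nonconforming_headers is hypothetical.

```python
import re

# Assumption: "N" covers 1-22, X, Y and M; adjust the pattern for your assembly.
CHR_HEADER = re.compile(r"^>chr(?:[1-9]|1[0-9]|2[0-2]|X|Y|M)(?:\s|$)")


def nonconforming_headers(lines):
    """Return FASTA header lines that do not start with a case-sensitive '>chrN' name."""
    return [
        line.rstrip("\n")
        for line in lines
        if line.startswith(">") and not CHR_HEADER.match(line)
    ]


example = [
    ">chr1 primary assembly\n", "ACGT\n",
    ">Chr2\n", "ACGT\n",          # wrong case
    ">chrX\n", "ACGT\n",
    ">NC_000003.12\n", "ACGT\n",  # accession-style name
]
print(nonconforming_headers(example))  # ['>Chr2', '>NC_000003.12']
```

For a real file, pass the open file handle instead of the example list, for instance nonconforming_headers(open(path)).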

2. Editor List

mEdit list [-h] [-d DB_PATH]

As of version 0.2.8, mEdit stores 24 endonuclease editors and 29 base editors. The list program prints both sets along with the parameters used for guide prediction.

Output:

Available endonuclease editors:  
-----------------------------  
name: spCas9  
pam, pam_is_first: NGG, False  
guide_len: 20  
dsb_position: -3  
notes: requirements work for SpCas9-HF1, eSpCas9 1.1,spyCas9  
5'-xxxxxxxxxxxxxxxxxxxxNGG-3'  
-----------------------------
Args | Description
-d DB_PATH | Provide the path where the "mEdit_database" directory was created ahead of the analysis using the "db_set" program. [default: ./mEdit_database]
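
The editor parameters printed above (pam, pam_is_first, guide_len) are enough to illustrate how a PAM-based guide search can work in principle. The following is a minimal sketch and not mEdit's implementation: it scans only the given strand, and the function names pam_matches and find_guides are hypothetical.

```python
# IUPAC nucleotide codes mapped to the bases they allow.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def pam_matches(site, pam):
    """True if every base of site is allowed by the IUPAC code at the same PAM position."""
    return len(site) == len(pam) and all(b in IUPAC[p] for b, p in zip(site, pam))


def find_guides(seq, pam="NGG", guide_len=20, pam_is_first=False):
    """Yield (guide_start, guide, pam_site) for each PAM match on the given strand."""
    seq = seq.upper()
    total = guide_len + len(pam)
    for i in range(len(seq) - total + 1):
        window = seq[i:i + total]
        if pam_is_first:
            # PAM sits 5' of the guide.
            site, guide, start = window[:len(pam)], window[len(pam):], i + len(pam)
        else:
            # PAM sits 3' of the guide, as for spCas9.
            guide, site, start = window[:guide_len], window[guide_len:], i
        if pam_matches(site, pam):
            yield start, guide, site


seq = "TT" + "ACGTACGTACGTACGTACGT" + "TGG" + "AA"
print(list(find_guides(seq)))  # [(2, 'ACGTACGTACGTACGTACGT', 'TGG')]
```

A complete search would also scan the reverse complement and apply the cut-site distance filter described under guide_prediction.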

3. Guide Prediction

guide_prediction is the main program. It searches for guides given a list of variants. The pathogenic variants to be searched can come from the ClinVar database or be de novo variants. mEdit first generates variant-incorporated gRNAs using the reference human genome. If the user chooses "fast", the search ends with the human reference genome. If the user chooses "standard" or "vcf", mEdit also predicts the impact of alternative genomic variants found in either the pangenome or the user-provided VCF file.

Outputs:
FAST: Guide report table(s) of the variant-editable guides derived from the human reference genome, plus a gene table and a clinically relevant table based on the search.

STANDARD: The output given by FAST, as well as a summary of variants found near the target sites identified in the pangenome assemblies and a guide report with guides impacted (

VCF: The same results as the FAST search as well as

Required Input
Args | Description
-i QUERY_INPUT | Path to a plain text file containing the query (or set of queries) of variant(s) for mEdit analysis. See --qtype for formatting options.
-o OUTPUT | Path to the root directory where mEdit outputs will be stored. [default: mEdit_analysis_<jobtag>/]
-d DB_PATH | Provide the path where the "mEdit_database" directory was created ahead of the analysis using the "db_set" program. [default: ./mEdit_database]
-j JOBTAG | Provide the tag associated with the current mEdit job. mEdit will generate a random jobtag by default.
-m {fast,standard,vcf} | The MODE option determines how mEdit will run your job. [default = "standard"] [1-] "fast": finds and processes guides based only on one reference human genome. [2-] "standard": finds and processes guides based on a reference human genome assembly along with a diverse set of pangenomes from HPRC. [3-] "vcf": finds and processes guides based only on the reference human genome and a given VCF file. Requires a private VCF file that will be processed for guide prediction.
-v CUSTOM_VCF | Provide a gzip-compressed VCF file to run mEdit's vcf mode.
--qtype {hgvs,coord,gene,rsid} | Set the query type provided to mEdit. [default = "hgvs"] [1-] "hgvs": must at least contain the RefSeq identifier followed by ":" and the commonly used HGVS nomenclature. Example: NM_000518.5:c.114G>A [2-] "coord": must contain hg38 coordinates followed by (ALT>REF). Alleles must be on the plus strand. Example: chr11:5226778C>T [3-] "gene": gene name. [4-] "rsid": dbSNP ID.
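
The four query types differ enough in shape that a query file can be sanity-checked before a run. The patterns below are illustrative assumptions based only on the examples above. They are not mEdit's own validation, and the function name guess_qtype is hypothetical.

```python
import re

# Loose, illustrative patterns; real HGVS syntax is much richer than this.
PATTERNS = {
    "hgvs": re.compile(r"^[A-Z]{2}_\d+(?:\.\d+)?:[cgnmpr]\.\S+$"),
    "coord": re.compile(r"^chr(?:\d{1,2}|X|Y|M):\d+[ACGT]>[ACGT]$"),
    "rsid": re.compile(r"^rs\d+$"),
}


def guess_qtype(query):
    """Guess which --qtype value a single query line corresponds to."""
    query = query.strip()
    for qtype, pattern in PATTERNS.items():
        if pattern.match(query):
            return qtype
    # Anything else is treated as a gene name in this sketch.
    return "gene"


for q in ["NM_000518.5:c.114G>A", "chr11:5226778C>T", "rs334", "HBB"]:
    print(q, "->", guess_qtype(q))
```

Because the "gene" branch is only a fallback, a malformed HGVS or coordinate string would be reported as a gene name, so mixed results in one file are a hint to inspect it by hand.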
Optional Arguments
--editor editor_request {clinical, user_define_list, custom} | Delimits the set of editors to be used by mEdit. [default = "clinical"] Use the "medit list" prompt to access the arrays of editors currently supported in each category. [1-] "clinical": a short list of clinically relevant editors that are either in pre-clinical or clinical trials. [2-] "user_defined_list": one or more editors, given as a comma-separated list chosen from the "medit list" output. [3-] "custom": select guide search parameters. This requires a separate input of parameters: 'pam', 'pamISfirst', 'guidelen', 'dsb_pos'
--be {off, default, custom, user defined list} | Add this flag to allow mEdit to process base editors. [default = off] [1-] "off": disable base editor guide searching. [2-] "default": use generic ABE and CBE with an 'NGG' PAM and a 4-8 base editing window. [3-] "custom": select base editor search parameters. This requires a separate input of parameters: 'be_pam', 'be_pamISfirst', 'be_guidelen', 'be_win', 'target_base', 'result_base' [4-] "user defined list": comma-separated list chosen from the "medit list" output of base editors.
--guidelen | Endonuclease spacer length for a custom editor. [default = 20] ONLY/MUST be defined for a 'custom' editor.
--pamisfirst | Whether the PAM site is 5' of the target site. [default = False] Can ONLY be used for a 'custom' editor.
--pam | PAM sequence, given as a string of IUPAC codes. ONLY use for a 'custom' endonuclease.
--dsb_pos | Double-strand cut site relative to the PAM. This can be a single integer for a blunt-end endonuclease, or two integers separated by a single comma for an endonuclease that produces staggered-end cuts. For example, spCas9 would be "-3" and Cas12 would be "18,22". ONLY use for a 'custom' endonuclease.
--edit_win | Two positive integers separated by a comma that represent the base editing window. The numbering begins at the 5'-most end. Example: the CBE window is "4,8". ONLY use for a 'custom' base editor.
--target_base | ("A", "T", "C", "G") A single base that the custom base editor will target. Example: the ABE target base is "A". ONLY use for a 'custom' base editor.
--result_base | ("A", "T", "C", "G") A single base that the custom base editor changes the target to. Example: the ABE result base is "G". ONLY use for a 'custom' base editor.
--cutdist | Maximum allowable distance between a variant start position and the editor cut site. This option is not available for base editors. [default = 7] ONLY use for a 'custom' endonuclease.
--dry | Perform a dry run of mEdit.
SLURM Options
-p PARALLEL_PROCESSES | Most processes in mEdit can be submitted to SLURM. When submitting mEdit jobs to SLURM, the user can specify the number of parallel processes that will be sent to the server. [default = 1]
--ncores NCORES | Specify the number of cores through which each parallel process will be computed. [default = 2]
--maxtime MAXTIME | Specify the maximum amount of time allowed for each parallel job. Format example: 2 hours -> "2:00:00" [default = 1 hour]
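
The string formats accepted by the custom editor options ("-3" or "18,22" for the cut site, "4,8" for the editing window) can be illustrated with a short parser. This is a sketch under stated assumptions, not mEdit's code: the window is treated as 1-based and inclusive, which the description above implies but does not state, and all function names are hypothetical.

```python
def parse_dsb_pos(value):
    """Parse '-3' (blunt cut) or '18,22' (staggered cut) into a tuple of ints."""
    parts = tuple(int(p) for p in value.split(","))
    if len(parts) not in (1, 2):
        raise ValueError("dsb_pos takes one integer or two separated by a comma")
    return parts


def parse_edit_win(value):
    """Parse an editing window such as '4,8' into (start, end)."""
    parts = value.split(",")
    if len(parts) != 2:
        raise ValueError("edit_win takes two integers separated by a comma")
    start, end = int(parts[0]), int(parts[1])
    if not 0 < start <= end:
        raise ValueError("edit_win needs positive integers with start <= end")
    return start, end


def editable_positions(guide, edit_win, target_base):
    """Return 1-based positions (from the 5' end) where target_base lies in the window."""
    start, end = edit_win  # assumption: both ends are inclusive
    return [
        pos
        for pos, base in enumerate(guide.upper(), start=1)
        if start <= pos <= end and base == target_base
    ]


print(parse_dsb_pos("-3"))     # (-3,)
print(parse_dsb_pos("18,22"))  # (18, 22)
window = parse_edit_win("4,8")
print(editable_positions("ACGTACGTACGTACGTACGT", window, "A"))  # [5]
```

Returning a tuple for both the blunt and the staggered case keeps downstream code uniform: a length of one means a blunt cut, a length of two means an overhang.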

4. Off-target Prediction

Args | Description
--dry | Perform a dry run of mEdit.
Input/Output
-mm MISMATCH | Max number of mismatches to search for. [default: 3]
-rb RNA_BULGE | Max number of RNA bulges to search for. [default: 0]
-db DNA_BULGE | Max number of DNA bulges to search for. [default: 0]
--csp --cut_site_position | The DSB position of a custom editor. This position can be a range when using an overhang editor or a single position when using a blunt-end editor.
-o OUTPUT | Path to the root directory where mEdit guide_prediction outputs were stored. "medit offtarget" can't operate if this path is incorrect. [default: mEdit_analysis_<jobtag>/]
-d DB_PATH | Provide the path where the "mEdit_database" directory was created ahead of the analysis using the "db_set" program. [default: ./mEdit_database]
-j JOBTAG | Provide the tag associated with the current mEdit job. mEdit will generate a random jobtag by default.
SLURM Options
-p PARALLEL_PROCESSES | Most processes in mEdit can be submitted to SLURM. When submitting mEdit jobs to SLURM, the user can specify the number of parallel processes that will be sent to the server. [default = 1]
--ncores NCORES | Specify the number of cores through which each parallel process will be computed. [default = 2]
--maxtime MAXTIME | Specify the maximum amount of time allowed for each parallel job. Format example: 2 hours -> "2:00:00" [default = 1 hour]
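
To make the mismatch threshold concrete, the sketch below counts substitutions between a guide and every same-length window of a sequence and keeps windows at or under the limit. It is a simplified illustration and not how mEdit searches: RNA and DNA bulges are not modelled, the PAM is ignored, and the function names are hypothetical.

```python
def count_mismatches(guide, site):
    """Number of positions at which two equal-length sequences differ."""
    if len(guide) != len(site):
        raise ValueError("guide and site must have the same length")
    return sum(1 for g, s in zip(guide.upper(), site.upper()) if g != s)


def candidate_sites(guide, seq, max_mismatch=3):
    """Return (position, site, mismatches) for windows within max_mismatch of the guide."""
    hits = []
    for i in range(len(seq) - len(guide) + 1):
        site = seq[i:i + len(guide)]
        mismatches = count_mismatches(guide, site)
        if mismatches <= max_mismatch:
            hits.append((i, site, mismatches))
    return hits


# Short toy sequences; real guides are around 20 nt.
print(candidate_sites("ACGTAC", "ACGTACTTACGAAC", max_mismatch=1))
# [(0, 'ACGTAC', 0), (4, 'ACTTAC', 1), (8, 'ACGAAC', 1)]
```

The hit with zero mismatches is the on-target site itself. The remaining hits are the candidates that an off-target report would need to score.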

License

Copyright ©20xx [see Other Notes, below]. The Regents of the University of California (Regents). All Rights Reserved. Permission to use, copy, modify, and distribute this software and its documentation for educational, research, and not-for-profit purposes, without fee and without a signed licensing agreement, is hereby granted, provided that the above copyright notice, this paragraph and the following two paragraphs appear in all copies, modifications, and distributions. Contact The Office of Technology Licensing, UC Berkeley, 2150 Shattuck Avenue, Suite 408, Berkeley, CA 94704-1362, otl@berkeley.edu, for commercial licensing opportunities.

[Optional: Created by John Smith and Mary Doe, Department of Statistics, University of California, Berkeley.]

IN NO EVENT SHALL REGENTS BE LIABLE TO ANY PARTY FOR DIRECT, INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES, INCLUDING LOST PROFITS, ARISING OUT OF THE USE OF THIS SOFTWARE AND ITS DOCUMENTATION, EVEN IF REGENTS HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

REGENTS SPECIFICALLY DISCLAIMS ANY WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE SOFTWARE AND ACCOMPANYING DOCUMENTATION, IF ANY, PROVIDED HEREUNDER IS PROVIDED "AS IS". REGENTS HAS NO OBLIGATION TO PROVIDE MAINTENANCE, SUPPORT, UPDATES, ENHANCEMENTS, OR MODIFICATIONS.

FAQ

Cite us

Contact
