Tool to computationally deconvolve combinatorially pooled arrayed random mutagenesis libraries

arraylib-solve

Introduction

arraylib-solve is a tool for deconvolving combinatorially pooled, arrayed random mutagenesis libraries (e.g. generated by transposon mutagenesis). In a typical experiment, a pooled version of the library is created first and then arrayed on a grid of well plates. To infer the identity of each mutant on the grid, the wells are pooled in a combinatorial manner such that each mutant appears in a unique combination of pools. The pools are then sequenced using NGS, and the reads are stored in individual fastq files per pool. arraylib-solve deconvolves the pools and returns summaries stating the identity and location of each mutant on the original well grid. The package is based on the approach described in [1].
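To make the pooling idea concrete, here is a minimal sketch of why a unique combination of pools identifies a well. It is not the package's implementation. The grid size and the helper `locate` are assumptions chosen for illustration, and the pool names are borrowed from the example design file further down.

```python
from itertools import product

# Hypothetical layout: a 2 x 2 grid of plates, each plate with 2 rows and 2 columns.
DIMENSIONS = {
    "platerows": ["platerow1", "platerow2"],
    "platecols": ["platecol1", "platecol2"],
    "rows": ["row1", "row2"],
    "columns": ["column1", "column2"],
}

# A well address is one pool from each pooling dimension.
wells = list(product(*DIMENSIONS.values()))

# Map each combination of pools back to the well it came from.
signatures = {frozenset(well): well for well in wells}

# Every well contributes to a different combination of pools.
assert len(signatures) == len(wells)


def locate(pools_with_reads):
    """Return the well whose pool combination matches, or None if there is no match."""
    return signatures.get(frozenset(pools_with_reads))


print(locate({"platerow2", "platecol1", "row1", "column2"}))
```

A mutant whose reads show up in exactly one pool per dimension maps to a single well. Real data is noisier (cross-contamination, mutants present in several wells), which is the harder case the approach in [1] addresses.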

Installation

To install arraylib-solve, first create a Python 3.8 environment, e.g. with conda:

conda create --name arraylib-env python=3.8
conda activate arraylib-env

and install the package using

pip install arraylib-solve

arraylib-solve uses bowtie2 [2] to align reads to the reference genome. Please ensure that bowtie2 is installed in your environment by running:

conda install -c bioconda bowtie2
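As an optional sanity check, the following sketch tests whether a `bowtie2` executable is visible on the current `PATH`. The helper name `find_executable` is made up for this example and is not part of arraylib-solve.

```python
import shutil


def find_executable(name="bowtie2"):
    """Return the full path of an executable on PATH, or None if it is missing."""
    return shutil.which(name)


path = find_executable()
if path is None:
    print("bowtie2 was not found on PATH; activate the environment or install it first")
else:
    print(f"bowtie2 found at {path}")
```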

How to run `arraylib-solve`

To run arraylib-solve on a library deconvolution experiment with default parameters, run:

arraylib-run <input_directory> <experimental_design.csv> -c <number_of_cpu_cores_to_use> -gb <path_to_genbank_reference> -br <path_to_bowtie2_indices> -t <transposon_sequence> -bu <upstream_sequence_of_barcodes> -bd <downstream_sequence_of_barcodes>
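For scripted runs, the command can be assembled as an argument list. The sketch below uses only the flags shown above. The directory and file names (`fastq_dir`, `exp_design.csv`, `reference.gb`) are placeholders, while the index path and sequences reuse the examples given in the parameter descriptions of this README.

```python
import shlex


def build_command(input_dir, exp_design, cores, genbank, bowtie_ref,
                  transposon, upstream, downstream):
    """Assemble the arraylib-run call as a list suitable for subprocess.run."""
    return [
        "arraylib-run", input_dir, exp_design,
        "-c", str(cores),
        "-gb", genbank,
        "-br", bowtie_ref,
        "-t", transposon,
        "-bu", upstream,
        "-bd", downstream,
    ]


cmd = build_command(
    "fastq_dir",            # placeholder input directory
    "exp_design.csv",       # placeholder experimental design file
    4,
    "reference.gb",         # placeholder GenBank file
    "bowtie_ref/UTI89",
    "AGATGTGTATAAGAGACAG",
    "CGAGGTCTCT",
    "CGTACGCTGC",
)

# Print the command for inspection; pass cmd to subprocess.run(cmd, check=True) to execute it.
print(shlex.join(cmd))
```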

Input parameters

Required parameters:

input_dir: path to directory holding the input fastq files
exp_design: path to a csv file describing the experimental design (values should be separated by commas). The experimental design file should have the columns Filename, Poolname and Pooldimension (see the example in tests/test_data/full_exp_design.csv).
- Filename should contain all the unique input fastq filenames.
- Poolname should indicate which pool a given file belongs to. Multiple files per Poolname are allowed.
- Pooldimension indicates the pooling dimension a pool belongs to. All pools sharing the same pooling dimension should have the same string in the Pooldimension column.

An example of what an exp_design file could look like:

Filename	Poolname	Pooldimension
column1.fastq	column1	columns
column2.fastq	column2	columns
row1.fastq	row1	rows
row2.fastq	row2	rows
platerow1.fastq	platerow1	platerows
platerow2.fastq	platerow2	platerows
platecol1.fastq	platecol1	platecols
platecol2.fastq	platecol2	platecols
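A design file like the one above can be generated rather than typed by hand. This is a minimal sketch using Python's standard `csv` module. It assumes one fastq file per pool named `<Poolname>.fastq`, which is a convention of this example and not a requirement of the tool.

```python
import csv
import io

# Pools grouped by pooling dimension, mirroring the example table.
POOLS = {
    "columns": ["column1", "column2"],
    "rows": ["row1", "row2"],
    "platerows": ["platerow1", "platerow2"],
    "platecols": ["platecol1", "platecol2"],
}


def exp_design_rows(pools):
    """Yield one design row per pool, assuming a single <Poolname>.fastq file each."""
    for dimension, names in pools.items():
        for name in names:
            yield {"Filename": f"{name}.fastq", "Poolname": name, "Pooldimension": dimension}


def write_exp_design(handle, pools):
    """Write a comma-separated design file with the three required columns."""
    writer = csv.DictWriter(handle, fieldnames=["Filename", "Poolname", "Pooldimension"])
    writer.writeheader()
    writer.writerows(exp_design_rows(pools))


buffer = io.StringIO()
write_exp_design(buffer, POOLS)
print(buffer.getvalue())
```

To write to disk instead, open the target with `open("exp_design.csv", "w", newline="")` and pass that handle to `write_exp_design`.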

-gb path to genbank reference file
-br path to the bowtie2 index files, ending with the basename of your index (e.g. if the basename of your index is UTI89 and you store your bowtie2 references in bowtie_ref, it should be bowtie_ref/UTI89). See https://bowtie-bio.sourceforge.net/bowtie2/manual.shtml#the-bowtie2-build-indexer for instructions on creating bowtie2 indices.
-t transposon sequence (e.g. AGATGTGTATAAGAGACAG)
-bu upstream sequence of barcode (e.g. CGAGGTCTCT)
-bd downstream sequence of barcode (e.g. CGTACGCTGC)
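To illustrate what the upstream and downstream barcode sequences are for, the sketch below pulls out the bases between the two flanks of a read. This README does not describe how arraylib-solve parses barcodes internally, so treat this as an assumption-laden illustration with exact flank matching. `extract_barcode` and the example read are invented for the demonstration.

```python
import re


def extract_barcode(read, upstream="CGAGGTCTCT", downstream="CGTACGCTGC"):
    """Return the bases between the two flanking sequences, or None if not found."""
    pattern = re.escape(upstream) + r"([ACGTN]+?)" + re.escape(downstream)
    match = re.search(pattern, read)
    return match.group(1) if match else None


# Synthetic read: prefix + upstream flank + barcode + downstream flank + suffix.
read = "TTAC" + "CGAGGTCTCT" + "TTGACCATGA" + "CGTACGCTGC" + "GGA"
print(extract_barcode(read))
```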

Optional parameters:

-mq minimum bowtie2 alignment quality score (MAPQ) for a read to be included
-sq minimum Phred score required of each base for a read to be included
-tm number of transposon mismatches allowed
-thr threshold for the local filter (e.g. with a threshold of 0.05, read counts below 0.05 times the maximum read count for a given mutant are filtered out)
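The local filter threshold can be read as a fraction of a mutant's largest read count. The sketch below shows one possible reading of that description, applied to the per-pool counts of a single mutant. It is not taken from the package source, and `local_filter` and the counts are hypothetical.

```python
def local_filter(counts, threshold=0.05):
    """Set counts below threshold * max(counts) to zero.

    counts maps a pool name to the read count of one mutant in that pool.
    """
    peak = max(counts.values(), default=0)
    cutoff = threshold * peak
    return {pool: (n if n >= cutoff else 0) for pool, n in counts.items()}


# Hypothetical read counts for one mutant across four pools.
counts = {"row1": 200, "row2": 4, "column1": 180, "column2": 12}
filtered = local_filter(counts)
print(filtered)
```

With a maximum of 200 reads and a threshold of 0.05, the cutoff is 10 reads, so the 4 reads in `row2` are dropped while the other pools are kept.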

Output

arraylib-solve outputs 4 files:

count_matrix.csv: Read counts per pool for each mutant.
filtered_matrix.csv: Read counts per pool for each mutant, with mutants whose barcodes have low read counts at a given genomic location filtered out.
mutant_location_summary.csv: A summary of mutants found in the well plate grid, where each row corresponds to a different mutant.
well_location_summary.csv: A summary of the deconvolved well plate grid, where each row corresponds to a different well.

References

[1] Baym, M., Shaket, L., Anzai, I.A., Adesina, O. and Barstow, B., 2016. Rapid construction of a whole-genome transposon insertion collection for Shewanella oneidensis by Knockout Sudoku. Nature communications, 7(1), p.13270.
[2] Langmead, B. and Salzberg, S.L., 2012. Fast gapped-read alignment with Bowtie 2. Nature methods, 9(4), pp.357-359.
