ngs-analysis

ngs-analysis is intended for analyzing sequencing reads that span multiple DNA or protein parts. For instance, given a library of protein variants linked to DNA barcodes, it can answer questions like:

- How accurate are the variant sequences, at the DNA or protein level?
- How frequently is the same barcode linked to two different variants?
- Which reads contain parts required for function (e.g., a Kozak start sequence, or a fused protein tag)?
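
As an illustration of the second question, here is a minimal sketch in plain Python. It does not use the ngs-analysis API; the function name `barcode_collisions` and the example (barcode, variant) pairs are hypothetical, and it assumes each read has already been parsed into one barcode and one variant.

```python
from collections import defaultdict

def barcode_collisions(pairs):
    """Return {barcode: set of variants} for barcodes seen with more than one variant."""
    variants_by_barcode = defaultdict(set)
    for barcode, variant in pairs:
        variants_by_barcode[barcode].add(variant)
    return {bc: vs for bc, vs in variants_by_barcode.items() if len(vs) > 1}

# Hypothetical (barcode, variant) pairs, one per parsed read.
reads = [
    ("ACGTAC", "MKTAYIAK"),
    ("ACGTAC", "MKTAYIAK"),
    ("ACGTAC", "MKTAYIAR"),  # same barcode, different variant
    ("TTGCAA", "MSLLTEVE"),
]

collisions = barcode_collisions(reads)
n_barcodes = len({bc for bc, _ in reads})
# Fraction of distinct barcodes linked to two or more variants.
collision_rate = len(collisions) / n_barcodes
```

In real data, sequencing errors can make one variant look like two, so a count like this is usually more meaningful after reads have been mapped to the reference.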

This kind of analysis often involves parsing raw sequencing reads for DNA and/or protein sub-sequences (parts), then mapping the parts to a reference of anticipated part combinations. This package offers a simple workflow:

1. Define how to parse reads into parts using plain text expressions (no code)
2. Test the parser on simulated DNA sequences (e.g., your vector map)
3. Parse a batch of sequencing samples
4. Map the (combination of) parts found in each read to your reference
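
The parse-then-map idea behind these steps can be sketched in a few lines of Python. This is only a conceptual illustration, not the package's expression syntax or API: the read layout, the flanking sequences, and the names `READ_PATTERN`, `parse_read`, and `map_parts` are all assumptions, and the exact dictionary lookup stands in for the mismatch-tolerant mapping a tool like MMseqs2 provides.

```python
import re

# Hypothetical read layout: constant flank, 6-nt barcode, constant linker,
# variable-length variant, constant end. Named groups label the parts.
READ_PATTERN = re.compile(
    r"GGATCC(?P<barcode>[ACGT]{6})GAATTC(?P<variant>[ACGT]+?)AAGCTT"
)

def parse_read(read):
    """Return a dict of parts found in the read, or None if the layout is absent."""
    match = READ_PATTERN.search(read)
    return match.groupdict() if match else None

def map_parts(parts, reference):
    """Look up a (barcode, variant) combination in the reference; exact match only."""
    if parts is None:
        return None
    return reference.get((parts["barcode"], parts["variant"]))

# Hypothetical reference of anticipated part combinations.
reference = {("ACGTAC", "ATGAAAACC"): "variant_1"}

# Simulated read, e.g. a stretch copied from a vector map.
simulated = "TTGGATCCACGTACGAATTCATGAAAACCAAGCTTGG"
parts = parse_read(simulated)
name = map_parts(parts, reference)
```

Testing the pattern on a simulated sequence first, as in step 2, catches layout mistakes before a full batch of samples is parsed.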

It’s been tested with Illumina paired-end reads and Oxford Nanopore long reads. Under the hood it uses NGmerge to merge paired reads and MMseqs2 for sequence mapping. It is moderately performant: 1 million paired-end reads can be mapped to a reference of 100,000 variant-barcode pairs in ~1 minute.

Installation

pip install ngs-analysis

Tested on Linux and macOS (Apple Silicon).
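
To confirm that the install succeeded, one option is to query the installed distribution's version with the standard library. This is a small sketch that relies only on `importlib.metadata` (Python 3.8+); the helper name `installed_version` is hypothetical and nothing here calls into the package itself.

```python
from importlib.metadata import PackageNotFoundError, version

def installed_version(dist_name):
    """Return the installed version string for a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None

# Prints the version string if ngs-analysis is installed, otherwise None.
print(installed_version("ngs-analysis"))
```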

Release history

0.0.6: Feb 9, 2024
0.0.5: Feb 6, 2024
0.0.4: Jan 24, 2024
0.0.3: Dec 29, 2023 (this version)
0.0.2: Dec 29, 2023
0.0.1: Dec 29, 2023

Download files

Source Distribution

ngs-analysis-0.0.3.tar.gz (22.8 kB), uploaded Dec 29, 2023

Built Distribution

ngs_analysis-0.0.3-py3-none-any.whl (23.0 kB, Python 3), uploaded Dec 29, 2023

Hashes for ngs-analysis-0.0.3.tar.gz

Algorithm	Hash digest
SHA256	`33a5016d21e8fb6a06bcd61fbb084f517dfb5c86dd2bd61687b0c18de1c2ea9e`
MD5	`5d29778ea942505ba37fe0f7625597a1`
BLAKE2b-256	`1250b110012e295355855615a2773e888cc865129dc710a30ac17f10b8b8b0fa`

Hashes for ngs_analysis-0.0.3-py3-none-any.whl

Algorithm	Hash digest
SHA256	`31811244ed69a3ad38b2d1acbb67d3b65bc706d612d32cb4dd6fda27a64ddf02`
MD5	`1cc23225effac9d15e020b1b69df5dc7`
BLAKE2b-256	`260717e195265817c38a0655b485cc427346db2c928183c5deac14fe29fb2b20`