samsum

A light-weight python package for summarizing sequence coverage from SAM and BAM files

Installation

samsum is currently supported on macOS and Linux and has been tested primarily on Ubuntu (the bionic and trusty releases). It is available from the Python Package Index (PyPI) and can be installed using pip:

pip install samsum

Samsum can also be installed using conda with the command:

conda install -c bioconda samsum

You can also install samsum from source, either by cloning the repository from its GitHub page or by downloading a GitHub release.

git clone https://github.com/hallamlab/samsum.git
cd samsum
python3 setup.py sdist
pip install dist/samsum*tar.gz

Usage

samsum stats reads a SAM file (BAM support will be implemented soon) and rapidly counts the number of reads mapped to each reference sequence (e.g. contigs, scaffolds), while also keeping track of the reads that remain unmapped. This all occurs within the C++ Python extension. It then reads the reference FASTA file to gather the length of each reference sequence. Combining the read counts and sequence lengths, it calculates:

fragments per kilobase per million (FPKM)
transcripts per million (TPM)
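
The two statistics can be illustrated with a minimal sketch that uses the standard textbook definitions of FPKM and TPM. This is not samsum's own code: the function name fpkm_and_tpm is hypothetical, and taking the total as the sum of the counts assigned to references is an assumption, so samsum's exact normalisation may differ.

```python
def fpkm_and_tpm(counts, lengths):
    """Return ({name: fpkm}, {name: tpm}) from fragment counts and lengths in bp."""
    total = sum(counts.values())  # assumption: total = fragments assigned to references
    # Fragments per kilobase of reference: the length-normalised rate.
    rate = {name: counts[name] / (lengths[name] / 1e3) for name in counts}
    rate_sum = sum(rate.values())
    # FPKM scales the rate by millions of fragments in the whole data set.
    fpkm = {n: (r / (total / 1e6) if total else 0.0) for n, r in rate.items()}
    # TPM rescales the rates so that they sum to one million.
    tpm = {n: (r / rate_sum * 1e6 if rate_sum else 0.0) for n, r in rate.items()}
    return fpkm, tpm


# Invented example values.
counts = {"contig_1": 100, "contig_2": 300}
lengths = {"contig_1": 1000, "contig_2": 3000}
fpkm, tpm = fpkm_and_tpm(counts, lengths)
```

In this invented example both contigs receive 100 fragments per kilobase, so they end up with the same TPM even though contig_2 has three times as many fragments.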

Command-line options

By default, reads with multiple identical alignments (i.e. a mapping quality of 0) are excluded from these calculations. Use the --multireads flag to include them. Counts can also be dropped for reference sequences that are only partially covered by alignments. The -p argument sets the minimum proportion of a reference sequence that must be covered for its read counts to be included in the output; for sequences below this threshold, all stats are set to 0.
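
To make the default multiread behaviour concrete, here is a rough pure-Python sketch of counting alignments per reference from SAM text. It is not the C++ extension samsum uses: the function name and the include_multireads parameter are hypothetical, and the sketch counts alignment records only, without handling secondary or supplementary alignments or read pairs.

```python
from collections import Counter


def count_mapped_reads(sam_lines, include_multireads=False):
    """Count alignments per reference name; also return the number unmapped."""
    per_ref = Counter()
    unmapped = 0
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):  # skip blank and header lines
            continue
        fields = line.rstrip("\n").split("\t")
        # SAM columns: QNAME, FLAG, RNAME, POS, MAPQ, ...
        flag, rname, mapq = int(fields[1]), fields[2], int(fields[4])
        if flag & 0x4:  # SAM FLAG bit 0x4 marks an unmapped read
            unmapped += 1
        elif mapq == 0 and not include_multireads:
            continue  # ambiguous placement, dropped by default
        else:
            per_ref[rname] += 1
    return per_ref, unmapped


# Invented records: one unique alignment, one with MAPQ 0, one unmapped read.
sam = [
    "@SQ\tSN:contig_1\tLN:1000",
    "r1\t0\tcontig_1\t1\t60\t50M\t*\t0\t0\t*\t*",
    "r2\t0\tcontig_1\t101\t0\t50M\t*\t0\t0\t*\t*",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",
]
default_counts, n_unmapped = count_mapped_reads(sam)
all_counts, _ = count_mapped_reads(sam, include_multireads=True)
```

With these records, the default call counts only r1 for contig_1, while passing include_multireads=True also counts r2.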

An example command is:

samsum stats -f ref.fasta -a alignments.sam --multireads -p 0.5 -o output_dir/samsum_table.tsv

This will include all alignments, regardless of their mapping quality, but only report statistics for reference sequences that were covered across at least 50% of their length.
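
The coverage threshold can be pictured with the following sketch, which merges alignment intervals on one reference and zeroes its statistics when too little of it is covered. Both function names are hypothetical and the interval representation (1-based, inclusive start and end positions) is an assumption; samsum's internal bookkeeping may work differently.

```python
def proportion_covered(intervals, ref_length):
    """Fraction of a reference covered by 1-based, inclusive (start, end) intervals."""
    covered = 0
    last_end = 0  # right-most position already counted
    for start, end in sorted(intervals):
        start = max(start, last_end + 1)  # skip bases counted by earlier intervals
        if end >= start:
            covered += end - start + 1
            last_end = end
    return covered / ref_length


def apply_min_coverage(stats, covered, min_proportion):
    """Zero every statistic when the covered proportion is below the threshold."""
    if covered < min_proportion:
        return {key: 0 for key in stats}
    return stats


# Invented alignments on a 500 bp reference; the first two overlap.
prop = proportion_covered([(1, 100), (51, 150), (401, 500)], ref_length=500)
kept = apply_min_coverage({"FPKM": 12.5, "TPM": 40.0}, prop, min_proportion=0.5)
dropped = apply_min_coverage({"FPKM": 12.5, "TPM": 40.0}, 0.4, min_proportion=0.5)
```

Here 250 of the 500 positions are covered, which meets a threshold of 0.5 exactly, so the statistics in kept are unchanged while those in dropped are all 0.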

API

Being a Python package, samsum can also be imported into Python code and used via its API.

The function most users will want is ref_sequence_abundances. For example:

from samsum import commands

sam = "/home/user/reads_to_genome.sam"  # alignments of reads to the reference
fasta = "/home/user/genome.fasta"  # reference sequences
ref_seq_abunds = commands.ref_sequence_abundances(aln_file=sam, seq_file=fasta, min_aln=10, p_cov=0, map_qual=0)

The ref_seq_abunds object is a dictionary of RefSequence instances indexed by their header/sequence names. RefSequence objects have several attributes of interest:

self.name is the name of the (reference) sequence or header
self.length is the length (in base-pairs) of the sequence
self.reads_mapped is the number of reads that were mapped
self.weight_total is the number of fragments (float) that were mapped to the sequence
self.fpkm is Fragments Per Kilobase per Million mapped reads
self.tpm is Transcripts Per Million mapped reads
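
As an illustration of working with these attributes, the sketch below ranks entries by TPM. The helper top_by_tpm is hypothetical, and the demo dictionary uses stand-in objects with invented values so the snippet runs without samsum; with samsum installed, the dictionary returned by ref_sequence_abundances would presumably be passed in its place.

```python
from types import SimpleNamespace


def top_by_tpm(ref_seq_abunds, n=5):
    """Return the n entries with the highest TPM, best first."""
    ranked = sorted(ref_seq_abunds.values(), key=lambda ref: ref.tpm, reverse=True)
    return ranked[:n]


# Stand-in objects that only mimic the attribute names listed above.
demo = {
    "contig_1": SimpleNamespace(name="contig_1", length=1000, reads_mapped=10,
                                weight_total=10.0, fpkm=1.0, tpm=200000.0),
    "contig_2": SimpleNamespace(name="contig_2", length=2000, reads_mapped=80,
                                weight_total=80.0, fpkm=4.0, tpm=800000.0),
}
best = top_by_tpm(demo, n=1)
for ref in best:
    print(f"{ref.name}\t{ref.length}\t{ref.reads_mapped}\t{ref.tpm:.1f}")
```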

Outputs

When samsum stats is run, a log file named "samsum_log.txt" is written to the current working directory (i.e. where samsum was executed from). A comma-separated values (CSV) file with the fields "QueryName", "RefSequence", "ProportionCovered", "Coverage", "Fragments", "FPKM" and "TPM" is written to the file path specified on the command line, or to "samsum_table.csv" by default. A TSV file is written instead if the sep argument is set to 'tab'.
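
The output table can be loaded for downstream analysis along these lines. This is a sketch that assumes the file starts with a header row carrying the field names above and that the last five fields are numeric; read_samsum_table is a hypothetical helper, and the example rows are invented.

```python
import csv
import io

NUMERIC_FIELDS = ("ProportionCovered", "Coverage", "Fragments", "FPKM", "TPM")


def read_samsum_table(handle, sep=","):
    """Parse a samsum table into a list of dicts with numeric fields as floats."""
    rows = []
    for row in csv.DictReader(handle, delimiter=sep):
        for field in NUMERIC_FIELDS:
            row[field] = float(row[field])
        rows.append(row)
    return rows


# Invented rows; only the header names come from the description above.
example = io.StringIO(
    "QueryName,RefSequence,ProportionCovered,Coverage,Fragments,FPKM,TPM\n"
    "sample,contig_1,0.9,12.5,100,250000.0,500000.0\n"
    "sample,contig_2,0.0,0.0,0,0.0,0.0\n"
)
table = read_samsum_table(example)
# Keep only the reference sequences that received at least one fragment.
detected = [row["RefSequence"] for row in table if row["Fragments"] > 0]
```

For a tab-separated table, the same function could be called with sep="\t" on an opened file handle.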

Release history

0.1.4 (this version): Jan 5, 2021
0.1.3: Jan 4, 2021
0.1.2: Mar 30, 2020
0.1.1: Mar 27, 2020
0.1.0: Mar 24, 2020
0.0.8: Mar 21, 2020
