Efficient subroutines for computing summary statistics for the SAM FLAG field

Project description

Given a stream of k-bit words, we seek to sum the bit values at indexes 0, 1, 2, …, k-1 across multiple words by computing k distinct sums. If the k-bit words are one-hot encoded, then the sums correspond to their frequencies.

This multiple-sum problem is a generalization of the population-count problem where we count the total number of set bits in independent machine words. We refer to this new problem as the positional population-count problem.
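To make the definition concrete, here is a minimal scalar sketch in plain Python. It is a reference illustration only, not the SIMD algorithm described in this project, and the function names `positional_popcount` and `popcount_total` are hypothetical names chosen for this example.

```python
def positional_popcount(words, k=16):
    """Return k sums; counts[i] is the number of words with bit i set."""
    counts = [0] * k
    for word in words:
        for i in range(k):
            # Extract bit i of the word and add it to the i-th sum.
            counts[i] += (word >> i) & 1
    return counts


def popcount_total(words):
    """Ordinary population count: total number of set bits over all words."""
    return sum(bin(word).count("1") for word in words)


# Three 4-bit words; bit 0 is the least significant bit.
words = [0b0101, 0b0001, 0b0100]
counts = positional_popcount(words, k=4)

# One-hot encoded words: each word has exactly one set bit, so the
# per-position sums are the frequencies of each encoded value.
one_hot_counts = positional_popcount([0b0001, 0b0100, 0b0001], k=4)
```

The k positional sums always add up to the ordinary population count of the same words, which is the sense in which the positional problem generalizes it.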

Using SIMD (Single Instruction, Multiple Data) instructions from recent Intel processors, we describe algorithms for computing the 16-bit positional population count using about one eighth (0.125) of a CPU cycle per 16-bit word. Measured in CPU cycles, our best approach is about 140 times faster than competitive code that uses only non-SIMD instructions.

This package applies the efficient positional population-count operator to computing summary statistics for the SAM FLAG field.
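As an illustration of what such summary statistics look like, the sketch below counts how many records set each of the 12 bits defined for the SAM FLAG field. It is a plain-Python stand-in and does not use or represent the pyflagstats API. The bit order follows the SAM specification (0x1 through 0x800), while the short labels in `SAM_FLAG_NAMES` and the function `flag_summary` are hypothetical names chosen for this example.

```python
# Short labels for SAM FLAG bits 0x1, 0x2, 0x4, ..., 0x800 (assumed names).
SAM_FLAG_NAMES = [
    "paired", "proper_pair", "unmapped", "mate_unmapped",
    "reverse", "mate_reverse", "read1", "read2",
    "secondary", "qc_fail", "duplicate", "supplementary",
]


def flag_summary(flags):
    """Count, for each SAM FLAG bit, how many records have that bit set."""
    counts = [0] * len(SAM_FLAG_NAMES)
    for flag in flags:
        for i in range(len(SAM_FLAG_NAMES)):
            counts[i] += (flag >> i) & 1
    return dict(zip(SAM_FLAG_NAMES, counts))


# FLAG values from four made-up records:
#   99   = paired, proper pair, mate reverse, first in pair
#   147  = paired, proper pair, reverse, second in pair
#   4    = unmapped
#   1040 = reverse, duplicate
summary = flag_summary([99, 147, 4, 1040])
```

Each entry of `summary` is one positional sum, so a single pass over the FLAG column yields all per-flag counts at once.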

Download files


Source Distribution

pyflagstats-0.1.0.tar.gz (28.4 kB)

