Skip to main content

CodePUB

Project description

CodePUB

CodePUB (Code: precise, unique, balanced) is a Python tool developed for construction of balanced constant-weight Gray codes for detecting consecutive positives (DCP-CWGCs) for the efficient construction of combinatorial pooling schemes.

Cite as

Requirements

  • numpy >= 1.23.5
         pip install "numpy>=1.23.5"
    

How to use

Please read documentation for the package at codepub.readthedocs.

Description

To minimize complexity and reduce costs of high-throughput experiments, combinatorial pooling strategies have been developed. These experiments use an encoding scheme where items are mixed across multiple pools, with each pool containing multiple items. Measurements are performed at the pool level, producing a unique pattern of signals for each item, which is then decoded to determine individual item measurements. With fewer pools than items, this approach improves efficiency compared to individual testing. Carefully designed encoding/decoding schemes can also detect errors, leveraging the presence of items in multiple pools.

Combinatorial Pooling Illustration


Balanced Constant-Weight Gray Codes

We introduce balanced constant-weight Gray codes for detecting consecutive positives (DCP-CWGCs) as a novel combinatorial pooling strategy. Items are assigned to pools using a sequence of binary vectors (binary addresses). Addresses have constant Hamming weight, adjacent addresses differ by a Hamming distance of 2, and their OR-sums are unique. These properties enable error detection, ensure constant testing for individual items and consecutive item pairs, and maintain stable detection by balancing the number of items per pool.

To construct balanced DCP-CWGCs with adjustable parameters, we developed a branch-and-bound algorithm (BBA). This algorithm efficiently constructs DCP-CWGCs with a near-optimal balance for short to moderate code lengths (e.g., up to 3,000 items in under 250 seconds). It uses a depth-first heuristic to search for a Hamiltonian path in a bipartite graph formed by item addresses and their unions. To extend applicability and speed, an enhanced recursive branch-and-bound algorithm (rcBBA) constructs long codes by combining shorter, BBA-generated ones. Both methods are implemented in codePub.


Definition of DCP-CWGCs

A DCP-CWGC, denoted as ( (m, r, n) ), is a sequence of ( n ) distinct binary addresses, ( C = {a_1, a_2, \dots, a_n} ), where each address ( a_j = (a_{j,1}, a_{j,2}, \dots, a_{j,m}) ) satisfies the following constraints:

  1. Distinct OR-sums: For all ( j, k \in {1, 2, \dots, n-1} ) and ( j \neq k ):

    [ a_j \vee a_{j+1} \neq a_k \vee a_{k+1}, ]

    where ( \vee ) is the bitwise OR operation.

  2. Constant weight: For all ( j \in {1, 2, \dots, n} ):

    [ \sum_{i=1}^m a_{j,i} = r. ]

  3. Adjacent distance: For all ( j \in {1, 2, \dots, n-1} ):

    [ D_H(a_j, a_{j+1}) = 2, ]

    where ( D_H ) represents the Hamming distance.


Mapping and Constraints

This design maps directly to a pooling arrangement with ( m ) pools, weight ( r ), and ( n ) items.

  • Constraint (1) ensures unique identification of consecutive item pairs.
  • Constraint (2) ensures consistent testing for each item.
  • Together, Constraints (2) and (3) maintain a constant OR-sum weight (( r+1 )) for consecutive addresses, resulting in consistent testing for consecutive item pairs.

Consequently, the scheme can detect single or consecutive positives and identify at least one error using the count of positive pools.


BBA and rcBBA

The details on both algorithms are described in our preprint: "Unbiased and Error-Detecting Combinatorial Pooling Experiments with Balanced Constant-Weight Gray Codes for Consecutive Positives Detection".

Authors

Authors Guanchen He, Qin Huang
Affiliation Beihang University
----------- ------------------------------------------------------
Authors Vasilisa Kovaleva, Hannah Meyer
----------- ------------------------------------------------------
Affiliation Cold Spring Harbor Laboratory
----------- ------------------------------------------------------
Authors Mikhail Pogorelyy, Paul Thomas
----------- ------------------------------------------------------
Affiliation St. Jude Research Hospital
----------- ------------------------------------------------------
Authors Carl Barton
----------- ------------------------------------------------------
Affiliation Birkbeck, University of London
Date 09/02/2024

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

codepub-2.0.tar.gz (8.7 MB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

codepub-2.0-py3-none-any.whl (1.6 MB view details)

Uploaded Python 3

File details

Details for the file codepub-2.0.tar.gz.

File metadata

  • Download URL: codepub-2.0.tar.gz
  • Upload date:
  • Size: 8.7 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.10.11

File hashes

Hashes for codepub-2.0.tar.gz
Algorithm Hash digest
SHA256 992a21b087d844d8657431297fb99902506e7bee9ef4e5abfc780df307f0afe1
MD5 11d96909c10d7ba21b7e4b7bb1fe2fd6
BLAKE2b-256 88548f07d94d648f2caa2f44d07348c311eaf654d6cf869b0bb7e0d20b694fff

See more details on using hashes here.

File details

Details for the file codepub-2.0-py3-none-any.whl.

File metadata

  • Download URL: codepub-2.0-py3-none-any.whl
  • Upload date:
  • Size: 1.6 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.10.11

File hashes

Hashes for codepub-2.0-py3-none-any.whl
Algorithm Hash digest
SHA256 59a3e017dcdb2fd99f43cfc720326238633861c5dda8c1b86b968b710187112a
MD5 600125155044ace95f7c8ae113b1a835
BLAKE2b-256 d8f33d4c2b56f4a923f2d76f2eb5f34dfb2181156fb5549eacaf7f3fc177427e

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page