Skip to main content

CodePUB

Project description

CodePUB

CodePUB (Code: precise, unique, balanced) is a Python tool developed for construction of balanced constant-weight Gray codes for detecting consecutive positives (DCP-CWGCs) for the efficient construction of combinatorial pooling schemes.

Cite as

Requirements

  • numpy >= 1.23.5
         pip install "numpy>=1.23.5"
    

How to use

Please read documentation for the package at codepub.readthedocs.

Description

To minimize complexity and reduce costs of high-throughput experiments, combinatorial pooling strategies have been developed. These experiments use an encoding scheme where items are mixed across multiple pools, with each pool containing multiple items. Measurements are performed at the pool level, producing a unique pattern of signals for each item, which is then decoded to determine individual item measurements. With fewer pools than items, this approach improves efficiency compared to individual testing. Carefully designed encoding/decoding schemes can also detect errors, leveraging the presence of items in multiple pools.

Combinatorial Pooling Illustration


Balanced Constant-Weight Gray Codes

We introduce balanced constant-weight Gray codes for detecting consecutive positives (DCP-CWGCs) as a novel combinatorial pooling strategy. Items are assigned to pools using a sequence of binary vectors (binary addresses). Addresses have constant Hamming weight, adjacent addresses differ by a Hamming distance of 2, and their OR-sums are unique. These properties enable error detection, ensure constant testing for individual items and consecutive item pairs, and maintain stable detection by balancing the number of items per pool.

To construct balanced DCP-CWGCs with adjustable parameters, we developed a branch-and-bound algorithm (BBA). This algorithm efficiently constructs DCP-CWGCs with a near-optimal balance for short to moderate code lengths (e.g., up to 3,000 items in under 250 seconds). It uses a depth-first heuristic to search for a Hamiltonian path in a bipartite graph formed by item addresses and their unions. To extend applicability and speed, an enhanced recursive branch-and-bound algorithm (rcBBA) constructs long codes by combining shorter, BBA-generated ones. Both methods are implemented in codePub.


Definition of DCP-CWGCs

A DCP-CWGC, denoted as (m, r, n), is a sequence of n distinct binary addresses, C = {a_1, a_2, ..., a_n}, where each address a_j = (a_{j,1}, a_{j,2}, ..., a_{j,m}) satisfies the following constraints:

  1. Distinct OR-sums:
    For all j, k ∈ {1, 2, ..., n-1} and j ≠ k:

    a_j ∨ a_{j+1} ≠ a_k ∨ a_{k+1}

    where is the bitwise OR operation.

  2. Constant weight:
    For all j ∈ {1, 2, ..., n}:

    ∑{i=1}^m a{j,i} = r

  3. Adjacent distance:
    For all j ∈ {1, 2, ..., n-1}:

    D_H(a_j, a_{j+1}) = 2

    where D_H represents the Hamming distance.


Mapping and Constraints

This design maps directly to a pooling arrangement with ( m ) pools, weight ( r ), and ( n ) items.

  • Constraint (1) ensures unique identification of consecutive item pairs.
  • Constraint (2) ensures consistent testing for each item.
  • Together, Constraints (2) and (3) maintain a constant OR-sum weight (( r+1 )) for consecutive addresses, resulting in consistent testing for consecutive item pairs.

Consequently, the scheme can detect single or consecutive positives and identify at least one error using the count of positive pools.


BBA and rcBBA

The details on both algorithms are described in our preprint: "Unbiased and Error-Detecting Combinatorial Pooling Experiments with Balanced Constant-Weight Gray Codes for Consecutive Positives Detection".

Authors

Authors Guanchen He, Qin Huang
Affiliation Beihang University
----------- ------------------------------------------------------
Authors Vasilisa Kovaleva, Hannah Meyer
----------- ------------------------------------------------------
Affiliation Cold Spring Harbor Laboratory
----------- ------------------------------------------------------
Authors Mikhail Pogorelyy, Paul Thomas
----------- ------------------------------------------------------
Affiliation St. Jude Research Hospital
----------- ------------------------------------------------------
Authors Carl Barton
----------- ------------------------------------------------------
Affiliation Birkbeck, University of London
Date 09/02/2024

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

codepub-2.3.tar.gz (10.5 MB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

codepub-2.3-py3-none-any.whl (1.6 MB view details)

Uploaded Python 3

File details

Details for the file codepub-2.3.tar.gz.

File metadata

  • Download URL: codepub-2.3.tar.gz
  • Upload date:
  • Size: 10.5 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.10.9

File hashes

Hashes for codepub-2.3.tar.gz
Algorithm Hash digest
SHA256 5be18e882666963014e5f8c0b9c95b82924ae8e6f4788ce7b978d72614fa0e54
MD5 444d252f008ea47581f2d1f2074d349d
BLAKE2b-256 a19b5f3516229add335ca1c42b47ca166a5714d4f5b913debe309c2db80289cb

See more details on using hashes here.

File details

Details for the file codepub-2.3-py3-none-any.whl.

File metadata

  • Download URL: codepub-2.3-py3-none-any.whl
  • Upload date:
  • Size: 1.6 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.10.9

File hashes

Hashes for codepub-2.3-py3-none-any.whl
Algorithm Hash digest
SHA256 bfd4e88d7991be9ee91ae2ec2daa2e908cb3e185ee6a0ccf8353c6b9b6ad32bf
MD5 5bc55966a5fbd58e00cdd0f5948ffb4b
BLAKE2b-256 8c597e50604329b1d4072615f98721d7e66d2608f593b52777707faba20b1dd9

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page