CodePUB
Project description
CodePUB
CodePUB (Code: precise, unique, balanced) is a Python tool developed for construction of balanced constant-weight Gray codes for detecting consecutive positives (DCP-CWGCs) for the efficient construction of combinatorial pooling schemes.
Cite as
Requirements
- numpy >= 1.23.5
pip install "numpy>=1.23.5"
How to use
Please read documentation for the package at codepub.readthedocs.
Description
To minimize complexity and reduce costs of high-throughput experiments, combinatorial pooling strategies have been developed. These experiments use an encoding scheme where items are mixed across multiple pools, with each pool containing multiple items. Measurements are performed at the pool level, producing a unique pattern of signals for each item, which is then decoded to determine individual item measurements. With fewer pools than items, this approach improves efficiency compared to individual testing. Carefully designed encoding/decoding schemes can also detect errors, leveraging the presence of items in multiple pools.
Balanced Constant-Weight Gray Codes
We introduce balanced constant-weight Gray codes for detecting consecutive positives (DCP-CWGCs) as a novel combinatorial pooling strategy. Items are assigned to pools using a sequence of binary vectors (binary addresses). Addresses have constant Hamming weight, adjacent addresses differ by a Hamming distance of 2, and their OR-sums are unique. These properties enable error detection, ensure constant testing for individual items and consecutive item pairs, and maintain stable detection by balancing the number of items per pool.
To construct balanced DCP-CWGCs with adjustable parameters, we developed a branch-and-bound algorithm (BBA). This algorithm efficiently constructs DCP-CWGCs with a near-optimal balance for short to moderate code lengths (e.g., up to 3,000 items in under 250 seconds). It uses a depth-first heuristic to search for a Hamiltonian path in a bipartite graph formed by item addresses and their unions. To extend applicability and speed, an enhanced recursive branch-and-bound algorithm (rcBBA) constructs long codes by combining shorter, BBA-generated ones. Both methods are implemented in codePub.
Definition of DCP-CWGCs
A DCP-CWGC, denoted as ( (m, r, n) ), is a sequence of ( n ) distinct binary addresses, ( C = {a_1, a_2, \dots, a_n} ), where each address ( a_j = (a_{j,1}, a_{j,2}, \dots, a_{j,m}) ) satisfies the following constraints:
-
Distinct OR-sums: For all ( j, k \in {1, 2, \dots, n-1} ) and ( j \neq k ):
[ a_j \vee a_{j+1} \neq a_k \vee a_{k+1}, ]
where ( \vee ) is the bitwise OR operation.
-
Constant weight: For all ( j \in {1, 2, \dots, n} ):
[ \sum_{i=1}^m a_{j,i} = r. ]
-
Adjacent distance: For all ( j \in {1, 2, \dots, n-1} ):
[ D_H(a_j, a_{j+1}) = 2, ]
where ( D_H ) represents the Hamming distance.
Mapping and Constraints
This design maps directly to a pooling arrangement with ( m ) pools, weight ( r ), and ( n ) items.
- Constraint (1) ensures unique identification of consecutive item pairs.
- Constraint (2) ensures consistent testing for each item.
- Together, Constraints (2) and (3) maintain a constant OR-sum weight (( r+1 )) for consecutive addresses, resulting in consistent testing for consecutive item pairs.
Consequently, the scheme can detect single or consecutive positives and identify at least one error using the count of positive pools.
BBA and rcBBA
The details on both algorithms are described in our preprint: "Unbiased and Error-Detecting Combinatorial Pooling Experiments with Balanced Constant-Weight Gray Codes for Consecutive Positives Detection".
Authors
| Authors | Guanchen He, Qin Huang |
|---|---|
| Affiliation | Beihang University |
| ----------- | ------------------------------------------------------ |
| Authors | Vasilisa Kovaleva, Hannah Meyer |
| ----------- | ------------------------------------------------------ |
| Affiliation | Cold Spring Harbor Laboratory |
| ----------- | ------------------------------------------------------ |
| Authors | Mikhail Pogorelyy, Paul Thomas |
| ----------- | ------------------------------------------------------ |
| Affiliation | St. Jude Research Hospital |
| ----------- | ------------------------------------------------------ |
| Authors | Carl Barton |
| ----------- | ------------------------------------------------------ |
| Affiliation | Birkbeck, University of London |
| Date | 09/02/2024 |
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file codepub-2.0.tar.gz.
File metadata
- Download URL: codepub-2.0.tar.gz
- Upload date:
- Size: 8.7 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.10.11
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
992a21b087d844d8657431297fb99902506e7bee9ef4e5abfc780df307f0afe1
|
|
| MD5 |
11d96909c10d7ba21b7e4b7bb1fe2fd6
|
|
| BLAKE2b-256 |
88548f07d94d648f2caa2f44d07348c311eaf654d6cf869b0bb7e0d20b694fff
|
File details
Details for the file codepub-2.0-py3-none-any.whl.
File metadata
- Download URL: codepub-2.0-py3-none-any.whl
- Upload date:
- Size: 1.6 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.10.11
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
59a3e017dcdb2fd99f43cfc720326238633861c5dda8c1b86b968b710187112a
|
|
| MD5 |
600125155044ace95f7c8ae113b1a835
|
|
| BLAKE2b-256 |
d8f33d4c2b56f4a923f2d76f2eb5f34dfb2181156fb5549eacaf7f3fc177427e
|