primer-target-planner
Interval-based PCR target window planner.
Given a set of required genomic intervals (e.g. CDS exons) and PCR product size constraints, generate the minimal set of target windows that fully cover every required interval.
This is a pure algorithm library — no primer design, no external bioinformatics services.
This library uses 0-based half-open intervals: [start, end), with length = end - start. All coordinates in RequiredInterval, PlanningBounds, and TargetWindow follow this convention. For example, RequiredInterval("exon1", 1000, 1200) covers positions 1000–1199 inclusive (200 bp).
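The convention can be checked with a few lines of plain Python. This is a minimal sketch: HalfOpen is a hypothetical stand-in written for this illustration, not one of the library's classes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HalfOpen:
    """Hypothetical stand-in for a 0-based half-open interval [start, end)."""

    start: int  # inclusive
    end: int    # exclusive

    @property
    def length(self) -> int:
        # Half-open intervals need no "+ 1" correction.
        return self.end - self.start

    @property
    def last_position(self) -> int:
        # The last base actually contained in the interval.
        return self.end - 1


exon1 = HalfOpen(1000, 1200)
print(exon1.length)         # 200
print(exon1.last_position)  # 1199

# An interval that starts at exon1.end is adjacent to exon1, not overlapping it.
neighbour = HalfOpen(1200, 1300)
print(exon1.length + neighbour.length)  # 300
```

One practical benefit of this convention is that the lengths of adjacent intervals add up exactly to the length of the combined span.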
Install
pip install -e ".[dev]"
Quick start
from primer_target_planner import (
    plan_targets,
    PlannerConfig,
    PlanningBounds,
    RequiredInterval,
)

intervals = [
    RequiredInterval("exon1", 1000, 1200),  # 200 bp
    RequiredInterval("exon2", 1500, 1800),  # 300 bp
    RequiredInterval("exon3", 2200, 2500),  # 300 bp
]

config = PlannerConfig(
    product_min=600,
    product_max=1000,
    strand="+",
)

targets = plan_targets(intervals, config)

for t in targets:
    print(
        f"[{t.start}, {t.end}) len={t.length} mode={t.planning_mode} "
        f"covers={t.covered_ids} reason={t.reason}"
    )
Negative-strand example
On the negative strand the planner processes intervals from high genomic
coordinates (transcript 5') to low coordinates (transcript 3'). Input and
output coordinates are always genomic start < end — the strand only affects
planning order, not coordinate direction.
from primer_target_planner import (
    plan_targets,
    PlannerConfig,
    RequiredInterval,
)

# Four exons on the minus strand.
# Transcript order (5'→3'): exonD → exonC → exonB → exonA
# (high genomic coords → low genomic coords)
intervals = [
    RequiredInterval("exonA", 300, 400),
    RequiredInterval("exonB", 700, 800),
    RequiredInterval("exonC", 1100, 1200),
    RequiredInterval("exonD", 1500, 1600),
]

config = PlannerConfig(product_min=500, product_max=900, strand="-")
targets = plan_targets(intervals, config)

for t in targets:
    # start < end always: genomic coordinates, not transcript direction
    print(
        f"[{t.start}, {t.end}) len={t.length} mode={t.planning_mode} "
        f"covers={t.covered_ids}"
    )

# Possible output:
# [701, 1600) len=899 mode=product_max covers=['exonD', 'exonC']
# [300, 800) len=500 mode=product_max covers=['exonB', 'exonA']
API
RequiredInterval
| Field | Type | Description |
|---|---|---|
| id | str | Identifier (e.g. exon name) |
| start | int | Genomic start (0-based, inclusive) |
| end | int | Genomic end (exclusive) |
| metadata | dict \| None | Optional user metadata |
All coordinates are 0-based half-open [start, end).
length = end - start.
PlannerConfig
| Field | Type | Default | Description |
|---|---|---|---|
| product_min | int | (required) | Minimum PCR product length (bp) |
| product_max | int | (required) | Maximum PCR product length (bp) |
| strand | "+" \| "-" | (required) | Transcript strand |
| tile_overlap | int | 200 | Overlap (bp) between tiles for long spans |
| allow_overlap | bool | True | Allow adjacent targets to overlap |
PlanningBounds
| Field | Type | Description |
|---|---|---|
| start | int | Gene / transcript genomic start (inclusive) |
| end | int | Gene / transcript genomic end (exclusive) |
0-based half-open [start, end). length = end - start.
TargetWindow
| Field | Type | Description |
|---|---|---|
| start | int | Genomic start (inclusive) |
| end | int | Genomic end (exclusive) |
| length | int | end - start |
| covered_ids | list[str] | IDs of fully covered intervals |
| anchor_id | str | The interval that anchored this target |
| anchor_side | "5prime" \| "3prime" | Anchor side |
| planning_mode | str | One of product_min, product_max, single, terminal_reverse, tiled |
| reason | str | Human-readable explanation |
plan_targets(intervals, config, bounds=None) -> list[TargetWindow]
Main entry point.
- intervals: required intervals (any order; sorted internally).
- config: product-size and strand configuration.
- bounds: optional gene extent; enables terminal-reverse logic.
Returns target windows in transcript 5'→3' order.
Algorithm
Min-first / max-rescue planner
Processing proceeds from the transcript 5' end:
- Try product_min: if a window of product_min bp can fully cover the next consecutive required interval, merge it. Continue merging while product_min still covers the next interval.
- Try product_max: if product_min cannot cover the next interval but product_max can, use product_max and merge all intervals it covers.
- Independent target: if neither size covers the next interval, the current anchor becomes its own target and the next interval starts a new anchor.
- Terminal reverse: if a forward window from the current anchor would extend past the gene 3' boundary, anchor at the gene 3' end instead and extend toward 5'. The planner tries product_min first and upgrades to product_max if the previous interval can also be covered. Window: [gene_end - product_size, gene_end).
- Tiling: when a single required interval exceeds product_max, it is automatically tiled into overlapping windows of product_max bp with tile_overlap bp overlap.
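Two of the steps above are easy to sketch in isolation: the min-first / max-rescue size choice and the tiling of an over-long interval. The code below is only an illustration, not the library's implementation. The function names choose_mode and tile_interval are hypothetical, and it assumes the "+" strand, a forward window that starts at the anchor's start, and a last tile that is allowed to run past the interval end.

```python
def choose_mode(anchor_start: int, next_end: int,
                product_min: int, product_max: int) -> str:
    """Decide which window size reaches the end of the next interval.

    Assumption: "+" strand, window starts at the anchor's genomic start.
    """
    if anchor_start + product_min >= next_end:
        return "product_min"   # the small window already covers it
    if anchor_start + product_max >= next_end:
        return "product_max"   # rescue with the large window
    return "single"            # neither size reaches; anchor stands alone


def tile_interval(start: int, end: int,
                  product_max: int, tile_overlap: int) -> list[tuple[int, int]]:
    """Cover [start, end) with product_max-sized half-open tiles.

    Consecutive tiles share tile_overlap bp. Assumption: the final tile
    may extend beyond `end`; a real planner might clamp or shift it.
    """
    if not 0 <= tile_overlap < product_max:
        raise ValueError("tile_overlap must be in [0, product_max)")
    step = product_max - tile_overlap
    tiles = []
    pos = start
    while True:
        tiles.append((pos, pos + product_max))
        if pos + product_max >= end:
            break
        pos += step
    return tiles


# Quick-start numbers: exon1 starts at 1000, exon2 ends at 1800, exon3 at 2500.
print(choose_mode(1000, 1800, 600, 1000))  # product_max
print(choose_mode(1000, 2500, 600, 1000))  # single

# A 2500 bp interval with product_max=1000 and tile_overlap=200.
print(tile_interval(0, 2500, 1000, 200))
# [(0, 1000), (800, 1800), (1600, 2600)]
```

With the quick-start values, a 600 bp window from exon1 stops at 1600 and misses the end of exon2, while a 1000 bp window reaches 2000 and covers it; exon3 ends at 2500, beyond either size.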
Coverage rule
A required interval is considered fully covered only when:
target.start <= interval.start AND target.end >= interval.end
Partial coverage does not count. A target that overlaps an interval but does not span its full extent does not mark that interval as covered.
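The rule translates directly into a predicate. The sketch below uses plain integers rather than the library's classes, and the name is_fully_covered is chosen for this illustration only.

```python
def is_fully_covered(target_start: int, target_end: int,
                     interval_start: int, interval_end: int) -> bool:
    """True only if the target spans the whole interval (half-open coords)."""
    return target_start <= interval_start and target_end >= interval_end


# A window [900, 1500) against two exons.
print(is_fully_covered(900, 1500, 1000, 1200))  # True: exon lies inside the window
print(is_fully_covered(900, 1500, 1400, 1600))  # False: overlap only, exon ends at 1600

# Because `end` is exclusive, a window ending exactly at interval_end still covers it.
print(is_fully_covered(1000, 1200, 1000, 1200))  # True
```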
Bounds behaviour
| bounds provided? | Behaviour |
|---|---|
| No (None) | Gene extent is inferred from the intervals themselves. Windows may extend freely beyond the inferred span. Terminal reverse is not triggered (there is no external 3' boundary to respect). |
| Yes | The planner keeps all windows within [bounds.start, bounds.end). When a forward window would extend past the 3' boundary, terminal reverse anchors at bounds.end and extends toward 5'. |
Providing bounds is recommended when you know the gene / transcript extent — it prevents targets from stretching beyond the biological region and enables the terminal-reverse optimisation at the 3' end.
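To make the effect of bounds concrete, here is a rough sketch of the terminal-reverse window arithmetic for the "+" strand. The function place_window is hypothetical, and clamping the window start to bounds_start is an assumption of this sketch rather than documented library behaviour.

```python
def place_window(anchor_start: int, product_size: int,
                 bounds_start: int, bounds_end: int) -> tuple[int, int, str]:
    """Place a window for a "+" strand anchor inside [bounds_start, bounds_end).

    Returns (start, end, kind). If the forward window fits, it is used as is;
    otherwise the window is anchored at bounds_end and extended toward 5'.
    """
    forward_end = anchor_start + product_size
    if forward_end <= bounds_end:
        return (anchor_start, forward_end, "forward")
    # Terminal reverse: [gene_end - product_size, gene_end).
    # Assumption: never start before bounds_start.
    start = max(bounds_start, bounds_end - product_size)
    return (start, bounds_end, "terminal_reverse")


# exon3 = [2200, 2500); a 600 bp forward window would end at 2800.
print(place_window(2200, 600, 0, 3000))  # (2200, 2800, 'forward')
print(place_window(2200, 600, 0, 2600))  # (2000, 2600, 'terminal_reverse')
```

In the second call the gene ends at 2600, so the window becomes [2000, 2600), which still fully covers [2200, 2500) without leaving the gene.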
Strand handling
- "+" strand: 5' is at low genomic coordinates; intervals are processed in ascending genomic order.
- "-" strand: 5' is at high genomic coordinates; intervals are processed in descending genomic order.
- All output coordinates are genomic start < end. The strand only affects planning order, never coordinate direction.
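The ordering rule can be sketched with a plain sort. This is an illustration under stated assumptions: intervals are represented as (id, start, end) tuples, and transcript_order is a hypothetical helper, not part of the library.

```python
def transcript_order(intervals: list[tuple[str, int, int]],
                     strand: str) -> list[tuple[str, int, int]]:
    """Return intervals in transcript 5'→3' order without changing coordinates."""
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    # "+": ascending genomic start; "-": descending genomic start.
    return sorted(intervals, key=lambda iv: iv[1], reverse=(strand == "-"))


exons = [
    ("exonA", 300, 400),
    ("exonB", 700, 800),
    ("exonC", 1100, 1200),
    ("exonD", 1500, 1600),
]

print([iv[0] for iv in transcript_order(exons, "+")])
# ['exonA', 'exonB', 'exonC', 'exonD']
print([iv[0] for iv in transcript_order(exons, "-")])
# ['exonD', 'exonC', 'exonB', 'exonA']

# Coordinates are untouched: start < end holds on both strands.
print(all(start < end for _, start, end in transcript_order(exons, "-")))  # True
```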
Running tests
python -m pytest -q