
Interval-based PCR target window planner


primer-target-planner


Given a set of required genomic intervals (e.g. CDS exons) and PCR product size constraints, generate the minimal set of target windows that fully cover every required interval.

This is a pure algorithm library — no primer design, no external bioinformatics services.

This library uses 0-based half-open intervals: [start, end), length = end - start.

All coordinates in RequiredInterval, PlanningBounds, and TargetWindow follow this convention. For example, RequiredInterval("exon1", 1000, 1200) covers positions 1000–1199 inclusive (200 bp).
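The half-open convention can be checked with a few lines of plain Python. This is only an illustrative sketch: the helper names `interval_length` and `contains` are hypothetical and are not part of the library.

```python
def interval_length(start: int, end: int) -> int:
    """Length of a 0-based half-open interval [start, end)."""
    return end - start


def contains(start: int, end: int, pos: int) -> bool:
    """True if position pos lies inside [start, end); end itself is excluded."""
    return start <= pos < end


# exon1 from the text above: [1000, 1200)
print(interval_length(1000, 1200))  # 200
print(contains(1000, 1200, 1199))   # True: last covered position
print(contains(1000, 1200, 1200))   # False: end is exclusive
```

One consequence of this convention is that back-to-back intervals such as [1000, 1200) and [1200, 1300) share no position.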

Install

From a source checkout, install in editable mode with the development extras:

pip install -e ".[dev]"

Quick start

from primer_target_planner import (
    plan_targets,
    PlannerConfig,
    PlanningBounds,
    RequiredInterval,
)

intervals = [
    RequiredInterval("exon1", 1000, 1200),  # 200 bp
    RequiredInterval("exon2", 1500, 1800),  # 300 bp
    RequiredInterval("exon3", 2200, 2500),  # 300 bp
]

config = PlannerConfig(
    product_min=600,
    product_max=1000,
    strand="+",
)

targets = plan_targets(intervals, config)
for t in targets:
    print(
        f"[{t.start}, {t.end})  len={t.length}  mode={t.planning_mode}  "
        f"covers={t.covered_ids}  reason={t.reason}"
    )

Negative-strand example

On the negative strand the planner processes intervals from high genomic coordinates (transcript 5') to low coordinates (transcript 3'). Input and output coordinates are always genomic start < end — the strand only affects planning order, not coordinate direction.

from primer_target_planner import (
    plan_targets,
    PlannerConfig,
    RequiredInterval,
)

# Four exons on the minus strand.
# Transcript order (5'→3'): exonD → exonC → exonB → exonA
# (high genomic coords → low genomic coords)
intervals = [
    RequiredInterval("exonA", 300, 400),
    RequiredInterval("exonB", 700, 800),
    RequiredInterval("exonC", 1100, 1200),
    RequiredInterval("exonD", 1500, 1600),
]

config = PlannerConfig(product_min=500, product_max=900, strand="-")
targets = plan_targets(intervals, config)

for t in targets:
    # start < end always — genomic coordinates, not transcript direction
    print(
        f"[{t.start}, {t.end})  len={t.length}  mode={t.planning_mode}  "
        f"covers={t.covered_ids}"
    )
# Possible output:
# [701, 1600)  len=899  mode=product_max  covers=['exonD', 'exonC']
# [300, 800)   len=500  mode=product_max  covers=['exonB', 'exonA']

API

RequiredInterval

Field     Type         Description
--------  -----------  ----------------------------------
id        str          Identifier (e.g. exon name)
start     int          Genomic start (0-based, inclusive)
end       int          Genomic end (exclusive)
metadata  dict | None  Optional user metadata

All coordinates are 0-based half-open [start, end). length = end - start.

PlannerConfig

Field          Type       Default       Description
-------------  ---------  ------------  ------------------------------------
product_min    int        (no default)  Minimum PCR product length (bp)
product_max    int        (no default)  Maximum PCR product length (bp)
strand         "+" | "-"  (no default)  Transcript strand
tile_overlap   int        200           Overlap between tiles for long spans
allow_overlap  bool       True          Allow adjacent targets to overlap
PlanningBounds

Field  Type  Description
-----  ----  -------------------------------------------
start  int   Gene / transcript genomic start (inclusive)
end    int   Gene / transcript genomic end (exclusive)

0-based half-open [start, end). length = end - start.

TargetWindow

Field          Type                 Description
-------------  -------------------  --------------------------------------------------------------------
start          int                  Genomic start (inclusive)
end            int                  Genomic end (exclusive)
length         int                  end - start
covered_ids    list[str]            IDs of fully covered intervals
anchor_id      str                  The interval that anchored this target
anchor_side    "5prime" | "3prime"  Anchor side
planning_mode  str                  One of: product_min, product_max, single, terminal_reverse, tiled
reason         str                  Human-readable explanation

plan_targets(intervals, config, bounds=None) -> list[TargetWindow]

Main entry point.

  • intervals: required intervals (any order; sorted internally).
  • config: product-size and strand configuration.
  • bounds: optional gene extent; enables terminal-reverse logic.

Returns target windows in transcript 5'→3' order.

Algorithm

Min-first / max-rescue planner

Processing proceeds from the transcript 5' end:

  1. Try product_min — if a window of product_min bp can fully cover the next consecutive required interval, merge it. Continue merging while product_min still covers the next interval.

  2. Try product_max — if product_min cannot cover the next interval but product_max can, use product_max and merge all intervals it covers.

  3. Independent target — if neither size covers the next interval, the current anchor becomes its own target and the next interval starts a new anchor.

  4. Terminal reverse — if a forward window from the current anchor would extend past the gene 3' boundary, instead anchor at the gene 3' end and extend toward 5'. Tries product_min first; upgrades to product_max if the previous interval can also be covered. Window: [gene_end - product_size, gene_end).

  5. Tiling — when a single required interval exceeds product_max, it is automatically tiled into overlapping windows of product_max bp with tile_overlap bp overlap.
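Steps 1 to 3 can be sketched as a small greedy loop. This is a simplified illustration rather than the library's code: it handles only the "+" strand, ignores bounds and tiling, assumes non-overlapping intervals, and assumes each window starts at the anchor's 5' edge. The names `Interval` and `plan_forward` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Hypothetical stand-in for RequiredInterval, 0-based half-open [start, end)."""
    id: str
    start: int
    end: int


def plan_forward(intervals, product_min, product_max):
    """Min-first / max-rescue pass on the "+" strand (steps 1-3 only)."""
    ivs = sorted(intervals, key=lambda iv: iv.start)
    targets = []
    i = 0
    while i < len(ivs):
        anchor = ivs[i]
        length = anchor.end - anchor.start
        if length > product_max:
            raise ValueError(f"{anchor.id} needs tiling, which this sketch omits")
        nxt = ivs[i + 1] if i + 1 < len(ivs) else None
        if nxt is not None and length <= product_min and nxt.end <= anchor.start + product_min:
            size, mode = product_min, "product_min"   # step 1: min window reaches the next interval
        elif nxt is not None and nxt.end <= anchor.start + product_max:
            size, mode = product_max, "product_max"   # step 2: max window rescues the merge
        else:
            size = product_min if length <= product_min else product_max
            mode = "single"                           # step 3: anchor stands alone
        window_end = anchor.start + size
        covered = []
        # Merge every consecutive interval that ends inside the chosen window.
        while i < len(ivs) and ivs[i].end <= window_end:
            covered.append(ivs[i].id)
            i += 1
        targets.append((anchor.start, window_end, mode, covered))
    return targets


exons = [Interval("exon1", 1000, 1200), Interval("exon2", 1500, 1800), Interval("exon3", 2200, 2500)]
for target in plan_forward(exons, product_min=600, product_max=1000):
    print(target)
# (1000, 2000, 'product_max', ['exon1', 'exon2'])
# (2200, 2800, 'single', ['exon3'])
```

The printed tuples are the output of this sketch only; the library's own windows for the same input may differ, for example once bounds and terminal reverse come into play.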

Coverage rule

A required interval is considered fully covered only when:

target.start <= interval.start  AND  target.end >= interval.end

Partial coverage does not count. A target that overlaps an interval but does not span its full extent does not mark that interval as covered.
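The rule translates directly into a one-line predicate. The function name `fully_covers` is hypothetical; the coordinates in the usage lines are taken from the negative-strand example earlier in this document.

```python
def fully_covers(target_start, target_end, interval_start, interval_end):
    """True only if [target_start, target_end) spans the whole required interval."""
    return target_start <= interval_start and target_end >= interval_end


# Target window [701, 1600) against two exons from the negative-strand example.
print(fully_covers(701, 1600, 1100, 1200))  # True: exonC lies entirely inside
print(fully_covers(701, 1600, 700, 800))    # False: exonB starts at 700, one base before the window
```

The second call shows why overlap alone is not enough: the window shares 99 bp with exonB but misses its first base, so exonB must be covered by another target.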

Bounds behaviour

  • bounds not provided (None): Gene extent is inferred from the intervals themselves. Windows may extend freely beyond the inferred span. Terminal reverse is not triggered (there is no external 3' boundary to respect).
  • bounds provided: The planner keeps all windows within [bounds.start, bounds.end). When a forward window would extend past the 3' boundary, terminal reverse anchors at bounds.end and extends toward 5'.

Providing bounds is recommended when you know the gene / transcript extent — it prevents targets from stretching beyond the biological region and enables the terminal-reverse optimisation at the 3' end.
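The terminal-reverse arithmetic can be illustrated for the "+" strand as follows. This is a sketch under stated assumptions, not the library's code: the helper names are hypothetical, forward windows are assumed to start at the anchor's start, and the case where the reverse window would also cross the 5' boundary is not handled.

```python
def needs_terminal_reverse(anchor_start, product_size, gene_end):
    """True if a forward window [anchor_start, anchor_start + product_size) crosses gene_end."""
    return anchor_start + product_size > gene_end


def terminal_reverse_window(gene_end, product_size):
    """Window anchored at the gene 3' end and extended toward 5'."""
    return (gene_end - product_size, gene_end)


# Hypothetical bounds end at 2600; last exon is [2200, 2500), product_min = 600.
gene_end = 2600
if needs_terminal_reverse(2200, 600, gene_end):
    start, end = terminal_reverse_window(gene_end, 600)
    print(start, end)  # 2000 2600
```

Here the forward window [2200, 2800) would overshoot the boundary by 200 bp, while the reversed window [2000, 2600) stays inside it and still spans the exon.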

Strand handling

  • "+" strand: 5' is at low genomic coordinates; intervals are processed in ascending genomic order.
  • "-" strand: 5' is at high genomic coordinates; intervals are processed in descending genomic order.
  • All output coordinates are genomic start < end. The strand only affects planning order, never coordinate direction.
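The effect of strand on processing order can be sketched with a plain sort. The function name `transcript_order` and the tuple layout (id, start, end) are assumptions made for illustration; the library's internal sort key is not documented here.

```python
def transcript_order(intervals, strand):
    """Return (id, start, end) tuples in transcript 5'->3' order.

    Coordinates are left untouched; only the order of the list changes.
    """
    if strand == "+":
        return sorted(intervals, key=lambda iv: iv[1])                # ascending start
    if strand == "-":
        return sorted(intervals, key=lambda iv: iv[2], reverse=True)  # descending end
    raise ValueError(f"strand must be '+' or '-', got {strand!r}")


exons = [("exonA", 300, 400), ("exonB", 700, 800), ("exonC", 1100, 1200), ("exonD", 1500, 1600)]
print([iv[0] for iv in transcript_order(exons, "-")])
# ['exonD', 'exonC', 'exonB', 'exonA']
```

Every tuple still satisfies start < end after sorting, which mirrors the rule that strand never flips coordinate direction.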

Running tests

python -m pytest -q


