Tooling for ultra-high throughput screening workflows.

These details have not been verified by PyPI

uht-tooling

Automation helpers for ultra-high-throughput molecular biology workflows. The package ships both a CLI and an optional GUI that wrap the same workflow code paths.

Installation

Python version: 3.10

Quick install (recommended, easiest file maintenance)

pip install "uht-tooling[gui]"

This installs the core workflows together with the NiceGUI web frontend. NiceGUI is already part of the core dependencies, so the [gui] extra is only a convenience alias and can be omitted if you only need the CLI:

pip install uht-tooling

Legacy Gradio interface: The old Gradio GUI is still available via pip install "uht-tooling[legacy-gui]" and launched with uht-tooling gui --legacy.

External Tools

Some workflows require external bioinformatics tools:

Workflow	Required Tools
mutation-caller	mafft
umi-hunter	mafft
ep-library-profile	minimap2, NanoFilt
ssm-profiler	minimap2

Install via conda:

conda install -c bioconda mafft minimap2 nanofilt

The CLI and GUI will validate tool availability before running and provide clear error messages if tools are missing.
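
If you want to run the same kind of pre-flight check yourself before launching a batch, a minimal sketch is shown below. It is not the package's own implementation: the REQUIRED_TOOLS mapping simply copies the table above, and the missing_tools helper (with its injectable which argument) is a hypothetical name introduced for illustration.

```python
import shutil

# Workflow -> external executables, copied from the table above.
REQUIRED_TOOLS = {
    "mutation-caller": ["mafft"],
    "umi-hunter": ["mafft"],
    "ep-library-profile": ["minimap2", "NanoFilt"],
    "ssm-profiler": ["minimap2"],
}


def missing_tools(workflow: str, which=shutil.which) -> list[str]:
    """Return the required executables for `workflow` that are not on PATH."""
    return [tool for tool in REQUIRED_TOOLS.get(workflow, []) if which(tool) is None]


# Report the status of every workflow that needs an external tool.
for name in REQUIRED_TOOLS:
    absent = missing_tools(name)
    print(f"{name}: " + ("ok" if not absent else "missing " + ", ".join(absent)))
```

Workflows that are not listed in the table (for example the primer designers) return an empty list, meaning no external tool is needed.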

Development install

git clone https://github.com/Matt115A/uht-tooling-packaged.git
cd uht-tooling-packaged
python -m pip install -e ".[gui,dev]"

The editable install exposes the latest sources, while the dev extras add linting and test tooling.

Directory layout

Reference inputs can live anywhere (you pass their paths on the CLI), but we recommend keeping them under data/<workflow>/.
Outputs (CSV, FASTA, plots, logs) are written to results/<workflow>/.
All workflows log to results/<workflow>/run.log for reproducibility and debugging.

Command-line interface

The CLI is exposed as the uht-tooling executable. List the available commands:

uht-tooling --help

Each command mirrors a workflow module. Common entry points:

Command	Purpose
`uht-tooling nextera-primers`	Generate Nextera XT primer pairs from a binding-region CSV.
`uht-tooling design-gene-oligos`	Design overlap-extension PCR oligos for IVTT-ready gene constructs.
`uht-tooling design-synthetic-gene-pool`	Design pooled synthetic-gene ordering oligos plus a reusable lift-out primer pair.
`uht-tooling design-slim`	Design SLIM mutagenesis primers from FASTA/CSV inputs.
`uht-tooling design-kld`	Design KLD (inverse PCR) mutagenesis primers.
`uht-tooling design-gibson`	Produce Gibson mutagenesis primers and assembly plans.
`uht-tooling mutation-caller`	Summarise amino-acid substitutions from long-read FASTQ files.
`uht-tooling umi-hunter`	Cluster UMIs and call consensus genes.
`uht-tooling ep-library-profile`	Measure mutation rates in plasmid libraries without UMIs.
`uht-tooling ssm-profiler`	Profile site-saturation libraries at target codons and compare observed vs expected AA distributions.
`uht-tooling profile-inserts`	Extract and analyse inserts defined by flanking probe pairs.

Each command provides detailed help, including option descriptions and expected file formats:

uht-tooling mutation-caller --help

Short Flags

All commands support short flags for common options:

# Long form
uht-tooling design-slim --gene-fasta gene.fa --context-fasta ctx.fa --mutations-csv mut.csv --output-dir out/

# Short form
uht-tooling design-slim -g gene.fa -c ctx.fa -m mut.csv -o out/

Long Flag	Short	Commands
`--gene-fasta`	`-g`	design-slim, design-kld, design-gibson
`--context-fasta`	`-c`	design-slim, design-kld, design-gibson
`--mutations-csv`	`-m`	design-slim, design-kld, design-gibson
`--sequence-fasta`	`-s`	design-gene-oligos, design-synthetic-gene-pool
`--output-dir`	`-o`	8 commands
`--log-path`	`-l`	8 commands
`--template-fasta`	`-t`	mutation-caller, umi-hunter
`--fastq`	`-q`	5 commands
`--threshold`	`-T`	mutation-caller
`--min-flank-ratio`	`-r`	mutation-caller, umi-hunter
`--min-base-qual`	`-Q`	mutation-caller, ssm-profiler
`--config-csv`	`-C`	umi-hunter
`--binding-csv`	`-b`	nextera-primers
`--probes-csv`	`-P`	profile-inserts
`--region-fasta`	`-R`	ep-library-profile, ssm-profiler
`--plasmid-fasta`	`-p`	ep-library-profile, ssm-profiler
`--work-dir`	`-w`	ep-library-profile, ssm-profiler
`--target-site`	`-t`	ssm-profiler
`--site-scheme`	`-s`	ssm-profiler
`--config`	`-K`	global (all commands)

You can pass multiple FASTQ paths using repeated --fastq options or glob patterns. Optional --log-path flags redirect logs if you prefer a location outside the default results directory.

Configuration File

uht-tooling supports a YAML configuration file for default options.

Auto-discovery locations (in order):

$UHT_TOOLING_CONFIG environment variable
~/.uht-tooling.yaml
~/.config/uht-tooling/config.yaml
.uht-tooling.yaml (current directory)

Or specify explicitly: uht-tooling --config my-config.yaml ...

Example ~/.uht-tooling.yaml:

paths:
  output_dir: ~/results/uht-tooling

defaults:
  design_gene_oligos:
    target_oligo_length: 40
  mutation_caller:
    threshold: 15
  umi_hunter:
    umi_identity_threshold: 0.85
    min_cluster_size: 5

gene_oligos:
  constant_5prime_dna: TAATACGACTCACTATAGGGAGA
  constant_3prime_dna: GCGTTTTTTTTT
  stop_codon: TAA
  tags:
    n_his:
      dna: CATCATCATCATCATCAT
    c_his:
      dna: CATCATCATCATCATCAT
    n_his_flag:
      dna: CATCATCATCATCATCATGACTACAAAGACGATGACGACAAG
    c_his_flag:
      dna: CATCATCATCATCATCATGACTACAAAGACGATGACGACAAG
  codon_table:
    A: GCT
    C: TGT
    D: GAT
    E: GAA
    F: TTT
    G: GGT
    H: CAT
    I: ATT
    K: AAA
    L: CTG
    M: ATG
    N: AAT
    P: CCT
    Q: CAA
    R: CGT
    S: TCT
    T: ACT
    V: GTT
    W: TGG
    Y: TAT

CLI options always take precedence over config values.
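
The lookup order and the precedence rule can be pictured with the sketch below. It is an illustration only: the function names are hypothetical, and it assumes that the first existing file in the list is the one that gets loaded, which the text above implies but does not spell out. YAML parsing is left out so the example stays within the standard library.

```python
import os
from pathlib import Path
from typing import Optional


def candidate_config_paths(env=None, home=None, cwd=None) -> list[Path]:
    """List config locations in the documented auto-discovery order."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)
    cwd = Path.cwd() if cwd is None else Path(cwd)
    paths = []
    if env.get("UHT_TOOLING_CONFIG"):
        paths.append(Path(env["UHT_TOOLING_CONFIG"]))
    paths += [
        home / ".uht-tooling.yaml",
        home / ".config" / "uht-tooling" / "config.yaml",
        cwd / ".uht-tooling.yaml",
    ]
    return paths


def discover_config(paths) -> Optional[Path]:
    """Assumption: the first candidate that exists on disk wins."""
    return next((p for p in paths if p.expanduser().is_file()), None)


def resolve_option(cli_value, config_defaults: dict, key: str, fallback=None):
    """A CLI value wins; otherwise the config default; otherwise the fallback."""
    if cli_value is not None:
        return cli_value
    return config_defaults.get(key, fallback)
```

With the example config above, resolve_option(None, {"threshold": 15}, "threshold") yields 15, while passing --threshold 10 on the command line corresponds to resolve_option(10, ...) and yields 10.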

Workflow reference

Gene oligo designer

Design one or more IVTT-ready constructs from DNA or protein FASTA input and tile them into overlap-extension PCR oligos. The workflow has built-in T7 defaults for the 5' cassette, 3' terminator, and common His/His+FLAG tags. Protein inputs are automatically codon-optimized for the selected host using dnachisel; DNA inputs are preserved apart from start/stop normalization.

Inputs:

--sequence-fasta — FASTA containing one or more GOIs as DNA or protein
--input-type — auto, dna, or protein
--target-oligo-length — maximum allowed gene-specific oligo length in nt (minimum 20, default 40); the workflow chooses the longest feasible size at or below this limit to minimize primer count
--max-orderonce-length — maximum allowed length for reusable constant oligos (default 80)
--target-host — host identifier used for automatic protein codon optimization (default e_coli)
--tag-mode — none, n_his, c_his, n_his_flag, or c_his_flag
--output-dir — destination for oligos, construct FASTA, and run log

Outputs:

gene_oligos.csv — per-target OE oligos plus reusable external primers, with overlap metrics, constant vs gene_specific role, and order_once vs order_per_target
assembly_report.csv — one row per target with start/stop-codon normalization and recommended assembly strategy
final_constructs.fasta — assembled IVTT-ready constructs for all targets
ordered_oligos.fasta — deduplicated oligos in 5'→3' ordering orientation
ordering_plate.csv — 96-well plate layout at 20 uM stock concentration
external_primers.csv — reusable outer primers for production-scale full-length amplification
assembly_instructions.txt — suggested OE-PCR workflow and block-assembly guidance

Reusable edge oligos are emitted as CONST_EDGE_5P and CONST_EDGE_3P. These are the preferred order_once oligos and are allowed to be longer than the per-target OE oligos so they can absorb the promoter, start codon, terminator, and any configured terminal tag. In the web UI, the maximum allowed length for these reusable oligos is available under the collapsed Extra settings panel.

Example:

uht-tooling --config .uht-tooling.yaml design-gene-oligos \
  --sequence-fasta data/gene_oligos/target.fasta \
  --input-type protein \
  --target-oligo-length 40 \
  --tag-mode n_his \
  --output-dir results/design_gene_oligos/

If you do not provide a gene_oligos: section in your YAML config, the built-in defaults are used automatically. The workflow auto-detects and removes terminal stop codons from the user input, adds a start codon when an untagged or C-terminally tagged construct lacks one, and inserts an initiating ATG immediately before an N-terminal tag. For protein inputs, host-aware codon optimization is applied automatically before oligo design.

Synthetic gene pool

Design a pooled ordering set for E. coli T7 cell-free protein synthesis where each target is represented by one oligo flanked by shared lift-out handles, and a single reusable primer pair adds the final 5' and 3' constant regions plus optional tags.

Inputs:

--sequence-fasta — FASTA containing one or more GOIs as DNA or protein
--input-type — auto, dna, or protein
--tag-mode — none, n_his, c_his, n_his_flag, or c_his_flag
--include-pullout-primers — include deterministic gene-specific pullout reverse primers; each ordered design will carry a built-in unique index just upstream of the terminator
--output-dir — destination for pool-ordering outputs

Outputs:

synthetic_gene_pool.csv — one pooled oligo per target
synthetic_gene_pool_primers.csv — common forward/reverse lift-out primers, plus optional gene-specific pullout reverse primers
synthetic_gene_pool_ordering.tsv — copy/paste-ready ordering list containing all pool oligos and the common primers
synthetic_gene_pool_instructions.txt — brief workflow guidance

Protein inputs are automatically codon-optimized for E. coli. The pooled oligos are kept short by including only the shared anneal handles, while the reusable common primers contribute the full T7/cell-free 5' and 3' payload. In pullout mode, each ordered design also carries a deterministic built-in unique index just upstream of the terminator; after whole-pool amplification with POOL_CONST_F and POOL_CONST_R, use POOL_CONST_F plus the matching *_PULLOUT_R primer to selectively recover one CFPS-ready construct. The common primers must stay below 100 bp; the workflow fails if the configured constant/tag payload would exceed that limit.

Example:

uht-tooling design-synthetic-gene-pool \
  --sequence-fasta data/synthetic_genes/targets.fasta \
  --input-type protein \
  --tag-mode n_his \
  --include-pullout-primers \
  --output-dir results/synthetic_gene_pool/

Nextera XT primer design

Inputs:

--binding-csv — CSV with a binding_region column (row 1 = i7 forward region, row 2 = i5 reverse region, both 5'→3')
--output-csv — path for the output primer CSV

Outputs:

Single CSV with columns [primer_name, sequence]

Prepare data/nextera_designer/nextera_designer.csv with a binding_region column. Row 1 should contain the forward region, row 2 the reverse region, both in 5'→3' orientation.
Optional: supply a YAML overrides file for index lists/prefixes via --config.

Run:

uht-tooling nextera-primers \
  --binding-csv data/nextera_designer/nextera_designer.csv \
  --output-csv results/nextera_designer/nextera_xt_primers.csv

Primer CSVs will be written to results/nextera_designer/, accompanied by a log file.

The helper is preloaded with twelve i5 and twelve i7 indices, so up to 144 (12 × 12) uniquely indexed amplicons can be generated.

Wet-lab workflow notes

Perform the initial amplification with an i5/i7 primer pair and monitor a small aliquot by qPCR. Cap thermocycling early so you only generate ~10% of the theoretical yield—this minimizes amplification bias.
Purify the product with SPRIselect beads at approximately a 0.65:1 bead:DNA volume ratio to remove residual primers and short fragments.
Confirm primer removal and quantify DNA using electrophoresis (e.g., BioAnalyzer DNA chip) before moving to the flow cell.

SLIM primer design

Inputs:
- data/design_slim/slim_template_gene.fasta
- data/design_slim/slim_context.fasta
- data/design_slim/slim_target_mutations.csv (single mutations column)

Run:

uht-tooling design-slim \
  --gene-fasta data/design_slim/slim_template_gene.fasta \
  --context-fasta data/design_slim/slim_context.fasta \
  --mutations-csv data/design_slim/slim_target_mutations.csv \
  --output-dir results/design_slim/

Outputs:

SLIM_primers.csv — columns [Primer Name, Sequence], 4 primers per mutation (_Lf, _Sr, _Lr, _Sf)

Mutation nomenclature examples:

A123G (substitution)
T241Del (deletion)
T241TS (insert Ser after Thr241)
L46GP (replace Leu46 with Gly-Pro)
A123:NNK (library mutation with degenerate codon)
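
For scripting around the mutations CSV, the nomenclature above can be parsed with a small regular expression. The sketch below is not the workflow's parser: the Mutation class and parse_mutation function are hypothetical, and the rule used to tell an insertion (T241TS) from a replacement (L46GP) is inferred from the two examples, namely that an insertion repeats the wild-type residue first.

```python
import re
from dataclasses import dataclass

_MUTATION_RE = re.compile(
    r"^(?P<wt>[A-Z])(?P<pos>\d+)"
    r"(?::(?P<codon>[ACGTRYSWKMBDHVN]{3})|(?P<deletion>Del)|(?P<new>[A-Z]+))$"
)


@dataclass(frozen=True)
class Mutation:
    kind: str      # substitution, deletion, insertion, indel, or library
    wt: str        # wild-type residue named in the string
    position: int  # 1-based residue number
    payload: str   # new residue(s) or degenerate codon; empty for deletions


def parse_mutation(text: str) -> Mutation:
    match = _MUTATION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised mutation string: {text!r}")
    wt, pos = match["wt"], int(match["pos"])
    if match["codon"]:
        return Mutation("library", wt, pos, match["codon"])
    if match["deletion"]:
        return Mutation("deletion", wt, pos, "")
    new = match["new"]
    if len(new) == 1:
        return Mutation("substitution", wt, pos, new)
    # Assumption: a multi-residue payload that starts with the wild-type
    # residue inserts after it; anything else replaces the residue.
    if new[0] == wt:
        return Mutation("insertion", wt, pos, new[1:])
    return Mutation("indel", wt, pos, new)
```

For example, parse_mutation("T241TS") is classified as an insertion with payload "S", and parse_mutation("L46GP") as an indel with payload "GP".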

Library mutations with degenerate codons

For saturation mutagenesis and library generation, SLIM supports degenerate (IUPAC ambiguity) codons using the format <WT_AA><position>:<codon>. The codon must be exactly 3 characters using valid IUPAC nucleotide codes:

Code	Bases	Mnemonic
A, C, G, T	Single base	Standard
R	A, G	puRine
Y	C, T	pYrimidine
S	G, C	Strong
W	A, T	Weak
K	G, T	Keto
M	A, C	aMino
B	C, G, T	not A
D	A, G, T	not C
H	A, C, T	not G
V	A, C, G	not T
N	A, C, G, T	aNy

Common degenerate codon schemes for library construction:

Scheme	Codons	Amino acids	Stop codons	Notes
NNK	32	20	1 (TAG)	Reduced stop codon frequency
NNS	32	20	1 (TAG)	Equivalent to NNK
NNN	64	20	3	All codons, higher stop frequency
NDT	12	12	0	F, L, I, V, Y, H, N, D, C, R, S, G only

Example CSV with mixed mutation types:

mutations
A123G
T50:NNK
S100:NNS
T241Del

The workflow validates that the wild-type amino acid matches the template sequence and logs library coverage information (number of possible codons and amino acids) for each degenerate mutation. Primers are generated with the degenerate bases embedded; reverse primers contain the correct IUPAC reverse complements (e.g., K↔M, R↔Y, S↔S).
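
The coverage numbers in the scheme table and the reverse-complement behaviour can be reproduced with a few lines of Python. This is a standalone sketch using the standard genetic code, not code taken from the package; the function names are illustrative.

```python
from itertools import product

# IUPAC code -> bases it stands for (same content as the table above).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
# Complement of each IUPAC code (K<->M, R<->Y, B<->V, D<->H; S, W, N unchanged).
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")

# Standard genetic code in TCAG order; '*' marks a stop codon.
_BASES = "TCAG"
_AAS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(_BASES, repeat=3), _AAS)}


def expand_codon(degenerate: str) -> list[str]:
    """Enumerate every concrete codon encoded by a degenerate codon."""
    return ["".join(c) for c in product(*(IUPAC[b] for b in degenerate.upper()))]


def library_coverage(degenerate: str) -> tuple[int, int, int]:
    """Return (codons, distinct amino acids, stop codons)."""
    translated = [CODON_TABLE[c] for c in expand_codon(degenerate)]
    amino_acids = {aa for aa in translated if aa != "*"}
    return len(translated), len(amino_acids), translated.count("*")


def reverse_complement(seq: str) -> str:
    """Reverse complement that also handles IUPAC ambiguity codes."""
    return seq.upper().translate(COMPLEMENT)[::-1]
```

library_coverage("NNK") returns (32, 20, 1) and library_coverage("NDT") returns (12, 12, 0), matching the table, while reverse_complement("NNK") gives "MNN".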

Experimental blueprint

Hands-on time is approximately three hours (excluding protein purification), with mutant protein obtainable in roughly three days.
Conduct two PCRs per mutant set: (A) long forward with short reverse and (B) long reverse with short forward.
Combine 10 µL from each PCR with 10 µL H-buffer (150 mM Tris pH 8, 400 mM NaCl, 60 mM EDTA) for a 30 µL annealing reaction: 99 °C for 3 min, then two cycles of 65 °C for 5 min followed by 30 °C for 15 min, hold at 4 °C.
Transform directly into NEB 5-alpha or BL21 (DE3) cells without additional cleanup. The protocol has been validated for simultaneous introduction of dozens of mutations.

KLD primer design

KLD (Kinase, Ligase, DpnI) is an alternative mutagenesis method using inverse PCR to amplify the entire plasmid with mutations incorporated at the primer junction.

Inputs: Same as SLIM design
- data/design_kld/kld_template_gene.fasta
- data/design_kld/kld_context.fasta
- data/design_kld/kld_target_mutations.csv (single mutations column)

Run:

uht-tooling design-kld \
  --gene-fasta data/design_kld/kld_template_gene.fasta \
  --context-fasta data/design_kld/kld_context.fasta \
  --mutations-csv data/design_kld/kld_target_mutations.csv \
  --output-dir results/design_kld/

Outputs:

KLD_primers.csv — columns [Primer Name, Sequence, Tm (binding), GC%, Length, Notes], 2 primers per mutation (_F, _R)

Mutation nomenclature: Same as SLIM (substitution, deletion, insertion, indel, library).

KLD vs SLIM

Method	Primers	Mechanism	Best for
SLIM	4 per mutation	Overlap assembly	Multiple simultaneous mutations
KLD	2 per mutation	Inverse PCR + ligation	Single mutations, simpler workflow

KLD primer design rules

Forward primer: Mutation codon at 5' end + downstream template-binding region
Reverse primer: Reverse complement of upstream region, 5' end adjacent to forward
Tm calculated on template-binding regions only (50-65°C target)
Tm difference between primers kept within 5°C
GC content 40-60%
Binding region 18-24 bp
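
The numeric rules above are easy to check by hand for primers designed elsewhere. The sketch below covers the length, GC, and Tm-window rules; it deliberately takes Tm values as inputs because the Tm model used by the workflow is not stated here. All function names are hypothetical.

```python
def gc_percent(seq: str) -> float:
    """GC content of a sequence as a percentage."""
    if not seq:
        raise ValueError("empty sequence")
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def binding_region_issues(binding: str) -> list[str]:
    """List rule violations for one template-binding region."""
    issues = []
    if not 18 <= len(binding) <= 24:
        issues.append(f"length {len(binding)} bp outside 18-24 bp")
    gc = gc_percent(binding)
    if not 40.0 <= gc <= 60.0:
        issues.append(f"GC {gc:.1f}% outside 40-60%")
    return issues


def tm_pair_ok(tm_forward: float, tm_reverse: float) -> bool:
    """Both Tm values in the 50-65 °C window and within 5 °C of each other."""
    in_window = all(50.0 <= tm <= 65.0 for tm in (tm_forward, tm_reverse))
    return in_window and abs(tm_forward - tm_reverse) <= 5.0
```

An empty list from binding_region_issues means the region satisfies both the length and GC rules.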

Experimental workflow

PCR amplify entire plasmid with KLD primer pair
DpnI digest to remove methylated template
T4 PNK phosphorylation of 5' ends
T4 DNA ligase to circularize
Transform into competent cells

NEB sells a KLD Enzyme Mix (M0554) that combines the DpnI digest, phosphorylation, and ligation steps in a single reaction.

Gibson assembly primers

Inputs mirror the SLIM workflow but use data/design_gibson/.
Link sub-mutations with + to specify multi-mutation assemblies (e.g., A123G+T150A).

Run:

uht-tooling design-gibson \
  --gene-fasta data/design_gibson/gibson_template_gene.fasta \
  --context-fasta data/design_gibson/gibson_context.fasta \
  --mutations-csv data/design_gibson/gibson_target_mutations.csv \
  --output-dir results/design_gibson/

Outputs:

Gibson_primers.csv — columns [Group, Submutation, Primer Name, Sequence]
Gibson_assembly_plan.csv — columns [Group, Submutation, PCR_Primer_Forward, PCR_Primer_Reverse, Tm (celsius), Amplicon Size (bp)]

If mutations fall within overlapping primer windows, design sequential reactions.

Mutation caller (no UMIs)

Supply:
- data/mutation_caller/mutation_caller_template.fasta
- data/mutation_caller/mutation_caller.csv with gene_flanks and gene_min_max columns (two rows each).
- One or more FASTQ files via --fastq.

Run:

uht-tooling mutation-caller \
  --template-fasta data/mutation_caller/mutation_caller_template.fasta \
  --flanks-csv data/mutation_caller/mutation_caller.csv \
  --fastq data/mutation_caller/*.fastq.gz \
  --output-dir results/mutation_caller/ \
  --threshold 10

Outputs: per-sample subdirectory containing:

{sample}_aa_substitution_frequency.png — substitution frequency plot with KDE
{sample}_frequent_aa_counts.csv — columns [AA, Count] (filtered by --threshold)
{sample}_cooccurring_AA_baseline.csv — columns [AA1, AA2, Both_Count, AA1_Count, AA2_Count]
{sample}_cooccurring_AA_fisher.csv — columns [AA1, AA2, p-value]
{sample}_report.txt — summary report

Co-occurrence matrices are experimental and should not yet be relied upon.

Flank detection uses fuzzy (Levenshtein-ratio) matching rather than a literal exact-match search, so reads with a sequencing error inside the flank itself are still recovered. Adjust strictness via --min-flank-ratio (0-100, default 80) — lower values recover more reads but raise the risk of matching the wrong location.
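
To build intuition for how a ratio threshold behaves, here is a self-contained sketch of fuzzy flank search. It is not the workflow's matcher: the 0-100 score is an assumed normalisation (100 × (1 − distance / length)), the search only considers windows of the same length as the flank, and find_flank is a hypothetical name.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # deletion
                current[j - 1] + 1,            # insertion
                previous[j - 1] + (ca != cb),  # substitution or match
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """0-100 score; an assumed normalisation, not necessarily the tool's."""
    longest = max(len(a), len(b))
    return 100.0 if longest == 0 else 100.0 * (1 - levenshtein(a, b) / longest)


def find_flank(read: str, flank: str, min_ratio: float = 80.0):
    """Return (start, score) of the best-scoring window, or None if below min_ratio."""
    best = None
    for start in range(len(read) - len(flank) + 1):
        score = similarity(read[start:start + len(flank)], flank)
        if best is None or score > best[1]:
            best = (start, score)
    return best if best is not None and best[1] >= min_ratio else None
```

With a 10 nt flank, one sequencing error gives a score of 90 and two errors give 80, so the default of 80 tolerates up to two errors in a flank of that length under this scoring.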

By default every aligned base counts toward a codon's substitution call regardless of its Phred quality score. Pass --min-base-qual (default 0, disabled) to require every base of a codon to meet a minimum Phred score before a substitution at that codon is reported for a given read.
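
The per-codon quality rule can be sketched as follows. The example assumes the usual Phred+33 FASTQ encoding and indexes the quality string in read coordinates; codon_passes is a hypothetical helper, not a function exposed by the package.

```python
def phred_scores(quality: str, offset: int = 33) -> list[int]:
    """Decode a FASTQ quality string (Phred+33 assumed) into integer scores."""
    return [ord(ch) - offset for ch in quality]


def codon_passes(quality: str, codon_start: int, min_base_qual: int = 0) -> bool:
    """True if all three bases of the codon meet the minimum Phred score."""
    scores = phred_scores(quality[codon_start:codon_start + 3])
    return len(scores) == 3 and all(q >= min_base_qual for q in scores)


# 'I' encodes Q40 and '#' encodes Q2 under Phred+33.
quality = "IIII#III"
print(codon_passes(quality, 0, min_base_qual=20))  # codon covers Q40, Q40, Q40
print(codon_passes(quality, 3, min_base_qual=20))  # codon covers Q40, Q2, Q40
```

With the threshold left at 0, every complete codon passes, which corresponds to the disabled default.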

UMI Hunter

Inputs: data/umi_hunter/template.fasta, data/umi_hunter/umi_hunter.csv, and FASTQ reads.

Command:

uht-tooling umi-hunter \
  --template-fasta data/umi_hunter/template.fasta \
  --config-csv data/umi_hunter/umi_hunter.csv \
  --fastq data/umi_hunter/*.fastq.gz \
  --output-dir results/umi_hunter/

Tunable parameters include --umi-identity-threshold, --consensus-mutation-threshold, --min-cluster-size, and --min-flank-ratio.
--umi-identity-threshold (0–1) controls how similar two UMIs must be to fall into the same cluster.
--consensus-mutation-threshold (0–1) is the fraction of reads within a cluster that must agree on a base before it is written into the consensus sequence.
--min-cluster-size sets the minimum number of reads required in a cluster before a consensus is generated (smaller clusters remain listed in the raw UMI CSV but no consensus FASTA is produced).
Flank detection uses fuzzy (Levenshtein-ratio) matching rather than a literal exact-match search, so reads with a sequencing error inside the UMI or gene flanks are still recovered. Adjust strictness via --min-flank-ratio (0-100, default 80) — lower values recover more reads but raise the risk of matching the wrong location.
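
To make --consensus-mutation-threshold and --min-cluster-size concrete, the sketch below builds a column-wise consensus from reads that are already aligned to the template. It is an illustration under stated assumptions rather than the workflow's algorithm: reads are taken to be gap-free and the same length as the reference, positions that do not reach the threshold fall back to the reference base, and the default values shown are placeholders.

```python
from collections import Counter
from typing import Optional


def consensus(
    reads: list[str],
    reference: str,
    threshold: float = 0.7,      # placeholder default, not the tool's
    min_cluster_size: int = 1,   # placeholder default, not the tool's
) -> Optional[str]:
    """Column-wise consensus over pre-aligned, equal-length reads."""
    if len(reads) < max(min_cluster_size, 1):
        return None  # cluster too small: no consensus is produced
    if any(len(read) != len(reference) for read in reads):
        raise ValueError("reads must be aligned to the reference length")
    out = []
    for i, ref_base in enumerate(reference):
        base, count = Counter(read[i] for read in reads).most_common(1)[0]
        # Assumption: keep the reference base when agreement is too low.
        out.append(base if count / len(reads) >= threshold else ref_base)
    return "".join(out)
```

For four reads where three carry the same substitution, a threshold of 0.7 writes the substitution into the consensus, whereas a threshold of 0.8 leaves the reference base in place.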

Outputs: per-sample subdirectory containing:

{sample}_UMI_clusters.csv — columns [Cluster Representative, Total Count, Members]
{sample}_gene_consensus.csv — columns [Cluster Representative, Total Count, Consensus Gene, Length Difference, Members]
{sample}_consensuses.fasta — FASTA with consensus sequences (only for clusters ≥ --min-cluster-size)

Note that this workflow does not scale well beyond roughly 50k reads per sample. See UMIC-seq pipelines for efficient UMI-gene dictionary generation.

Profile inserts

Prepare data/profile_inserts/sample_probes.csv with upstream and downstream columns.

Run:

uht-tooling profile-inserts \
  --probes-csv data/profile_inserts/sample_probes.csv \
  --fastq data/profile_inserts/*.fastq.gz \
  --output-dir results/profile_inserts/

Outputs:

extracted_inserts.fasta — all extracted insert sequences
qc_report.txt — summary statistics (lengths, GC, duplicates, probe performance)
qc_plots.png — multi-panel QC figure

Adjust fuzzy matching strictness via --min-ratio.

EP library profiler (no UMIs)

Inputs:
- data/ep-library-profile/region_of_interest.fasta
- data/ep-library-profile/plasmid.fasta
- FASTQ inputs (--fastq accepts multiple files)

ROI matching:
- The region-of-interest sequence can be forward, reverse-complemented, or split across the plasmid origin.
- The plasmid FASTA is treated as circular for ROI matching and background exclusion.
- The ROI still needs to be unique within the plasmid; ambiguous multi-hit matches are rejected.

Run:

uht-tooling ep-library-profile \
  --region-fasta data/ep-library-profile/region_of_interest.fasta \
  --plasmid-fasta data/ep-library-profile/plasmid.fasta \
  --fastq data/ep-library-profile/*.fastq.gz \
  --output-dir results/ep-library-profile/

Safety note: --output-dir (and --work-dir if used) must live inside a dedicated workspace containing a .uht_tooling_workspace file. This prevents accidental deletion of unrelated folders. Example:
```
mkdir -p ~/uht_tooling_workspace
touch ~/uht_tooling_workspace/.uht_tooling_workspace
# then use --output-dir ~/uht_tooling_workspace/ep-library-profile/
```
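
A wrapper script can verify the marker before launching a run. The sketch below assumes that "inside a dedicated workspace" means the marker file sits in the target directory or in one of its parent directories; that interpretation and the helper name are assumptions, not the package's documented behaviour.

```python
from pathlib import Path

MARKER = ".uht_tooling_workspace"


def is_inside_workspace(path) -> bool:
    """True if `path` or any of its parent directories contains the marker file."""
    resolved = Path(path).expanduser().resolve()
    return any((parent / MARKER).is_file() for parent in (resolved, *resolved.parents))
```

The target directory does not need to exist yet for the check to work, since only its parents are inspected for the marker.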

Output structure

Each sample produces an organized output directory:

sample_name/
├── KEY_FINDINGS.txt                    # Lay-user executive summary
├── summary_panels.png                  # Main visualization (PNG)
├── summary_panels.pdf                  # Main visualization (PDF)
├── run.log                             # Analysis log
└── detailed/                           # Technical outputs
    ├── gene_mismatch_rates.csv
    ├── base_distribution.csv
    ├── aa_substitutions.csv            # Protein-coding regions only
    ├── plasmid_coverage.csv
    ├── aa_mutation_distribution.csv
    ├── summary.txt
    └── {sample}_mutation_spectrum.pdf

A top-level master_summary.txt aggregates findings across all samples when multiple FASTQs are processed.

Lambda estimate

The profiler reports a single lambda (mutations per gene copy) derived from the net mismatch rate:

Formula: (hit_rate - bg_rate) × seq_len
Where it appears: panel 4 of summary_panels.png and the Poisson lambda line in KEY_FINDINGS.txt.

The KEY_FINDINGS.txt file provides a plain-language summary including:

Expected AA mutations per gene copy
Poisson-based interpretation (% wild-type, % 1 mutation, % 2+ mutations)
Quality assessment (GOOD/ACCEPTABLE/LOW COVERAGE)
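
The lambda formula and the Poisson breakdown can be reproduced in a few lines. This is a sketch of the arithmetic only; the function names and the example rates are made up for illustration and are not taken from the package or from real data.

```python
import math


def lambda_estimate(hit_rate: float, bg_rate: float, seq_len: int) -> float:
    """(hit_rate - bg_rate) x seq_len, as in the formula above."""
    return (hit_rate - bg_rate) * seq_len


def poisson_classes(lam: float) -> dict[str, float]:
    """Poisson probabilities of 0, exactly 1, and 2 or more mutations."""
    p0 = math.exp(-lam)
    p1 = lam * p0
    return {"wild_type": p0, "one_mutation": p1, "two_plus": 1.0 - p0 - p1}


# Illustrative numbers only: 0.4% ROI mismatch rate, 0.1% background, 900 bp gene.
lam = lambda_estimate(0.004, 0.001, 900)
for label, prob in poisson_classes(lam).items():
    print(f"{label}: {100 * prob:.1f}%")
```

The three classes always sum to 100%, so a higher lambda shifts mass from the wild-type class into the 2+ class.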

How the mutation rate and AA expectations are derived

Reads are aligned to both the region of interest and the full plasmid. The profiler first locates the ROI on the circular plasmid, allowing forward, reverse-complement, and origin-spanning matches. Mismatches in the ROI define the "target" rate; mismatches elsewhere provide the background.
The per-base background rate is subtracted from the target rate to yield a net nucleotide mutation rate, and the standard deviation reflects binomial sampling and quality-score uncertainty.
The net rate is multiplied by the CDS length to estimate λ_bp (mutations per copy). Monte Carlo simulations then flip random bases, translate the mutated CDS, and count amino-acid differences across 1,000 trials—these drive the AA mutation mean/variance that appear in the panel plot and summary.
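
The Monte Carlo step can be sketched as below. Several details are assumptions because the text does not specify them: the number of mutations per trial is drawn from a Poisson distribution with mean λ_bp, positions and replacement bases are chosen uniformly, and the function names are hypothetical. The real profiler may, for example, weight substitutions by the observed mutation spectrum.

```python
import math
import random
from itertools import product

# Standard genetic code in TCAG order; '*' marks a stop codon.
_BASES = "TCAG"
_AAS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(_BASES, repeat=3), _AAS)}


def translate(cds: str) -> str:
    """Translate codon by codon; a trailing partial codon is ignored."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds) - 2, 3))


def poisson_draw(lam: float, rng: random.Random) -> int:
    """Knuth's multiplication method; fine for the small lambdas used here."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def simulate_aa_changes(cds: str, lam_bp: float, trials: int = 1000, seed: int = 0):
    """Return (mean, variance) of amino-acid differences per simulated copy."""
    rng = random.Random(seed)
    wild_type = translate(cds)
    counts = []
    for _ in range(trials):
        seq = list(cds)
        for _ in range(poisson_draw(lam_bp, rng)):
            pos = rng.randrange(len(seq))
            seq[pos] = rng.choice([b for b in "ACGT" if b != seq[pos]])
        mutant = translate("".join(seq))
        # Positions are compared one by one; a new stop counts as one change.
        counts.append(sum(a != b for a, b in zip(wild_type, mutant)))
    mean = sum(counts) / trials
    variance = sum((c - mean) ** 2 for c in counts) / trials
    return mean, variance
```

Because some substitutions are synonymous, the simulated mean number of amino-acid changes comes out below λ_bp.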

SSM profiler

Inputs:
- ROI CDS FASTA
- Full plasmid FASTA
- FASTQ inputs (--fastq accepts multiple files)
- One or more target amino-acid sites via repeated --target-site
- Optional per-site degenerate codon schemes via repeated --site-scheme, e.g. 45:NNK

Run:

uht-tooling ssm-profiler \
  --region-fasta data/ssm-profiler/roi_cds.fasta \
  --plasmid-fasta data/ssm-profiler/plasmid.fasta \
  --target-site 45 --target-site 46 --target-site 47 \
  --site-scheme 45:NNK --site-scheme 46:NNK --site-scheme 47:NNW \
  --fastq data/ssm-profiler/*.fastq.gz \
  --output-dir results/ssm-profiler/

Reports:
- Per-target-site codon-complete coverage and non-reference AA fraction
- Observed AA distributions at each target site
- Observed-vs-expected AA distributions when schemes are supplied
- Off-target mismatch rates elsewhere in the ROI, compared against plasmid background
- A target-site mutational-load summary that only counts the specified codons

By default every aligned base counts toward a target codon's read-out regardless of its Phred quality score. Amino acids encoded by a single codon in the library scheme (e.g. Cys and Trp under NNK) have no synonymous-codon redundancy, so a single low-quality base call at such a codon can be read out as a false non-reference AA. Pass --min-base-qual (default 0, disabled) to require every base of a target codon to meet a minimum Phred score before the read counts toward that site's distribution.
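
For the observed-vs-expected comparison, the expected amino-acid distribution of a degenerate scheme can be computed directly from the genetic code. The sketch below assumes equimolar bases at every degenerate position and the standard genetic code; expected_aa_distribution is a hypothetical name and the profiler's own expectation model may differ.

```python
from collections import Counter
from itertools import product

IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
# Standard genetic code in TCAG order; '*' marks a stop codon.
_BASES = "TCAG"
_AAS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(_BASES, repeat=3), _AAS)}


def expected_aa_distribution(scheme: str) -> dict[str, float]:
    """Expected fraction of each amino acid (and '*') for a degenerate codon."""
    codons = ["".join(c) for c in product(*(IUPAC[b] for b in scheme.upper()))]
    counts = Counter(CODON_TABLE[codon] for codon in codons)
    return {aa: n / len(codons) for aa, n in sorted(counts.items())}


expected = expected_aa_distribution("NNK")
print(expected["C"], expected["W"], expected["L"])
```

Under NNK, Cys and Trp each make up 1/32 of the expected pool while Leu makes up 3/32, which is the kind of baseline the observed distribution can be compared against.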

GUI quick start (optional)

The NiceGUI web frontend wraps the same workflows with an Apple-inspired design and sidebar navigation. Launch it with:

uht-tooling gui

The server binds to http://127.0.0.1:7860 by default. Open that URL in your browser to access the interface.

Navigation

The sidebar organises workflows into two groups:

Primer Design

Nextera XT (/)
SLIM (/slim)
KLD (/kld)
Gibson (/gibson)

Sequencing Analysis

Mutation Caller (/mutation-caller)
UMI Hunter (/umi-hunter)
Profile Inserts (/profile-inserts)
EP Library (/ep-library)
SSM Profiler (/ssm-profiler)

Features

Dark mode toggle — persists across sessions via browser storage.
FASTA paste support — Mutation Caller, UMI Hunter, and EP Library pages accept raw sequence paste in addition to file upload.
Slider controls with live value display — UMI Hunter thresholds, Profile Inserts min-ratio.
Download results as ZIP — output archives mirror the directory structure produced by the CLI.

Legacy Gradio interface

The old Gradio GUI is still available:

pip install "uht-tooling[legacy-gui]"
uht-tooling gui --legacy

Workflow tips

For large FASTQ datasets, the CLI remains the most efficient option (especially for automation or batch processing).

Troubleshooting

Port already bound: the launcher automatically selects the next free port and logs the chosen URL.
Missing dependency: ensure you installed with pip install "uht-tooling[gui]" (or the core package, which already includes NiceGUI).
Stopping the server: press Ctrl+C in the terminal session running uht-tooling gui.

Logging

Every workflow configures logging to the destination output directory. Inspect run.log for command echoes, parameter choices, and any warnings produced during execution. When providing bug reports, include this log file along with input metadata to streamline triage.

Roadmap

Expand CLI coverage to any remaining legacy scripts that are still invoked via make.
Add documentation for automation pipelines and integrate continuous integration tests.

Contributions in the form of bug reports, pull requests, or feature suggestions are welcome. File issues on GitHub with clear reproduction steps and sample data when possible.

Release history

Version	Release date
0.9.2 (this version)	Jun 23, 2026
0.9.1	Jun 19, 2026
0.9.0	Jun 19, 2026
0.8.0	Jun 17, 2026
0.7.0	Jun 15, 2026
0.6.1	Jun 15, 2026
0.6.0	Jun 15, 2026
0.5.10	Jun 8, 2026
0.5.9	Jun 2, 2026
0.5.8	May 21, 2026
0.5.7	May 21, 2026
0.5.6	May 16, 2026
0.5.5	Feb 9, 2026
0.5.4	Feb 9, 2026
0.5.3	Feb 9, 2026
0.5.2	Feb 9, 2026
0.5.1	Feb 9, 2026
0.4.1	Feb 8, 2026
0.4.0	Feb 8, 2026
0.3.4	Feb 8, 2026
0.3.3	Feb 8, 2026
0.3.2	Feb 8, 2026
0.3.1	Feb 8, 2026
0.3.0	Feb 6, 2026
0.2.0	Feb 5, 2026
0.1.9	Feb 5, 2026
0.1.8	Feb 5, 2026
0.1.7	Nov 15, 2025
0.1.6	Nov 11, 2025
0.1.5	Nov 9, 2025
0.1.4	Nov 9, 2025
0.1.3	Nov 9, 2025
0.1.2	Nov 9, 2025

