Tooling for ultra-high throughput screening workflows.

uht-tooling

Automation helpers for ultra-high-throughput molecular biology workflows. The package ships both a Typer-based CLI and an optional Gradio GUI that wrap the same workflow code paths.


Installation

Quick install (recommended, easiest file maintenance)

python -m pip install "uht-tooling[gui]"

This installs the core workflows plus the optional GUI dependencies (Gradio, pandas). Omit the [gui] extras if you only need the CLI:

python -m pip install uht-tooling

Development install

git clone https://github.com/Matt115A/uht-tooling.git
cd uht-tooling
python -m pip install -e ".[gui,dev]"

The editable install exposes the latest sources, while the dev extras add linting and test tooling.


Directory layout

  • Reference inputs live under data/<workflow>/.
  • Outputs (CSV, FASTA, plots, logs) are written to results/<workflow>/.
  • All workflows log to results/<workflow>/run.log for reproducibility and debugging.
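
The layout convention above can be captured in a few lines of Python. This is a minimal sketch; workflow_paths is a hypothetical helper name and is not part of the uht-tooling API.

```python
from pathlib import Path


def workflow_paths(workflow: str, root: Path = Path(".")) -> dict[str, Path]:
    """Return the conventional input, output, and log locations for a workflow.

    Hypothetical helper: it only encodes the data/<workflow>/ and
    results/<workflow>/ convention described above.
    """
    results_dir = root / "results" / workflow
    return {
        "inputs": root / "data" / workflow,
        "outputs": results_dir,
        "log": results_dir / "run.log",
    }


paths = workflow_paths("mutation_caller")
print(paths["log"])
```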

Command-line interface

The CLI is exposed as the uht-tooling executable. List the available commands:

uht-tooling --help

Each command mirrors a workflow module. Common entry points:

Command                          Purpose
uht-tooling nextera-primers      Generate Nextera XT primer pairs from a binding-region CSV.
uht-tooling design-slim          Design SLIM mutagenesis primers from FASTA/CSV inputs.
uht-tooling design-gibson        Produce Gibson mutagenesis primers and assembly plans.
uht-tooling mutation-caller      Summarise amino-acid substitutions from long-read FASTQ files.
uht-tooling umi-hunter           Cluster UMIs and call consensus alleles.
uht-tooling ep-library-profile   Measure mutation rates without UMIs.
uht-tooling profile-inserts      Extract inserts defined by probe pairs.

Each command provides detailed help, including option descriptions and expected file formats:

uht-tooling mutation-caller --help

You can pass multiple FASTQ paths using repeated --fastq options or glob patterns. The optional --log-path flag redirects the log if you prefer a location outside the default results directory.
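
When driving the CLI from a script, the repeated --fastq form can be assembled programmatically. The sketch below only builds the argument list and does not run anything. build_mutation_caller_command is a hypothetical helper name, and the option names are the ones used in the mutation-caller example later in this document.

```python
import glob
import shlex


def build_mutation_caller_command(fastq_pattern: str, output_dir: str) -> list[str]:
    """Assemble an argv list with one --fastq option per matching file."""
    command = [
        "uht-tooling", "mutation-caller",
        "--template-fasta", "data/mutation_caller/mutation_caller_template.fasta",
        "--flanks-csv", "data/mutation_caller/mutation_caller.csv",
    ]
    # sorted() keeps the file order stable between runs.
    for fastq in sorted(glob.glob(fastq_pattern)):
        command += ["--fastq", fastq]
    command += ["--output-dir", output_dir]
    return command


command = build_mutation_caller_command(
    "data/mutation_caller/*.fastq.gz", "results/mutation_caller/"
)
print(shlex.join(command))
```

To execute the result, pass the list to subprocess.run(command, check=True).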


Workflow reference

Nextera XT primer design

  1. Prepare data/nextera_designer/nextera_designer.csv with a binding_region column. Row 1 should contain the forward region, row 2 the reverse region, both in 5'→3' orientation.
  2. Optional: supply a YAML overrides file for index lists/prefixes via --config.
  3. Run:
    uht-tooling nextera-primers \
      --binding-csv data/nextera_designer/nextera_designer.csv \
      --output-csv results/nextera_designer/nextera_xt_primers.csv
    
  4. Primer CSVs will be written to results/nextera_designer/, accompanied by a log file.
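
The input file from step 1 can also be written from Python. This is a minimal sketch, assuming the CSV needs nothing beyond the binding_region header and the two rows described above; write_binding_csv is a hypothetical helper name.

```python
import csv
from pathlib import Path


def write_binding_csv(path: str, forward_region: str, reverse_region: str) -> Path:
    """Write a two-row binding-region CSV: forward first, then reverse."""
    csv_path = Path(path)
    csv_path.parent.mkdir(parents=True, exist_ok=True)
    with csv_path.open("w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["binding_region"])  # required column header
        writer.writerow([forward_region])    # row 1: forward region, 5'->3'
        writer.writerow([reverse_region])    # row 2: reverse region, 5'->3'
    return csv_path
```

Call it with your own sequences, for example write_binding_csv("data/nextera_designer/nextera_designer.csv", forward, reverse).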

The helper is preloaded with twelve i5 and twelve i7 indices, enabling up to 144 unique amplicons. Downstream lab workflow suggestions (qPCR monitoring, SPRIselect cleanup) remain unchanged from earlier releases.
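
The figure of 144 follows from pairing every i5 index with every i7 index (12 × 12). A short illustration, using hypothetical index labels rather than the names or sequences shipped with the package:

```python
from itertools import product

# Hypothetical labels; the real index names and sequences ship with the package.
i5_indices = [f"i5_{n:02d}" for n in range(1, 13)]
i7_indices = [f"i7_{n:02d}" for n in range(1, 13)]

# Every i5/i7 combination identifies one amplicon.
pairs = list(product(i5_indices, i7_indices))
print(len(pairs))  # 144
```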

Wet-lab workflow notes

  • Perform the initial amplification with an i5/i7 primer pair and monitor a small aliquot by qPCR. Cap thermocycling early so that you generate only ~10% of the theoretical yield; this minimizes amplification bias.
  • Purify the product with SPRIselect beads at approximately a 0.65:1 bead:DNA volume ratio to remove residual primers and short fragments.
  • Confirm primer removal using electrophoresis (e.g., BioAnalyzer DNA chip) before moving to sequencing prep.
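
The bead volume for the cleanup step is simple arithmetic. The sketch below assumes the 0.65:1 ratio is applied to the sample volume; the 50 µL input is an illustrative value, not a recommendation.

```python
def spri_bead_volume(sample_volume_ul: float, ratio: float = 0.65) -> float:
    """Bead volume (µL) for a given sample volume at a bead:DNA volume ratio."""
    return sample_volume_ul * ratio


print(f"{spri_bead_volume(50.0):.1f} µL beads")  # 32.5 µL beads
```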

SLIM primer design

  • Inputs:
    • data/design_slim/slim_template_gene.fasta
    • data/design_slim/slim_context.fasta
    • data/design_slim/slim_target_mutations.csv (single mutations column)
  • Run:
    uht-tooling design-slim \
      --gene-fasta data/design_slim/slim_template_gene.fasta \
      --context-fasta data/design_slim/slim_context.fasta \
      --mutations-csv data/design_slim/slim_target_mutations.csv \
      --output-dir results/design_slim/
    
  • Output: results/design_slim/SLIM_primers.csv plus logs.

Mutation nomenclature examples:

  • A123G (substitution)
  • T241Del (deletion)
  • T241TS (insert Ser after Thr241)
  • L46GP (replace Leu46 with Gly-Pro)
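
The four forms above can be told apart with a single regular expression. This is a minimal sketch inferred from the examples alone; the package's own parser may accept more forms or behave differently. parse_mutation and Mutation are hypothetical names.

```python
import re
from typing import NamedTuple


class Mutation(NamedTuple):
    kind: str          # "substitution", "deletion", "insertion", or "replacement"
    original: str      # wild-type residue, one-letter code
    position: int      # 1-based residue number
    new_residues: str  # residues present after the edit ("" for a deletion)


# "Del" is tried before the residue run so that it is never read as residues.
_PATTERN = re.compile(r"([A-Z])(\d+)(Del|[A-Z]+)")


def parse_mutation(text: str) -> Mutation:
    match = _PATTERN.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"Unrecognised mutation string: {text!r}")
    original, position, change = match.group(1), int(match.group(2)), match.group(3)
    if change == "Del":
        return Mutation("deletion", original, position, "")
    if len(change) == 1:
        return Mutation("substitution", original, position, change)
    if change[0] == original:
        # e.g. T241TS: the original residue is kept and Ser follows it.
        return Mutation("insertion", original, position, change)
    # e.g. L46GP: the original residue is replaced by several residues.
    return Mutation("replacement", original, position, change)


for example in ("A123G", "T241Del", "T241TS", "L46GP"):
    print(example, "->", parse_mutation(example).kind)
```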

Experimental blueprint

  • Hands-on time is approximately three hours (excluding protein purification), with mutant protein obtainable in roughly three days.
  • Conduct two PCRs per mutant set: (A) long forward with short reverse and (B) long reverse with short forward.
  • Combine 10 µL from each PCR with 10 µL H-buffer (150 mM Tris pH 8, 400 mM NaCl, 60 mM EDTA) for a 30 µL annealing reaction: 99 °C for 3 min, then two cycles of 65 °C for 5 min followed by 30 °C for 15 min, hold at 4 °C.
  • Transform directly into NEB 5-alpha or BL21 (DE3) cells without additional cleanup. The protocol has been validated for simultaneous introduction of dozens of mutations.
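
As a check on the annealing mix, the final H-buffer component concentrations follow from the 10 µL in 30 µL dilution. The calculation below ignores whatever the PCR buffers themselves contribute.

```python
# H-buffer stock concentrations in mM, as listed above.
h_buffer_mm = {"Tris": 150.0, "NaCl": 400.0, "EDTA": 60.0}
h_buffer_volume_ul = 10.0
total_volume_ul = 30.0

# Final concentration = stock concentration x (buffer volume / total volume).
final_mm = {
    name: conc * h_buffer_volume_ul / total_volume_ul
    for name, conc in h_buffer_mm.items()
}

for name, conc in final_mm.items():
    print(f"{name}: {conc:.1f} mM")
# Tris: 50.0 mM
# NaCl: 133.3 mM
# EDTA: 20.0 mM
```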

Gibson assembly primers

  • Inputs mirror the SLIM workflow but use data/design_gibson/.
  • Link sub-mutations with + to specify multi-mutation assemblies (e.g., A123G+T150A).
  • Run:
    uht-tooling design-gibson \
      --gene-fasta data/design_gibson/gibson_template_gene.fasta \
      --context-fasta data/design_gibson/gibson_context.fasta \
      --mutations-csv data/design_gibson/gibson_target_mutations.csv \
      --output-dir results/design_gibson/
    
  • Outputs include primer sets and an assembly-plan CSV.

If mutations fall within overlapping primer windows, design sequential reactions to avoid excessive primer reuse.
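
A quick way to spot candidates for sequential reactions is to split a + entry and flag neighbouring mutations that sit close together. This is a rough sketch: split_assembly and close_pairs are hypothetical helpers, and window_residues is a placeholder threshold because the actual primer window size is not stated here.

```python
import re


def split_assembly(entry: str) -> list[tuple[str, int]]:
    """Split 'A123G+T150A' into (mutation, position) pairs sorted by position."""
    mutations = []
    for part in entry.split("+"):
        part = part.strip()
        match = re.match(r"[A-Z](\d+)", part)
        if match is None:
            raise ValueError(f"Cannot read a residue position from {part!r}")
        mutations.append((part, int(match.group(1))))
    return sorted(mutations, key=lambda item: item[1])


def close_pairs(entry: str, window_residues: int = 10) -> list[tuple[str, str]]:
    """Return adjacent mutations closer than window_residues (assumed value)."""
    ordered = split_assembly(entry)
    return [
        (first, second)
        for (first, pos_a), (second, pos_b) in zip(ordered, ordered[1:])
        if pos_b - pos_a < window_residues
    ]


print(split_assembly("A123G+T150A"))
print(close_pairs("A123G+T125A+L200P"))
```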

Mutation caller (no UMIs)

  1. Supply:
    • data/mutation_caller/mutation_caller_template.fasta
    • data/mutation_caller/mutation_caller.csv with gene_flanks and gene_min_max columns (two rows each).
    • One or more FASTQ files via --fastq.
  2. Run:
    uht-tooling mutation-caller \
      --template-fasta data/mutation_caller/mutation_caller_template.fasta \
      --flanks-csv data/mutation_caller/mutation_caller.csv \
      --fastq data/mutation_caller/*.fastq.gz \
      --output-dir results/mutation_caller/ \
      --threshold 10
    
  3. Outputs: per-sample subdirectories with substitution summaries, co-occurrence matrices, and logs.
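
Before starting a long run, it can help to confirm that the configuration CSV has the shape described in step 1. The sketch below checks only the column names and the row count, since the expected cell contents are not specified here; check_flanks_csv is a hypothetical helper name.

```python
import csv

REQUIRED_COLUMNS = ("gene_flanks", "gene_min_max")


def check_flanks_csv(path: str) -> list[dict[str, str]]:
    """Return the two data rows, or raise ValueError if the shape is wrong."""
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle)
        header = reader.fieldnames or []
        missing = [column for column in REQUIRED_COLUMNS if column not in header]
        if missing:
            raise ValueError("Missing column(s): " + ", ".join(missing))
        rows = list(reader)
    if len(rows) != 2:
        raise ValueError(f"Expected 2 data rows, found {len(rows)}")
    return rows
```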

UMI Hunter

  • Inputs: data/umi_hunter/template.fasta, data/umi_hunter/umi_hunter.csv, and FASTQ reads.
  • Command:
    uht-tooling umi-hunter \
      --template-fasta data/umi_hunter/template.fasta \
      --config-csv data/umi_hunter/umi_hunter.csv \
      --fastq data/umi_hunter/*.fastq.gz \
      --output-dir results/umi_hunter/
    
  • Tunable parameters include --umi-identity-threshold and --consensus-mutation-threshold.
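
To give a feel for what an identity threshold does, here is a toy greedy clustering of equal-length UMIs by per-position identity. It is an illustration only and is not the algorithm used by umi-hunter; the 0.9 threshold and the example sequences are made up.

```python
def identity(a: str, b: str) -> float:
    """Fraction of matching positions; sequences of unequal length score 0."""
    if len(a) != len(b) or not a:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cluster_umis(umis: list[str], threshold: float = 0.9) -> list[list[str]]:
    """Greedy clustering: join the first cluster whose seed is similar enough."""
    clusters: list[list[str]] = []
    for umi in umis:
        for cluster in clusters:
            if identity(umi, cluster[0]) >= threshold:
                cluster.append(umi)
                break
        else:
            # No existing cluster was close enough, so this UMI seeds a new one.
            clusters.append([umi])
    return clusters


umis = ["ACGTACGTAC", "ACGTACGTAA", "TTTTGGGGCC", "ACGTACGTAC"]
print(cluster_umis(umis, threshold=0.9))
```

Raising the threshold splits near-identical UMIs into separate clusters; lowering it merges them.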

Profile inserts

  • Prepare data/profile_inserts/sample_probes.csv with upstream and downstream columns.
  • Run:
    uht-tooling profile-inserts \
      --probes-csv data/profile_inserts/sample_probes.csv \
      --fastq data/profile_inserts/*.fastq.gz \
      --output-dir results/profile_inserts/
    
  • Outputs: extracted insert FASTA files, QC plots, metrics, and logs. Adjust fuzzy matching strictness via --min-ratio.
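
Conceptually, probe-based extraction locates the upstream and downstream probes in each read and keeps what lies between them. The sketch below uses difflib.SequenceMatcher from the standard library as a stand-in for the fuzzy matcher; the matcher the package actually uses, and the exact meaning of --min-ratio, may differ. find_probe and extract_insert are hypothetical names.

```python
from difflib import SequenceMatcher
from typing import Optional


def find_probe(read: str, probe: str, min_ratio: float, start: int = 0) -> Optional[int]:
    """Return the start index of the best-matching window, or None if too weak."""
    best_index, best_ratio = None, 0.0
    for i in range(start, len(read) - len(probe) + 1):
        window = read[i:i + len(probe)]
        ratio = SequenceMatcher(None, window, probe, autojunk=False).ratio()
        if ratio > best_ratio:
            best_index, best_ratio = i, ratio
    return best_index if best_ratio >= min_ratio else None


def extract_insert(read: str, upstream: str, downstream: str,
                   min_ratio: float = 0.8) -> Optional[str]:
    """Return the sequence between the two probes, or None if either is missing."""
    up = find_probe(read, upstream, min_ratio)
    if up is None:
        return None
    insert_start = up + len(upstream)
    # Search for the downstream probe only after the upstream match.
    down = find_probe(read, downstream, min_ratio, start=insert_start)
    if down is None:
        return None
    return read[insert_start:down]


read = "AAAA" + "ACGTAC" + "GGGTTTCCC" + "TGCATG" + "AAAA"
print(extract_insert(read, "ACGTAC", "TGCATG"))  # GGGTTTCCC
```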

EP library profiler (no UMIs)

  • Inputs:
    • data/ep-library-profile/region_of_interest.fasta
    • data/ep-library-profile/plasmid.fasta
    • FASTQ inputs (--fastq accepts multiple files)
  • Run:
    uht-tooling ep-library-profile \
      --region-fasta data/ep-library-profile/region_of_interest.fasta \
      --plasmid-fasta data/ep-library-profile/plasmid.fasta \
      --fastq data/ep-library-profile/*.fastq.gz \
      --output-dir results/ep-library-profile/
    
  • Output bundle includes per-sample directories and a master summary TSV.

GUI quick start (optional)

The Gradio GUI wraps the same workflows with upload widgets and result previews. Launch it directly:

python -m uht_tooling.workflows.gui

Key points:

  • The server binds to http://127.0.0.1:7860 by default and falls back to an available port if 7860 is busy. Open the URL printed in the terminal (http://127.0.0.1:7860 unless a fallback port was chosen) in your browser.
  • Temporary working directories are created under the system temp folder and cleaned automatically.
  • Output archives (ZIP files) mirror the directory structure produced by the CLI.
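
Port fallback of this kind is commonly implemented by probing ports in sequence until one can be bound. The sketch below shows that general technique with the standard socket module; it is not the launcher's actual code, and find_free_port and max_tries are hypothetical.

```python
import socket


def find_free_port(start: int = 7860, host: str = "127.0.0.1", max_tries: int = 50) -> int:
    """Return the first port at or above start that can be bound on host."""
    for port in range(start, start + max_tries):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
            try:
                probe.bind((host, port))
            except OSError:
                continue  # port is taken; try the next one
            return port
    raise RuntimeError(f"No free port found in {start}-{start + max_tries - 1}")


print(f"http://127.0.0.1:{find_free_port()}")
```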

Tabs and capabilities

  1. Nextera XT – forward/reverse primer inputs with CSV preview.
  2. SLIM – template/context FASTA text areas plus mutation list.
  3. Gibson – multi-mutation support using + syntax.
  4. Mutation Caller – upload FASTQ, template FASTA, and configuration CSV.
  5. UMI Hunter – long-read UMI clustering with configurable thresholds.
  6. Profile Inserts – probe CSV and multiple FASTQ uploads.
  7. EP Library Profile – FASTQ uploads plus plasmid and region FASTA inputs.

Workflow tips

  • For large FASTQ datasets, the CLI remains the most efficient option (especially for automation or batch processing).
  • Pass the --share flag to python -m uht_tooling.workflows.gui if you need to expose the GUI outside localhost.

Troubleshooting

  • Port already bound: the launcher automatically selects the next free port and logs the chosen URL.
  • Missing dependency: ensure you installed with pip install "uht-tooling[gui]".
  • Stopping the server: press Ctrl+C in the terminal session running the GUI.

Logging

Every workflow configures logging to the destination output directory. Inspect run.log for command echoes, parameter choices, and any warnings produced during execution. When providing bug reports, include this log file along with input metadata to streamline triage.
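
If you wrap the workflows in your own scripts and want a matching log, the standard logging module can write a run.log into the output directory. This is a minimal sketch of that pattern, not the package's internal logging setup; configure_run_log, the logger name, and the message format are assumptions.

```python
import logging
from pathlib import Path


def configure_run_log(output_dir: str, name: str = "uht_example") -> logging.Logger:
    """Attach a file handler writing to <output_dir>/run.log and return the logger."""
    log_dir = Path(output_dir)
    log_dir.mkdir(parents=True, exist_ok=True)
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    handler = logging.FileHandler(log_dir / "run.log")
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger
```

A typical call is logger = configure_run_log("results/mutation_caller") followed by logger.info("threshold=%s", 10), so that parameter choices end up next to the outputs.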


Roadmap

  • Replace deprecated Biopython command-line wrappers with native subprocess implementations.
  • Expand CLI coverage to any remaining legacy scripts that are still invoked via make.
  • Add documentation for automation pipelines and integrate continuous integration tests.

Contributions in the form of bug reports, pull requests, or feature suggestions are welcome. File issues on GitHub with clear reproduction steps and sample data when possible.
