primerForge
software to identify primers that can be used to distinguish genomes
Installation
pip installation
pip install primerforge
conda installation
conda install -c bioconda -c conda-forge primerforge
Manual installation
Note: creating the conda environment might take up to ten minutes.
git clone https://github.com/dr-joe-wirth/primerForge.git
conda env create -f primerForge/environment.yml
conda activate primerforge
Docker installation
A Docker image for the latest release is available on Docker Hub.
Checking installation
If primerForge is installed correctly, then the following command should execute without errors:
primerForge --check_install
If you installed manually, you may need to run the following command from inside the cloned primerForge directory instead:
python primerForge.py --check_install
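For scripted setups (for example, a CI job), the same check can be wrapped in Python. This is a minimal sketch, not part of primerForge: `installation_ok` is a hypothetical helper, and it assumes that a successful check exits with status 0.

```python
import shutil
import subprocess


def installation_ok(command=("primerForge", "--check_install")) -> bool:
    """Return True if the check command is found and exits with status 0.

    Assumption: a passing installation check returns exit code 0.
    """
    # Bail out early if the executable is not on PATH.
    if shutil.which(command[0]) is None:
        return False
    result = subprocess.run(list(command), capture_output=True, text=True)
    return result.returncode == 0


if __name__ == "__main__":
    print("primerForge installed:", installation_ok())
```

For a manual installation, passing `("python", "primerForge.py", "--check_install")` as `command` from inside the cloned directory should behave the same way.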
Usage
usage:
primerForge [-ioaubfpgtrdnkvh]
required arguments:
-i, --ingroup [file] ingroup filename or a file pattern inside double-quotes (eg."*.gbff")
optional arguments:
-o, --out [file] output filename for primer pair data (default: results.tsv)
-a, --analysis [file] output basename for primer analysis data (default: distribution)
-u, --outgroup [file(s)] outgroup filename or a file pattern inside double-quotes (eg."*.gbff")
-b, --bad_sizes [int,int] a range of PCR product lengths that the outgroup cannot produce (default: same as '--pcr_prod')
-f, --format [str] file format of the ingroup and outgroup genbank|fasta (default: genbank)
-p, --primer_len [int(s)] a single primer length or a range specified as 'min,max' (default: 16,20)
-g, --gc_range [float,float] a min and max percent GC specified as a comma separated list (default: 40.0,60.0)
-t, --tm_range [float,float] a min and max melting temp (Tm) specified as a comma separated list (default: 55.0,68.0)
-r, --pcr_prod [int(s)] a single PCR product length or a range specified as 'min,max' (default: 120,2400)
-d, --tm_diff [float] the maximum allowable Tm difference between a pair of primers (default: 5.0)
-n, --num_threads [int] the number of threads for parallel processing (default: 1)
-k, --keep keep intermediate files (default: False)
-v, --version print the version
-h, --help print this message
--check_install check installation
--debug run in debug mode (default: False)
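Several options above (`--primer_len`, `--pcr_prod`) accept either a single integer or a `min,max` pair. The sketch below illustrates that convention only; it is not primerForge's actual argument handling, and `parse_int_range` is a hypothetical helper. Reordering a reversed pair is a choice made for this example.

```python
def parse_int_range(value: str) -> tuple[int, int]:
    """Parse '20' or '16,20' into an inclusive (min, max) tuple."""
    parts = [int(p) for p in value.split(",")]
    if len(parts) == 1:
        # A single value means min and max are the same.
        return parts[0], parts[0]
    if len(parts) == 2:
        # Assumption of this sketch: tolerate 'max,min' by sorting.
        low, high = sorted(parts)
        return low, high
    raise ValueError(f"expected 'n' or 'min,max', got {value!r}")


print(parse_int_range("16,20"))     # (16, 20)
print(parse_int_range("20"))        # (20, 20)
```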
Workflow
flowchart TB
ingroup[/"ingroup genomes"/]
ingroup --> A
%% get unique kmers
subgraph A["for each genome"]
uniqKmer["get unique kmers"]
end
%% get shared kmers
sharedKmers(["shared kmers"])
uniqKmer -- intersection --> sharedKmers
%% get candidate kmers
subgraph B["for each genome"]
subgraph B0["for each kmer start position"]
subgraph B1["pick one kmer"]
GC{"GC in
range?"}
Tm{"Tm in
range?"}
homo{"repeats
≤ 3bp?"}
hair{"no hairpins?"}
dime{"no homo-
dimers?"}
GC-->Tm-->homo-->hair-->dime
end
end
end
%% connections up to candidate kmers
sharedKmers --> B
dump1[/"dump to file"/]
sharedKmers --> dump1
candidates(["unique, shared kmers; one per start position"])
dime --> candidates
%% get primer pairs
subgraph C["for one genome"]
bin1["bin overlapping kmers (64bp max)"]
bin2["get bin pairs"]
candPair(["candidate primer pairs"])
sharePair(["shared primer pairs"])
%% evaluate one kmer pair
subgraph C0["for each bin pair"]
size{"is PCR
size ok?"}
subgraph C4["for each primer pair"]
prime{"is 3' end
G or C?"}
temp{"is Tm
difference ok?"}
hetero{"no hetero-
dimers?"}
end
size --> C4
end
%% get shared primer pairs
subgraph C2["for each candidate primer pair"]
subgraph C3["for each other genome"]
pcr{"PCR size ok?"}
end
end
bin1 --> bin2
bin2 --> C0
prime --> temp --> hetero --> candPair
candPair --> C2
pcr --> sharePair
end
allSharePair(["all shared primer pairs"])
dump2[/"dump to file"/]
dump3[/"dump to file"/]
candidates --> dump2
candidates --> C
sharePair --> allSharePair
allSharePair --> dump3
%% outgroup removal
outgroup[/"outgroup genomes"/]
allSharePair --> D0
outgroup --> D
subgraph D["for each outgroup genome"]
subgraph D0["for each primer pair"]
ogsize{"PCR size outside
disallowed range?"}
end
end
allPairs(["all suitable primer pairs"])
ogsize --> allPairs
%% one pair per bin pair
subgraph E["for each bin pair"]
keep["keep only one primer
pair per bin pair"]
end
allPairs --> E
final(["final set of pairs"])
keep --> final
write[/"write pairs to file"/]
plots[/"make plots"/]
final --> write
final --> plots
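The first stages of the flowchart (unique kmers per genome, their intersection, and the sequence-only filters) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not primerForge's implementation: all function names are hypothetical, only the forward strand is considered, "repeats ≤ 3bp" is interpreted as "no single-base run longer than 3", and the Tm, hairpin, and dimer checks are omitted because they require a thermodynamic model.

```python
from collections import Counter
from functools import reduce


def unique_kmers(seq: str, k: int) -> set[str]:
    """Return the k-mers that occur exactly once in seq (forward strand only)."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return {kmer for kmer, n in counts.items() if n == 1}


def shared_kmers(genomes: list[str], k: int) -> set[str]:
    """Intersect the unique k-mers of every ingroup genome."""
    if not genomes:
        return set()
    return reduce(set.intersection, (unique_kmers(g, k) for g in genomes))


def gc_percent(kmer: str) -> float:
    """Percentage of G and C bases in the k-mer."""
    return 100.0 * sum(base in "GC" for base in kmer) / len(kmer)


def longest_run(kmer: str) -> int:
    """Length of the longest run of one repeated base."""
    longest = current = 1
    for prev, base in zip(kmer, kmer[1:]):
        current = current + 1 if base == prev else 1
        longest = max(longest, current)
    return longest


def passes_basic_filters(kmer: str, gc_range=(40.0, 60.0), max_run=3) -> bool:
    """GC and repeat checks only; Tm, hairpins, and dimers are not modeled."""
    low, high = gc_range
    return low <= gc_percent(kmer) <= high and longest_run(kmer) <= max_run


# Toy ingroup of two short sequences (made-up data, for illustration only).
genomes = ["ACGTGCATGGA", "TTACGTGCATC"]
shared = shared_kmers(genomes, k=6)
candidates = sorted(kmer for kmer in shared if passes_basic_filters(kmer))
print(sorted(shared))   # ['ACGTGC', 'CGTGCA', 'GTGCAT']
print(candidates)       # ['GTGCAT']
```

In this toy input, two of the three shared kmers have 66.7% GC and fall outside the default 40.0 to 60.0 range, so only one candidate remains.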