primerForge

software to identify primers that can be used to distinguish genomes

Installation

pip installation

pip install primerforge

conda installation

conda install -c conda-forge -c bioconda primerforge

Manual installation

Note: This might take up to ten minutes.

git clone https://github.com/dr-joe-wirth/primerForge.git
conda env create -f primerForge/environment.yml
conda activate primerforge

Docker installation

A Docker image for the latest release is available on Docker Hub.

Checking installation

If primerForge is installed correctly, then the following command should execute without errors:

primerForge --check_install

If you installed manually, you may need to use the following command instead:

python primerForge.py --check_install

Running unit tests

To run the unit tests, first install primerForge using the instructions above. You will also need to clone the repository if you haven't already:

git clone https://github.com/dr-joe-wirth/primerForge.git

Once primerForge is installed and the repository is cloned, run the unit tests with the following commands:

Note: Running results_test.py may take up to three hours to complete.

python3 primerForge/bin/unit_tests/clock_test.py
python3 primerForge/bin/unit_tests/parameters_test.py
python3 primerForge/bin/unit_tests/primer_test.py
python3 primerForge/bin/unit_tests/results_test.py

Usage

usage:
    primerForge [-ioaubfpgtrdnkvh]

required arguments:
    -i, --ingroup        [file] ingroup filename or a file pattern inside double-quotes (e.g. "*.gbff")

optional arguments: 
    -o, --out            [file] output filename for primer pair data (default: results.tsv)
    -a, --analysis       [file] output basename for primer analysis data (default: distribution)
    -u, --outgroup       [file(s)] outgroup filename or a file pattern inside double-quotes (e.g. "*.gbff")
    -b, --bad_sizes      [int,int] a range of PCR product lengths that the outgroup cannot produce (default: same as '--pcr_prod')
    -f, --format         [str] file format of the ingroup and outgroup genbank|fasta (default: genbank)
    -p, --primer_len     [int(s)] a single primer length or a range specified as 'min,max' (default: 16,20)
    -g, --gc_range       [float,float] a min and max percent GC specified as a comma separated list (default: 40.0,60.0)
    -t, --tm_range       [float,float] a min and max melting temp (Tm) specified as a comma separated list (default: 55.0,68.0)
    -r, --pcr_prod       [int(s)] a single PCR product length or a range specified as 'min,max' (default: 120,2400)
    -d, --tm_diff        [float] the maximum allowable Tm difference between a pair of primers (default: 5.0)
    -n, --num_threads    [int] the number of threads for parallel processing (default: 1)
    -k, --keep           keep intermediate files (default: False)
    -v, --version        print the version
    -h, --help           print this message
    --check_install      check installation
    --debug              run in debug mode (default: False)
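
Several options above (for example --primer_len and --pcr_prod) accept either a single integer or a range written as 'min,max'. The Python sketch below shows one way such a value could be interpreted. The helper name parse_int_range is hypothetical and is not primerForge's own parser. Rejecting a range whose minimum exceeds its maximum is a choice made for this sketch, not documented behavior.

```python
def parse_int_range(value: str) -> tuple[int, int]:
    """Interpret 'N' as (N, N) and 'min,max' as (min, max).

    Hypothetical helper for illustration; not part of primerForge.
    """
    parts = [int(p) for p in value.split(",")]
    if len(parts) == 1:
        # a single value is treated as a range of width zero
        return parts[0], parts[0]
    if len(parts) == 2:
        low, high = parts
        if low > high:
            # assumption of this sketch: an inverted range is an error
            raise ValueError(f"min exceeds max in {value!r}")
        return low, high
    raise ValueError(f"expected 'N' or 'min,max', got {value!r}")


print(parse_int_range("16,20"))     # the default for --primer_len
print(parse_int_range("120,2400"))  # the default for --pcr_prod
print(parse_int_range("18"))        # a single primer length
```

With this reading, a single length such as 18 behaves the same as the range 18,18.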

Workflow

flowchart TB
    ingroup[/"ingroup genomes"/]
    ingroup --> A

    %% get unique kmers
    subgraph A["for each genome"]
        uniqKmer["get unique kmers"]
    end

    %% get shared kmers
    sharedKmers(["shared kmers"])
    uniqKmer -- intersection --> sharedKmers

    %% get candidate kmers
    subgraph B["for each genome"]
        subgraph B0["for each kmer start position"]
            subgraph B1["pick one kmer"]
                GC{"GC in
                 range?"}
                Tm{"Tm in
                range?"}
                homo{"repeats
                ≤ 3bp?"}
                hair{"no hairpins?"}
                dime{"no homo-
                dimers?"}
                GC-->Tm-->homo-->hair-->dime
            end
        end
    end

    %% connections up to candidate kmers
    sharedKmers --> B
    dump1[/"dump to file"/]
    sharedKmers --> dump1
    candidates(["unique, shared kmers; one per start position"])
    dime --> candidates

    %% get primer pairs
    subgraph C["for one genome"]
        bin1["bin overlapping kmers (64bp max)"]
        bin2["remove kmers that are
        substrings of other kmers"]
        bin3["get bin pairs"]

        %% evaluate one kmer pair
        subgraph C0["for each bin pair"]
            size{"is PCR
            size ok?"}
            subgraph C1["for each primer pair"]
                prime{"is 3' end
                G or C?"}
                temp{"is Tm
                difference ok?"}
                hetero{"no hetero-
                dimers?"}
            end
            size --> C1
        end
    end

    candPair(["candidate primer pairs"])
    allSharePair(["all shared primer pairs"])

    %% get shared primer pairs
    subgraph D["for each candidate primer pair"]
        subgraph D0["for each other genome"]
            pcr{"is PCR
            size ok?"}
        end
    end

    bin1 --> bin2
    bin2 --> bin3
    bin3 --> C0
    prime --> temp --> hetero --> candPair
    candPair --> D
    pcr --> allSharePair

    %% one pair per bin pair
    subgraph E["for each bin pair"]
        keep["keep only one primer pair"]
    end
    
    selectedSharePair(["selected shared primer pairs"])
    dump2[/"dump to file"/]
    dump3[/"dump to file"/]

    candidates --> dump2
    candidates --> C
    allSharePair --> E
    keep --> selectedSharePair
    selectedSharePair --> dump3

    %% outgroup removal
    outgroup[/"outgroup genomes"/]

    subgraph F["for each outgroup genome"]
        subgraph F0["for each primer pair"]
            ogsize{"PCR size outside
            disallowed range?"}
        end
    end

    selectedSharePair --> F0
    outgroup --> F
    
    final(["final set of primer pairs"])
    ogsize --> final

    write[/"write pairs to file"/]

    final --> write
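
The first stages of the flowchart (unique kmers per genome, their intersection across the ingroup, then the GC and repeat filters) can be sketched in Python as follows. This is a simplified illustration rather than primerForge's implementation. All function names are hypothetical, only the forward strand is considered, "repeats ≤ 3bp" is assumed to mean no homopolymer run longer than three bases, and the Tm, hairpin, and homodimer checks are left out.

```python
import re
from collections import Counter


def unique_kmers(seq: str, k: int) -> set[str]:
    """Return kmers that occur exactly once in seq (forward strand only)."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return {kmer for kmer, n in counts.items() if n == 1}


def shared_kmers(genomes: list[str], k: int) -> set[str]:
    """Intersect the unique kmers of every ingroup genome."""
    return set.intersection(*(unique_kmers(g, k) for g in genomes))


def gc_in_range(kmer: str, low: float = 40.0, high: float = 60.0) -> bool:
    """Check percent GC against the default --gc_range bounds."""
    gc = 100.0 * sum(base in "GC" for base in kmer) / len(kmer)
    return low <= gc <= high


def repeats_ok(kmer: str, max_run: int = 3) -> bool:
    """Assumed meaning of 'repeats <= 3bp': no homopolymer run above max_run."""
    return re.search(r"(.)\1{%d,}" % max_run, kmer) is None


genomes = ["ACGTTGCAAGCT", "TTACGTTGCAGG"]  # toy sequences, not real data
candidates = {
    kmer
    for kmer in shared_kmers(genomes, 6)
    if gc_in_range(kmer) and repeats_ok(kmer)
}
print(sorted(candidates))  # ['ACGTTG', 'GTTGCA']
```

In the toy input, three 6-mers are unique and shared by both sequences. CGTTGC is dropped because its GC content (about 66.7%) falls outside the 40.0 to 60.0 range.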
