Simulate DNA cloning reactions.

This project aims to simulate any step involved in cloning DNA constructs, e.g. PCR, restriction digests, Gibson assemblies, Golden Gate assemblies, etc. Some implementation details:

  • DNA is represented with a very high level of detail and generality, so that things like sticky ends, phosphorylated ends, modified nucleotides etc. can be accounted for. For example, something like this:

    # Lets Duplex refer to Strand before Strand is defined.
    from __future__ import annotations

    from typing import Dict, List

    import networkx as nx

    class Duplex:
        watson: Strand
        crick: Strand

    class Strand:
        # Generally the graph will be a simple linked list, but it is possible
        # for nucleotides to branch.
        polymer: nx.Graph

    class Nucleotide:
        # There's a library called `pysmiles` that can construct graphs of
        # atoms from SMILES strings.
        atoms: nx.Graph

        # Node indices in the above graph.
        attachment_points: List[int]

        # e.g. one of A, T, C, G
        symbol: str

    class DegenerateNucleotide:
        mix: Dict[Nucleotide, float]

    def dsdna_from_str(seq, *, phos_5, phos_3, sticky_5, sticky_3):
        # I'll want a lot of keyword arguments to control how the duplex is
        # constructed.  I've listed a few here, but I haven't really thought
        # about the format they'd take.
        pass

    # The remaining constructors take keyword arguments that are still to be
    # decided, as above.
    def ssdna_from_str(seq, **kwargs):
        pass

    def dsrna_from_str(seq, **kwargs):
        pass

    def ssrna_from_str(seq, **kwargs):
        pass

    At the same time, every effort is made to accept simple strings wherever a sequence is required, for convenience.

  • Cloning steps would be implemented as simple functions, for the most part. Maybe in some cases it’d be better to use functors, just to make it easier to pass lots of parameters, but I don’t want to lean into that. Some examples:

    def pcr(template, primer_1, primer_2):
        # Check that primers face each other (accounting for circular
        # templates) and that product is unambiguous.
        return Duplex(...)
    
    def gibson(fragments):
        # - Build graph using ends of each fragment
        # - Make sure that the graph is circular, and uses each fragment
        #   exactly once.
        return Duplex(..., circular=True)
    
    def golden_gate(fragments):
        # Similar to Gibson.
        return Duplex(..., circular=True)
    
    def digest(duplex, enzymes):
        return Duplex(...)
    
    def phosphorylate(strand):
        # If duplex provided, phosphorylate both strands.
        return Strand(...)
    
    def ligate(fragments):
        # - Require phosphorylated ends.
        # - Require a single unique product, by default.
        return Duplex(..., circular=True)
    
    def anneal(oligo_1, oligo_2):
        return Duplex(...)
    
    def transcribe(template):
        # Check for promoter.  Maybe optionally require GGG for T7.
        return Strand(..., rna=True)
    
    def express(template, start_codon=0):
        # - Require RNA template
        # - Third-party functions should be used to predict the start codon
        #   from the transcript, if the user needs that.
        return Strand()

  • Some general-purpose tools that I’d like to include:

    • Reverse complement.

    • Translation.

    • Melting temperature calculation (via Biopython).

    • Sequence alignment, especially for circular sequences.

    • Support for parsing IDT sequence strings.

  • Some general-purpose tools I’m hesitant to include:

    • Reverse translation: Doing this for any real application is a pretty intense optimization problem, e.g. finding a sequence that uses common codons, avoids restriction sites, minimizes internal RBSs/promoters/terminators, isn’t too complex, etc. I think this should be the domain of a dedicated tool.
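
Two of the tools listed above, reverse complement and matching against circular sequences, are small enough to sketch on plain strings. This is only an illustration under simple assumptions (unmodified A/C/G/T DNA, exact same-strand matches). The names `reverse_complement` and `find_circular` are hypothetical and are not part of this package.

```python
from typing import List

# Hypothetical helpers for illustration; not part of the `cloning` package.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unmodified DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_circular(template: str, query: str) -> List[int]:
    """Return every start index where `query` occurs in a circular `template`.

    Matches may wrap around the origin. Only exact matches on the given
    strand are reported.
    """
    if not query or len(query) > len(template):
        return []
    # Append the start of the template so that wrapped matches become visible.
    unrolled = template + template[: len(query) - 1]
    hits = []
    start = unrolled.find(query)
    while start != -1:
        hits.append(start)
        # Step by one so that overlapping matches are also found.
        start = unrolled.find(query, start + 1)
    return hits
```

A PCR check on a circular template could, for instance, call `find_circular(plasmid, primer)` for one primer and `find_circular(plasmid, reverse_complement(primer))` for the other, then require exactly one hit for each.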

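The Gibson outline above (build a graph from the fragment ends, then require a circle that uses each fragment exactly once) can be sketched on plain strings. This is a minimal sketch, not the package's implementation: it assumes all fragments are given as top-strand strings in the same orientation and that homologous ends match exactly over a fixed `overlap` length. The name `gibson_sketch` and the `overlap` parameter are hypothetical.

```python
from typing import Dict, List


def gibson_sketch(fragments: List[str], overlap: int = 20) -> str:
    """Assemble fragments into one circular sequence, or raise ValueError.

    Assumptions of this sketch: same-orientation top-strand strings and
    exact end homology of length `overlap`.
    """
    if overlap < 1 or not fragments:
        raise ValueError("need at least one fragment and a positive overlap")
    if any(len(f) <= overlap for f in fragments):
        raise ValueError("every fragment must be longer than the overlap")

    # Graph edges: i -> j when the 3' end of i matches the 5' end of j.
    successors: Dict[int, List[int]] = {
        i: [j for j, g in enumerate(fragments) if f[-overlap:] == g[:overlap]]
        for i, f in enumerate(fragments)
    }
    if any(len(js) != 1 for js in successors.values()):
        raise ValueError("each fragment needs exactly one downstream partner")

    # Walk the graph from fragment 0. A valid assembly is a single cycle
    # that visits every fragment exactly once and returns to the start.
    order: List[int] = []
    i = 0
    while i not in order:
        order.append(i)
        i = successors[i][0]
    if i != 0 or len(order) != len(fragments):
        raise ValueError("fragments do not form a single circle")

    # Drop each fragment's 3' overlap, since the next fragment supplies it.
    return "".join(fragments[k][:-overlap] for k in order)
```

The returned string is the circular product written from the start of the first fragment. Reverse-oriented fragments, imperfect homology, and ambiguous products would all need handling in a real implementation.
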
Some ideas about names:

  • cloning: Shocking that this is available.

  • biopolymers: Might be a better fit for the actual function, since it makes sense to include transcribe() and express() functions.

  • biopol/biopols: Abbreviations of above.
