Simulate DNA cloning reactions.

The purpose of this project is to provide the ability to simulate any step involved in cloning DNA constructs, e.g. PCR, restriction digests, Gibson assemblies, Golden Gate assemblies, etc. Some implementation details:

  • DNA is represented with a very high level of detail and generality, so that things like sticky ends, phosphorylated ends, modified nucleotides etc. can be accounted for. For example, something like this:

    from __future__ import annotations  # allow `Strand` to be named before it's defined

    from typing import Dict, List

    import networkx as nx

    class Duplex:
        watson: Strand
        crick: Strand

    class Strand:
        # Generally the graph will be a simple linked list, but it is possible
        # for nucleotides to branch.
        polymer: nx.Graph

    class Nucleotide:
        # There's a library called `pysmiles` that can construct graphs of
        # atoms from SMILES strings.
        atoms: nx.Graph

        # Node indices in the above graph.
        attachment_points: List[int]

        # e.g. ATCG
        symbol: str

    class DegenerateNucleotide:
        mix: Dict[Nucleotide, float]

    def dsdna_from_str(seq, *, phos_5, phos_3, sticky_5, sticky_3):
        # I'll want a lot of keyword arguments to control how the duplex is
        # constructed.  I've listed a few here, but I haven't really thought
        # about the format they'd take.
        pass

    # The arguments for the remaining constructors are still to be decided.
    def ssdna_from_str(seq, **kwargs):
        pass

    def dsrna_from_str(seq, **kwargs):
        pass

    def ssrna_from_str(seq, **kwargs):
        pass

    At the same time, every effort is made to accept simple strings wherever a sequence is required, for convenience.
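
As a rough illustration of the two ideas above (a strand stored as a chain of bonded nucleotides, and functions that accept plain strings), here is a minimal standard-library sketch. `ChainStrand`, `strand_from_str()`, `as_strand()`, and `strand_length()` are hypothetical names, and plain dictionaries stand in for the `nx.Graph` the project would actually use.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass
class ChainStrand:
    """Hypothetical stand-in for Strand: nucleotides as nodes of a chain."""
    symbols: Dict[int, str]      # node index -> nucleotide symbol
    bonds: Dict[int, List[int]]  # node index -> indices of bonded neighbors


def strand_from_str(seq: str) -> ChainStrand:
    """Build a linear chain (a simple linked list) from a sequence string."""
    seq = seq.upper()
    symbols = dict(enumerate(seq))
    bonds: Dict[int, List[int]] = {i: [] for i in symbols}
    for i in range(len(seq) - 1):
        # Bond each nucleotide to the next one along the backbone.
        bonds[i].append(i + 1)
        bonds[i + 1].append(i)
    return ChainStrand(symbols, bonds)


def as_strand(seq: Union[str, ChainStrand]) -> ChainStrand:
    """Accept either a plain string or an already-built strand."""
    return seq if isinstance(seq, ChainStrand) else strand_from_str(seq)


def strand_length(seq: Union[str, ChainStrand]) -> int:
    """Example of a function that takes a string wherever a strand is needed."""
    return len(as_strand(seq).symbols)
```

Because `strand_length()` coerces its argument, `strand_length("atcg")` and `strand_length(strand_from_str("atcg"))` both give 4.
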

  • Cloning steps would mostly be implemented as simple functions. In some cases it might be better to use functors (callable objects), just to make it easier to pass lots of parameters, but I don’t want to lean into that. Some examples:

    def pcr(template, primer_1, primer_2):
        # Check that primers face each other (accounting for circular
        # templates) and that product is unambiguous.
        return Duplex(...)
    
    def gibson(fragments):
        # - Build graph using ends of each fragment
        # - Make sure that the graph is circular, and uses each fragment
        #   exactly once.
        return Duplex(..., circular=True)
    
    def golden_gate(fragments):
        # Similar to Gibson.
        return Duplex(..., circular=True)
    
    def digest(duplex, enzymes):
        return Duplex(...)
    
    def phosphorylate(strand):
        # If duplex provided, phosphorylate both strands.
        return Strand(...)
    
    def ligate(fragments):
        # - Require phosphorylated ends.
        # - Require a single unique product, by default.
        return Duplex(..., circular=True)
    
    def anneal(oligo_1, oligo_2):
        return Duplex(...)
    
    def transcribe(template):
        # Check for promoter.  Maybe optionally require GGG for T7.
        return Strand(..., rna=True)
    
    def express(template, start_codon=0):
        # - Require RNA template
        # - Third-party functions should be used to predict the start codon
        #   from the transcript, if the user needs that.
        return Strand()
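
To make the `gibson()` outline above more concrete, here is a heavily simplified sketch that works on plain strings instead of `Duplex` objects. It assumes all fragments are given 5'→3' on the same strand and that homology means an exact suffix/prefix match. `overlap()`, `gibson_order()`, and the `min_len` parameter are hypothetical names, not part of the project.

```python
from typing import List


def overlap(a: str, b: str, min_len: int = 4) -> int:
    """Length of the longest suffix of `a` that is also a prefix of `b`."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def gibson_order(fragments: List[str], min_len: int = 4) -> List[int]:
    """Order the fragments into one cycle that uses each exactly once."""
    if len(fragments) < 2:
        raise ValueError("this sketch needs at least two fragments")
    # Build the graph: edge i -> j when the end of i overlaps the start of j.
    successors = {}
    for i, a in enumerate(fragments):
        hits = [j for j, b in enumerate(fragments)
                if j != i and overlap(a, b, min_len)]
        if len(hits) != 1:
            raise ValueError(f"fragment {i} has {len(hits)} possible successors")
        successors[i] = hits[0]
    # Walk the graph from fragment 0 until we return to it.
    order = [0]
    while successors[order[-1]] != 0:
        nxt = successors[order[-1]]
        if nxt in order:
            raise ValueError("fragments do not form a single cycle")
        order.append(nxt)
    if len(order) != len(fragments):
        raise ValueError("not every fragment is used")
    return order


def gibson(fragments: List[str], min_len: int = 4) -> str:
    """Return the circular product, written linearly from fragment 0."""
    order = gibson_order(fragments, min_len)
    product = ""
    for pos, i in enumerate(order):
        j = order[(pos + 1) % len(order)]
        n = overlap(fragments[i], fragments[j], min_len)
        # Keep each fragment minus the overlap it shares with the next one.
        product += fragments[i][:len(fragments[i]) - n]
    return product


fragments = ["GGATCCAAACTGACG", "CTGACGTTTTTAGCA", "TTAGCACCCGGATCC"]
product = gibson(fragments)
```

Requiring exactly one successor per fragment is one way to enforce the "unambiguous product" idea. A real implementation would also have to consider both strands and fragment orientation.
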

  • Some general-purpose tools that I’d like to include:

    • Reverse complement.

    • Translation.

    • Melting temperature calculation (via Biopython).

    • Sequence alignment, especially for circular sequences.

    • Support for parsing IDT sequence strings.
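
Two of the tools listed above are easy to sketch on plain strings. The block below shows a reverse complement and a rotation-aware equality check for circular sequences, which is a much simpler stand-in for real circular alignment. Both function names are hypothetical, and only unmodified A/C/G/T are handled.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Complement each base, then reverse, preserving case."""
    return seq.translate(_COMPLEMENT)[::-1]


def same_circular_sequence(a: str, b: str) -> bool:
    """True if circular sequences `a` and `b` match up to rotation or strand."""
    if len(a) != len(b):
        return False
    # Every rotation of a circular sequence is a substring of it doubled.
    doubled = (a + a).upper()
    b = b.upper()
    return b in doubled or reverse_complement(b) in doubled
```

For example, `same_circular_sequence("ATGCAA", "CAAATG")` is true because the second string is a rotation of the first, and `same_circular_sequence("ATGCAA", "TTGCAT")` is true because the second string is the reverse complement of the first.
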

  • Some general-purpose tools I’m hesitant to include:

    • Reverse translation: Doing this for any real application is a pretty intense optimization problem, e.g. finding a sequence that uses common codons, avoids restriction sites, minimizes internal RBSs/promoters/terminators, isn’t too complex, etc. I think this should be the domain of a dedicated tool.

Some ideas about names:

  • cloning: Shocking that this is available.

  • biopolymers: Might be a better fit for the actual function, since it makes sense to include transcribe() and express() functions.

  • biopol/biopols: Abbreviations of the above.
