
A Multi-Objective algorithm for DNA Design and Assembly


MOODA: Multi-Objective Optimization for DNA design and assembly

Current version: 0.9.5


MOODA is a multi-objective optimisation algorithm for DNA sequence design and assembly.

It takes as input an annotated sequence in GenBank format and optimizes it with respect to user-specified objectives.

Currently, some of the most common operations in synthetic biology are implemented:

  • The GC content operator reduces the difference between the GC content of a sequence and the GC content set as the target. It introduces silent mutations inside CDSs to increase or decrease the GC content.

  • The Codon usage operator allows the recoding of CDSs according to the specified codon distribution. At each iteration, a specified number of codons is replaced by synonymous codons.

  • The BlockJoin and BlockSplit operators allow the division of the sequence into blocks, given a minimum and maximum size. After the optimisation, each block is then adapted to the selected assembly method. Currently, only Gibson assembly is supported.
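The idea behind the GC content operator can be illustrated with a minimal Python sketch. This is not MOODA's implementation: the function names, the greedy strategy and the two-amino-acid synonym table (leucine and alanine from the standard genetic code) are assumptions chosen only to show how silent mutations can move the GC content toward a target.

```python
# Illustrative only: a tiny synonym table, not a full codon table.
LEU = ("TTA", "TTG", "CTT", "CTC", "CTA", "CTG")  # leucine codons
ALA = ("GCT", "GCC", "GCA", "GCG")                # alanine codons
SYNONYMS = {codon: group for group in (LEU, ALA) for codon in group}


def gc_content(seq):
    """Fraction of G and C bases in the sequence (0.0 for an empty one)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def gc_distance(seq, target):
    """Absolute difference between the GC content and the target."""
    return abs(gc_content(seq) - target)


def shift_gc(cds, target):
    """Greedy pass (hypothetical strategy): for each codon, keep the
    synonymous codon that brings the whole CDS closest to the target.
    Codons missing from the small table above are left unchanged."""
    cds = cds.upper()
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    for i, codon in enumerate(codons):
        options = SYNONYMS.get(codon, (codon,))
        codons[i] = min(
            options,
            key=lambda alt: gc_distance(
                "".join(codons[:i] + [alt] + codons[i + 1:]), target
            ),
        )
    return "".join(codons)
```

Because every candidate is synonymous with the codon it replaces, the encoded protein is unchanged, and because the current codon is always among the candidates, the distance to the target never increases.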

New operators, objective functions or assembly methods can be added by extending the Operator, ObjectiveFunction and Assembly classes.

Installation

The easiest and fastest way to use mooda is with Docker:

    $ docker pull ghcr.io/stracquadaniolab/mooda

You can also install mooda using conda:

    $ conda install -c stracquadaniolab -c bioconda -c conda-forge mooda

or using pip:

    $ pip install mooda-dna

Please note that pip will not install non-Python requirements.

Getting started

A typical mooda analysis consists of 3 steps:

  1. Select a DNA sequence in GenBank format.

  2. Write a MOODA configuration file: a .yaml file defining the operators, objective functions, assembly strategy and their parameters.

  3. Run MOODA.

Example: optimizing GC content, E. coli codon usage, number of fragments and the variance of their length

Create a test directory as follows:

    $ mkdir example-run

Move to your test directory as follows:

    $ cd example-run

Download test data from GitHub as follows:

    $ curl -LO https://github.com/stracquadaniolab/mooda/raw/master/examples/mooda-example1.tar.gz

Extract test data as follows:

    $ tar xvzf mooda-example1.tar.gz

Run mooda as follows:

    $ docker run -it --rm -v $PWD:$PWD -w $PWD ghcr.io/stracquadaniolab/mooda -ag mo -i seq_5_5.gb  -c config.yaml -p 10 -it 20 -a 100 -mns 200 -mxs 2000 -bss 50 -js 40 -dir example-opt -gf True
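When several runs need to be launched from a script, the same command line can be assembled in Python. The sketch below is only a convenience wrapper around the command shown above: the function name and its keyword arguments are hypothetical, and it uses only the image name and flags that appear in this document. Docker's own -it flag is left out because a TTY is usually not available when the command is started from a script.

```python
import os

IMAGE = "ghcr.io/stracquadaniolab/mooda"


def mooda_docker_cmd(sequence, config, out_dir,
                     pool=10, iterations=20, archive=100, workdir=None):
    """Build the argument list for the example run (hypothetical helper).

    The block and overlap sizes are fixed to the values used in the
    example command; only the pool, iteration and archive sizes vary.
    """
    workdir = workdir or os.getcwd()
    return [
        "docker", "run", "--rm",
        "-v", f"{workdir}:{workdir}", "-w", workdir,
        IMAGE,
        "-ag", "mo", "-i", sequence, "-c", config,
        "-p", str(pool), "-it", str(iterations), "-a", str(archive),
        "-mns", "200", "-mxs", "2000", "-bss", "50", "-js", "40",
        "-dir", out_dir, "-gf", "True",
    ]


# Usage (requires Docker and the extracted test data):
#   import subprocess
#   cmd = mooda_docker_cmd("seq_5_5.gb", "config.yaml", "example-opt")
#   subprocess.run(cmd, check=True)
```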

Results will be available in the example-opt directory, where you will find:

  • GenBank files of the Pareto optimal sequences.
  • FASTA files with the fragments for Gibson assembly for each Pareto optimal sequence.
  • _logfile.yaml file with information about the analysis.
  • _mooda_result.csv file with objective function value information for each sequence.
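The objective values can then be inspected with Python's standard library. The following is a sketch under stated assumptions: it assumes the results file name ends in _mooda_result.csv and has a header row, and it makes no assumption about the column names, which depend on the objective functions chosen in the configuration. The function name is hypothetical.

```python
import csv
from pathlib import Path


def load_results(out_dir):
    """Read every *_mooda_result.csv file in out_dir into a list of dicts,
    one per row, keyed by whatever column names the header provides."""
    rows = []
    for path in sorted(Path(out_dir).glob("*_mooda_result.csv")):
        with path.open(newline="") as handle:
            rows.extend(csv.DictReader(handle))
    return rows


# Usage, after a run that wrote its output to example-opt:
#   rows = load_results("example-opt")
#   if rows:
#       print(list(rows[0].keys()))  # discover the available columns
```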

Command line options

  • -ag: Algorithm to run, either mo (Multi-Objective) or mc (Monte Carlo). mo is suggested for long sequences; mc for short sequences and codon usage optimization. Default: mo.

  • -i: Input DNA sequence to process.

  • -c: Configuration file to set MOODA operators, objective functions and their parameters.

  • -p: Pool size. Larger sequences call for a larger pool; increasing -p improves solution quality but also increases computing time.

  • -it: Number of iterations. Larger sequences call for more iterations; increasing -it improves solution quality more than increasing -p, but also increases computing time.

  • -a: Archive size, the number of non-dominated solutions stored at each iteration of the algorithm. A larger archive allows smaller values for the pool size.

  • -mns: Sequence block minimum size.

  • -mxs: Sequence block maximum size.

  • -bss: Sequence block step size, which defines the minimum variance between block lengths. Default: 50.

  • -js: Sequence block assembly overlap size, which defines the amount of overlap between sequence blocks. Default: 40.

  • -dir: Output directory for MOODA results.

  • -gf: If set to True, write the FASTA and GenBank files for the MOODA solutions. Default: False.
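The archive set by -a stores non-dominated solutions. As a reminder of what that means, here is a minimal sketch of Pareto dominance, assuming every objective is to be minimized. It is a generic illustration, not MOODA's archive code, and the function names are hypothetical.

```python
def dominates(a, b):
    """True if a is no worse than b on every objective and strictly
    better on at least one (all objectives minimized)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def non_dominated(points):
    """Keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Each tuple holds two objective values for one candidate sequence.
candidates = [(1, 5), (2, 2), (3, 3), (5, 1)]
front = non_dominated(candidates)  # (3, 3) is dominated by (2, 2)
```

This pairwise check is quadratic in the number of candidates, which is acceptable for the pool and archive sizes used in the example above.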

Authors

Citation

Design and assembly of DNA molecules using multi-objective optimisation. Angelo Gaeta, Valentin Zulkower and Giovanni Stracquadanio. bioRxiv. https://www.biorxiv.org/content/10.1101/761320v1

Issues

Please post an issue to report a bug or request new features.
