Generate DNA sequences with specified amino acid, codon, and k-mer frequencies.

# Freqgen

<p align="center"> <img src="https://raw.githubusercontent.com/Lab41/freqgen/master/logo/Freqgen2-01_icon_only.png" height="150"> </p>

[![Build Status](https://travis-ci.org/Lab41/freqgen.svg?branch=master)](https://travis-ci.org/Lab41/freqgen) [![Documentation Status](https://readthedocs.org/projects/freqgen/badge/?version=latest)](https://freqgen.readthedocs.io/en/latest/?badge=latest) [![CodeFactor](https://www.codefactor.io/repository/github/lab41/freqgen/badge)](https://www.codefactor.io/repository/github/lab41/freqgen)

Freqgen is a tool to generate coding DNA sequences with specified amino acid usage frequencies or sequence, GC content, codon usage bias, and/or k-mer usage bias. To accomplish this, Freqgen uses genetic algorithms to efficiently search the solution space of possible DNA sequences to find ones that most closely match the desired parameters.
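To make the genetic-algorithm idea concrete, here is a minimal sketch of how such a search could work. It is not Freqgen's implementation: the reduced codon table, the function names (`optimize`, `distance`, and so on), the Euclidean fitness, and all parameter values are assumptions chosen for illustration. Each candidate is a list of codons that always encodes the same protein, so mutation and crossover only ever change synonymous codon choices.

```python
import random
from collections import Counter

# Reduced codon table for illustration only (hypothetical subset of the 64 codons).
CODONS = {
    "M": ["ATG"],
    "K": ["AAA", "AAG"],
    "F": ["TTT", "TTC"],
    "G": ["GGT", "GGC", "GGA", "GGG"],
}

def kmer_freqs(dna, k):
    """Relative frequency of each overlapping k-mer in dna."""
    counts = Counter(dna[i:i + k] for i in range(len(dna) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}

def distance(dna, target, k):
    """Euclidean distance between dna's k-mer frequencies and the target (lower is better)."""
    freqs = kmer_freqs(dna, k)
    keys = set(freqs) | set(target)
    return sum((freqs.get(x, 0.0) - target.get(x, 0.0)) ** 2 for x in keys) ** 0.5

def random_individual(protein, rng):
    """Pick a random synonymous codon for every amino acid."""
    return [rng.choice(CODONS[aa]) for aa in protein]

def mutate(codons, protein, rng, rate=0.1):
    """Resample each codon with probability `rate`, keeping the encoded protein fixed."""
    return [rng.choice(CODONS[aa]) if rng.random() < rate else codon
            for codon, aa in zip(codons, protein)]

def crossover(a, b, rng):
    """Single-point crossover at a codon boundary (needs at least two codons)."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]

def optimize(protein, target, k=1, pop_size=30, generations=60, seed=0):
    rng = random.Random(seed)
    population = [random_individual(protein, rng) for _ in range(pop_size)]

    def score(individual):
        return distance("".join(individual), target, k)

    for _ in range(generations):
        population.sort(key=score)
        parents = population[:pop_size // 2]  # keep the best half unchanged
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), protein, rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return "".join(min(population, key=score))

if __name__ == "__main__":
    # Hypothetical target: single-nucleotide (k=1) frequencies.
    target = {"A": 0.2, "C": 0.3, "G": 0.3, "T": 0.2}
    print(optimize("MKFGGKFG", target, k=1))
```

Because the best half of each generation is carried over unchanged, the best score in the population never gets worse from one generation to the next.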

## Features

  • CLI and Python API
  • Can simultaneously match multiple DNA statistics
  • Built-in visualization utility
  • Supports several fitness metrics (and you can bring your own!)
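This page does not list the built-in fitness metrics or show how a custom one is registered, so the following is only a sketch of what a metric could look like: a function that takes two frequency tables and returns a number that shrinks as they get closer. The function names and the dict-based representation are assumptions, not Freqgen's API.

```python
import math

def euclidean(p, q):
    """Euclidean distance between two frequency tables given as dicts."""
    keys = set(p) | set(q)
    return math.sqrt(sum((p.get(k, 0.0) - q.get(k, 0.0)) ** 2 for k in keys))

def jensen_shannon(p, q):
    """Jensen-Shannon divergence in bits: 0 for identical tables, 1 for disjoint ones."""
    keys = set(p) | set(q)
    # Mixture distribution; every key with mass in p or q has positive mass here.
    m = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}

    def kl(a):
        # Kullback-Leibler divergence from m, skipping zero-probability terms.
        return sum(a[k] * math.log2(a[k] / m[k]) for k in a if a[k] > 0)

    return 0.5 * kl(p) + 0.5 * kl(q)

if __name__ == "__main__":
    observed = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
    desired = {"A": 0.10, "C": 0.40, "G": 0.40, "T": 0.10}
    print(euclidean(observed, desired), jensen_shannon(observed, desired))
```

Euclidean distance treats every k-mer equally, while Jensen-Shannon divergence weights differences by how probable the k-mer is. Which one suits a given design goal is a judgment call.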

## Installation

Simply run:

$ pip install freqgen

Or, to get the latest (but not necessarily stable) development version:

$ pip install git+https://github.com/Lab41/freqgen.git

## Five-second CLI tutorial

The basic flow of Freqgen can be summarized in three steps, plus an optional visualization step:

1. Generate a new amino acid sequence based on the amino acid usage profile of reference sequences. If you already have a specific amino acid sequence in mind (e.g. for synthetic biology uses), skip this step:

$ freqgen aa reference_sequences.fna -o new_sequence.faa -l LENGTH

2. Create a YAML file containing the k-mer frequencies that the DNA encoding the amino acid sequence should match:

$ freqgen featurize reference_sequences.fna -k INT -o reference_freqs.yaml

3. Generate the DNA sequence coding for the amino acid sequence:

$ freqgen -t reference_freqs.yaml -s new_sequence.faa -v -o optimized.fna

4. Visualize the results of the optimization (optional):

$ freqgen visualize --target reference_freqs.yaml --optimized optimized.fna
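For readers unfamiliar with k-mer frequencies, the sketch below shows one plausible way to compute the kind of table the featurize step stores. It is an illustration under stated assumptions, not Freqgen's code: the helper names `read_fasta` and `kmer_frequencies` are hypothetical, k-mers are counted with overlap across all reference sequences, and k-mers containing characters other than A, C, G, and T are ignored.

```python
from collections import Counter
from itertools import product

def read_fasta(text):
    """Yield each sequence from FASTA-formatted text, joining wrapped lines."""
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            if chunks:
                yield "".join(chunks)
                chunks = []
        elif line:
            chunks.append(line.upper())
    if chunks:
        yield "".join(chunks)

def kmer_frequencies(sequences, k):
    """Pooled relative frequency of every possible DNA k-mer across the sequences."""
    valid = set("ACGT")
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if set(kmer) <= valid:  # skip k-mers with ambiguous bases such as N
                counts[kmer] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no valid k-mers found")
    # Report all 4**k k-mers so that absent ones appear with frequency 0.
    return {"".join(p): counts["".join(p)] / total for p in product("ACGT", repeat=k)}

if __name__ == "__main__":
    fasta = ">seq1\nATGC\n>seq2\nAATT\n"
    print(kmer_frequencies(read_fasta(fasta), k=2))
```

Listing every possible k-mer, including those with zero frequency, keeps the table the same size regardless of the reference data, which makes two tables directly comparable.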

## Documentation

Read the full docs over at [freqgen.readthedocs.io](http://freqgen.readthedocs.io).

## Citation

To be determined!


