Generate DNA sequences with specified amino acid, codon, and k-mer frequencies.

# Freqgen

<p align="center"> <img src="https://raw.githubusercontent.com/Lab41/freqgen/master/logo/Freqgen2-01_icon_only.png" height="150"> </p>

[![Build Status](https://travis-ci.org/Lab41/freqgen.svg?branch=master)](https://travis-ci.org/Lab41/freqgen) [![Documentation Status](https://readthedocs.org/projects/freqgen/badge/?version=latest)](https://freqgen.readthedocs.io/en/latest/?badge=latest) [![CodeFactor](https://www.codefactor.io/repository/github/lab41/freqgen/badge)](https://www.codefactor.io/repository/github/lab41/freqgen)

Freqgen is a tool to generate coding DNA sequences with specified amino acid usage frequencies or sequence, GC content, codon usage bias, and/or k-mer usage bias. To accomplish this, Freqgen uses genetic algorithms to efficiently search the solution space of possible DNA sequences to find ones that most closely match the desired parameters.
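To make the genetic-algorithm idea concrete, here is a minimal sketch of that kind of search. It is not Freqgen's actual implementation: the codon table covers only five amino acids, the only objective is GC content, and the function names (`optimize`, `fitness`, `mutate`, `crossover`) and parameters are assumptions made for illustration. Each candidate is a list of codons for a fixed protein, so mutation and crossover at codon boundaries never change the encoded amino acid sequence.

```python
import random

# Synonymous codons for a small subset of amino acids (standard genetic code).
# A real tool would need the full table; this subset is for illustration only.
CODONS = {
    "M": ["ATG"],
    "K": ["AAA", "AAG"],
    "F": ["TTT", "TTC"],
    "G": ["GGT", "GGC", "GGA", "GGG"],
    "A": ["GCT", "GCC", "GCA", "GCG"],
}


def gc_content(dna):
    """Fraction of G and C bases in a DNA string."""
    return (dna.count("G") + dna.count("C")) / len(dna)


def fitness(codons, target_gc):
    """Distance from the target GC content; lower is better."""
    return abs(gc_content("".join(codons)) - target_gc)


def mutate(codons, protein, rate, rng):
    """Swap each codon for a random synonymous codon with probability `rate`."""
    return [
        rng.choice(CODONS[aa]) if rng.random() < rate else codon
        for codon, aa in zip(codons, protein)
    ]


def crossover(a, b, rng):
    """Single-point crossover at a codon boundary."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]


def optimize(protein, target_gc, pop_size=50, generations=100, rate=0.1, seed=0):
    """Hypothetical GA loop: keep the better half, refill with mutated offspring."""
    rng = random.Random(seed)
    population = [[rng.choice(CODONS[aa]) for aa in protein] for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda ind: fitness(ind, target_gc))
        parents = population[: pop_size // 2]
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng),
                   protein, rate, rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    best = min(population, key=lambda ind: fitness(ind, target_gc))
    return "".join(best)


protein = "MKFGAGAKFG"
dna = optimize(protein, target_gc=0.5)
print(dna, gc_content(dna))
```

Matching several statistics at once, as Freqgen does, would amount to replacing `fitness` with a function that combines several distances.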

## Features

  • CLI and Python API

  • Can simultaneously match multiple DNA statistics

  • Built-in visualization utility

  • Supports several fitness metrics (and you can bring your own!)
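The README does not name the built-in fitness metrics or show how a custom one is registered, so the following is only a sketch of what such a metric could compute. It assumes frequencies are held in plain dictionaries mapping a k-mer to its relative frequency; the function names `euclidean` and `jensen_shannon` are illustrative and not part of Freqgen's API.

```python
import math


def euclidean(target, observed):
    """Euclidean distance between two frequency dictionaries."""
    keys = set(target) | set(observed)
    return math.sqrt(
        sum((target.get(k, 0.0) - observed.get(k, 0.0)) ** 2 for k in keys)
    )


def jensen_shannon(target, observed):
    """Jensen-Shannon divergence (base 2) between two frequency dictionaries."""
    keys = set(target) | set(observed)
    # Mixture distribution; it is positive wherever either input is positive.
    mix = {k: 0.5 * (target.get(k, 0.0) + observed.get(k, 0.0)) for k in keys}

    def kl(p):
        # Kullback-Leibler divergence of p from the mixture, skipping zero terms.
        return sum(
            p[k] * math.log2(p[k] / mix[k]) for k in keys if p.get(k, 0.0) > 0
        )

    return 0.5 * kl(target) + 0.5 * kl(observed)


target = {"AT": 0.5, "GC": 0.5}
observed = {"AT": 0.25, "GC": 0.75}
print(euclidean(target, observed), jensen_shannon(target, observed))
```

Both functions return 0 for identical inputs and grow as the distributions diverge, which is the property a minimizing search needs.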

## Installation

Simply run:

$ pip install freqgen

Or, to get the latest (but not necessarily stable) development version:

$ pip install git+https://github.com/Lab41/freqgen.git

## Five-second CLI tutorial

The basic flow of Freqgen can be summarized in three steps, followed by an optional visualization step:

1. Generate a new amino acid sequence based on the amino acid usage profile of reference sequences. If you already have a specific amino acid sequence in mind (e.g., for synthetic biology applications), skip this step:

$ freqgen aa reference_sequences.fna -o new_sequence.faa -l LENGTH

2. Create a YAML file containing the target k-mer frequencies that the DNA encoding the amino acid sequence should match:

$ freqgen featurize reference_sequences.fna -k INT -o reference_freqs.yaml

3. Generate the DNA sequence coding for the amino acid sequence:

$ freqgen -t reference_freqs.yaml -s new_sequence.faa -v -o optimized.fna

4. Visualize the results of the optimization (optional):

$ freqgen visualize --target reference_freqs.yaml --optimized optimized.fna
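For readers unfamiliar with k-mer frequencies, the quantity computed in the featurize step can be sketched in a few lines of Python. This is an illustration under assumptions, not the code behind `freqgen featurize`: the function name `kmer_frequencies` is hypothetical, overlapping k-mers are counted across all reference sequences, and the layout of Freqgen's YAML output is not reproduced here.

```python
from collections import Counter


def kmer_frequencies(sequences, k):
    """Relative frequency of each overlapping k-mer across all sequences."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        # Slide a window of width k over the sequence, one base at a time.
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    total = sum(counts.values())
    # Sorted by k-mer so the result has a stable order.
    return {kmer: n / total for kmer, n in sorted(counts.items())}


reference = ["ATGGCGAAA", "ATGAAAGGC"]
for kmer, freq in kmer_frequencies(reference, k=2).items():
    print(kmer, round(freq, 3))
```

A frequency table like this, computed from the reference sequences, is the kind of target the optimization step tries to match.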

## Documentation

Read the full docs over at [freqgen.readthedocs.io](http://freqgen.readthedocs.io).

## Citation

To be determined!


