A Python package for DNA-based evolution of protein sequences. Holistic apprOach for SequAnces To underZtand evolutIoN.

hoatzin

Last updated August 2023

Current version: v0.12

Installation

The current stable version of hoatzin is available through GitHub or the Python Package Index (PyPI).

To install from PyPI, run:

pip install idptools-hoatzin

You can also install the current development version from GitHub:

pip install git+https://git@github.com/idptools/hoatzin

To clone the GitHub repository and install an editable local copy that you can modify, run:

git clone https://github.com/idptools/hoatzin.git
cd hoatzin
pip install -e .

Usage

First, import evolve from hoatzin:

from hoatzin import evolve

Evolving Sequences

The evolve.sequence() function evolves protein or DNA sequences. If you input a protein sequence, it is first converted to a DNA sequence using human codon usage frequencies. Mutations are then drawn using nucleotide mutation probabilities from COSMIC 2023, which examined the frequencies of non-synonymous mutations in the human genome. The function takes the sequence as its first argument and the number of generations (number_generations) as its second. By default, one DNA mutation is applied per generation.

sequence = 'QQQGSRGSGSGRRRGSGSGQGS'
evolved_sequence = evolve.sequence(sequence, number_generations=10)
print(evolved_sequence)

Because the mutations are random, the output differs from run to run. This would print something like:

QQQGPSGSRNGRRRGFSGGLDS
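
To make the steps described above concrete (protein to DNA, one point mutation per generation, DNA back to protein), here is a minimal, self-contained sketch. It is not hoatzin's implementation: it assumes uniform codon choice and uniform substitution probabilities instead of human codon usage and COSMIC-derived probabilities, and the names toy_evolve, back_translate, translate, and point_mutate are hypothetical helpers introduced only for illustration.

```python
import random

# Standard genetic code (NCBI translation table 1), listed in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TO_AA = dict(zip(CODONS, AMINO_ACIDS))

# Reverse lookup: amino acid -> list of codons that encode it.
AA_TO_CODONS = {}
for codon, aa in CODON_TO_AA.items():
    AA_TO_CODONS.setdefault(aa, []).append(codon)


def back_translate(protein, rng):
    """Pick one codon per residue. ASSUMPTION: uniform choice among synonyms."""
    return "".join(rng.choice(AA_TO_CODONS[aa]) for aa in protein)


def translate(dna):
    """Translate a DNA string whose length is a multiple of 3."""
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))


def point_mutate(dna, rng):
    """Replace one random base. ASSUMPTION: all substitutions equally likely."""
    pos = rng.randrange(len(dna))
    new_base = rng.choice([b for b in BASES if b != dna[pos]])
    return dna[:pos] + new_base + dna[pos + 1:]


def toy_evolve(protein, number_generations, seed=None):
    """Toy stand-in for the workflow: one DNA point mutation per generation."""
    rng = random.Random(seed)
    dna = back_translate(protein, rng)
    for _ in range(number_generations):
        dna = point_mutate(dna, rng)
    return translate(dna)


print(toy_evolve("QQQGSRGSGSGRRRGSGSGQGS", number_generations=10, seed=0))
```

Because each generation changes a single nucleotide, a generation can leave the protein unchanged (a synonymous mutation), change one residue, or introduce a stop codon, shown as '*'.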

Optional Arguments:

The evolve.sequence() function accepts the following optional arguments:

  • mutations_per_generation - The number of DNA mutations applied in each generation.
  • mutation_probs - The probabilities of each mutation. You can supply your own dictionary of mutation probabilities. See NUCLEOTIDE_MUTATION_PROBS in hoatzin_parameters for how to specify this.
  • sequence_type - Whether the input is a DNA sequence or a protein sequence. You must set this to 'nucleotide_sequence' if you are inputting a nucleotide sequence.
  • codon_probs - The probabilities of each codon used when converting a protein sequence to a DNA sequence.
  • return_all_seqs - Whether to return all sequences generated (one sequence per generation) or only the single final sequence.
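
As an illustration of what a per-nucleotide mutation probability dictionary and a mutations-per-generation setting can look like in practice, here is a small standalone sketch. The dictionary layout, the probability values, and the function names (TOY_MUTATION_PROBS, weighted_point_mutation, mutate_generations) are all assumptions made for this example. They are not taken from hoatzin, so check NUCLEOTIDE_MUTATION_PROBS in hoatzin_parameters for the format the package actually expects.

```python
import random

# HYPOTHETICAL format and made-up values: for each nucleotide, the relative
# probability of it being replaced by each of the other three nucleotides.
TOY_MUTATION_PROBS = {
    "A": {"C": 0.2, "G": 0.6, "T": 0.2},
    "C": {"A": 0.2, "G": 0.2, "T": 0.6},
    "G": {"A": 0.6, "C": 0.2, "T": 0.2},
    "T": {"A": 0.2, "C": 0.6, "G": 0.2},
}


def weighted_point_mutation(dna, mutation_probs, rng):
    """Mutate one random position, choosing the new base by weight."""
    pos = rng.randrange(len(dna))
    targets = mutation_probs[dna[pos]]
    new_base = rng.choices(list(targets), weights=list(targets.values()), k=1)[0]
    return dna[:pos] + new_base + dna[pos + 1:]


def mutate_generations(dna, number_generations, mutations_per_generation=1,
                       mutation_probs=TOY_MUTATION_PROBS, seed=None):
    """Return a dict with the original sequence and one entry per generation."""
    rng = random.Random(seed)
    history = {"original": dna}
    for generation in range(1, number_generations + 1):
        for _ in range(mutations_per_generation):
            dna = weighted_point_mutation(dna, mutation_probs, rng)
        history[generation] = dna
    return history


history = mutate_generations("ATGCAGGGC", 3, mutations_per_generation=2, seed=0)
for generation, seq in history.items():
    print(generation, seq)
```

The returned dictionary is keyed by 'original' and then by generation number, mirroring the shape of the return_all_seqs output shown in the example that follows.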

Example

sequence = 'QQQGSRGSGSGRRRGSGSGQGS'
evolved_sequence = evolve.sequence(sequence, number_generations=10, return_all_seqs=True)
print(evolved_sequence)

This would print something like:

{'original': 'QQQGSRGSGSGRRRGSGSGQGS', 1: 'QQQGSRGSGSGHRRGSGSGQGS', 2: 'QQQGSRGSGSGHRRGSGSGQGS', 3: 'QQQGSRGSGSGHKRGSGSGQGS', 4: 'QQQGSRGSGSGDKRGSGSGQGS', 5: 'QRQGSRGSGSGDKRGSGSGQGS', 6: 'QRQGSRGSGSGDKRGSGSGQGL', 7: 'QRQGSRGFGSGDKRGSGSGQGL', 8: 'QRQGSRGFRSGDKRGSGSGQGL', 9: 'QREGSRGFRSGDKRGSGSGQGL', 10: 'QREG*RGFRSGDKRGSGSGQGL'}
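
A dictionary like this is easy to post-process with plain Python. The sketch below uses the example output above as literal data, counts how many residues differ from the original in each generation, and finds the first generation containing '*'. It assumes '*' marks a stop codon, as is conventional in one-letter protein notation. The helper name hamming is introduced here for illustration and is not part of hoatzin.

```python
# Example output copied from above (what return_all_seqs=True might return).
history = {
    'original': 'QQQGSRGSGSGRRRGSGSGQGS',
    1: 'QQQGSRGSGSGHRRGSGSGQGS',
    2: 'QQQGSRGSGSGHRRGSGSGQGS',
    3: 'QQQGSRGSGSGHKRGSGSGQGS',
    4: 'QQQGSRGSGSGDKRGSGSGQGS',
    5: 'QRQGSRGSGSGDKRGSGSGQGS',
    6: 'QRQGSRGSGSGDKRGSGSGQGL',
    7: 'QRQGSRGFGSGDKRGSGSGQGL',
    8: 'QRQGSRGFRSGDKRGSGSGQGL',
    9: 'QREGSRGFRSGDKRGSGSGQGL',
    10: 'QREG*RGFRSGDKRGSGSGQGL',
}


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


original = history['original']
generations = [key for key in history if key != 'original']

# Residue differences from the original sequence, per generation.
distance = {g: hamming(original, history[g]) for g in generations}

# First generation whose protein contains a stop ('*'), or None if there is none.
first_stop = next((g for g in generations if '*' in history[g]), None)

print(distance)
print(first_stop)  # 10
```

In this example, generations 1 and 2 have identical protein sequences, which is what you would expect when the DNA mutation in generation 2 is synonymous.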

Copyright

Copyright (c) 2023, Ryan Emenecker - Holehouse Lab

Acknowledgements

Project based on the Computational Molecular Science Python Cookiecutter version 1.1.
