A DNA and protein analyser tool

Project description

DNA and Protein Analysis Tool: DNAPROT

Introduction

This project was developed for the CH-200 Practical Programming for Chemistry course at EPFL by two second-year Bachelor students in Chemistry and Chemical Engineering.

The source code is available on GitHub: https://github.com/Juan-Salafranca/Project-Dna

Authors

  • Marie Van Rossum, BSc in Chemistry and Chemical Engineering at EPFL.
  • Juan Salafranca Martinez, BSc in Chemistry and Chemical Engineering at EPFL.

Overview

DNAPROT is a Python package for analysing DNA sequences and the protein sequences they encode. It converts DNA sequences into protein sequences by evaluating all possible reading frames. It also analyses the resulting protein sequences, calculating the hydrophobicity score, the molecular weight, the likelihood of secondary structure configurations (beta-sheet, alpha-helix, and beta-turn), and the retention coefficient in High-Performance Liquid Chromatography (HPLC) from the amino acid sequence.
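To make the idea of evaluating reading frames concrete, here is a minimal sketch that is independent of DNAPROT. It hard-codes the standard genetic code rather than reading it from a file, and the names `translate`, `reverse_complement` and `six_frame_translation` are hypothetical helpers, not functions of the package. Whether DNAPROT also scans the reverse strand is not stated here, so the three reverse frames are included only as an assumption.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna, frame=0):
    """Translate one reading frame (0, 1 or 2); '*' marks a stop codon."""
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    # Codons containing unexpected characters are reported as 'X'.
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def six_frame_translation(dna):
    """Translate the three forward and three reverse reading frames."""
    reverse = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna, offset)
        frames[f"-{offset + 1}"] = translate(reverse, offset)
    return frames


frames = six_frame_translation("ATGGCCTAA")
print(frames["+1"])  # MA*
```

Keeping the codon table as a plain dictionary means another genetic code can be swapped in by replacing a few entries.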

Features

  • Start sequence identification: Identify start sequences (e.g. Shine-Dalgarno) in a long DNA sequence and return all sections that can be converted into protein.
  • DNA to protein conversion: Convert DNA into protein sequences using the different reading frames.
  • Hydrophobicity score calculation: Compute the hydrophobicity score of a protein sequence using the Kyte & Doolittle scale.
  • Molecular weight calculation: Determine the molecular weight of a protein sequence from the molecular weights of its individual amino acids.
  • Secondary structure likelihood calculation: Evaluate the likelihood of a protein sequence forming beta-sheets, alpha-helices, or beta-turns using the Chou and Fasman method.
  • HPLC retention coefficient calculation: Calculate the retention coefficient of a protein sequence in HPLC using given retention values for the amino acids.
  • Polarity evaluation: Calculate a polarity score based on the Zimmerman scale.
  • Excel summary: Summarize all previously calculated data in an Excel sheet.
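The per-residue calculations listed above all follow the same pattern: look up a value for each amino acid and combine the values. The sketch below illustrates this for hydrophobicity and molecular weight. It is not DNAPROT's code: the function names are hypothetical, the Kyte & Doolittle values are the published hydropathy indices, and the masses are average molecular weights of the free amino acids in g/mol. Whether the package averages or sums the hydrophobicity values, and which mass table it uses, are assumptions here.

```python
# Kyte & Doolittle hydropathy index per amino acid (one-letter codes).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Average molecular weight of each free amino acid, in g/mol.
AMINO_ACID_MASS = {
    "A": 89.09, "R": 174.20, "N": 132.12, "D": 133.10, "C": 121.16,
    "Q": 146.15, "E": 147.13, "G": 75.07, "H": 155.16, "I": 131.17,
    "L": 131.17, "K": 146.19, "M": 149.21, "F": 165.19, "P": 115.13,
    "S": 105.09, "T": 119.12, "W": 204.23, "Y": 181.19, "V": 117.15,
}
WATER_MASS = 18.015  # one water molecule is lost per peptide bond


def mean_hydrophobicity(protein):
    """Average Kyte & Doolittle score; unknown symbols such as '*' are skipped."""
    residues = [aa for aa in protein.upper() if aa in KYTE_DOOLITTLE]
    if not residues:
        raise ValueError("no standard amino acids in sequence")
    return sum(KYTE_DOOLITTLE[aa] for aa in residues) / len(residues)


def molecular_weight(protein):
    """Sum of amino acid masses minus one water per peptide bond."""
    residues = [aa for aa in protein.upper() if aa in AMINO_ACID_MASS]
    if not residues:
        return 0.0
    total = sum(AMINO_ACID_MASS[aa] for aa in residues)
    return total - (len(residues) - 1) * WATER_MASS


print(round(mean_hydrophobicity("AILV"), 3))  # 3.575
print(round(molecular_weight("GG"), 3))       # 132.125
```

The same lookup-and-combine structure would apply to the HPLC retention and polarity scores, with a different table of per-residue values.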

Installation

pip install DNAPROT

The package uses the following dependencies:

  • pandas (2.2.2)
  • openpyxl (3.1.3)

Usage

Attention: for the DNA-to-protein conversion functions to work, a file containing the genetic code must be in the same directory as the script. You can use the file provided with this package, or supply a different file for another genetic code (e.g. the mitochondrial code).

A very basic use of the package is to generate the protein sequences and their properties from a large .fasta file containing the DNA sequence.

Start by importing the package and reading the start sequences from the .fasta file, which generates an "output.txt" file. Split that file into sections, then analyse them with the "DNA_ToProtExcl_Analysis" function.

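import DNAPROT_analysis as DNAPROT

# Locate the Shine-Dalgarno start sequences; this generates "output.txt"
DNAPROT.ReadShineDalgarnoFromFasta('sequence.fasta')
# Split the generated file into sections
sections = DNAPROT.separate_sections('output.txt')
# Convert the sections to protein and export the analysis
DNAPROT.DNA_ToProtExcl_Analysis(sections, section_number=None, output_folder=None)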
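Before running the workflow on a large .fasta file, it can be useful to check that the input contains only the expected bases. The following is a minimal sketch that does not depend on DNAPROT. `parse_fasta` and `invalid_bases` are hypothetical helpers, and the accepted alphabet (A, C, G, T and N) is an assumption.

```python
def parse_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line.upper())
    if header is not None:
        yield header, "".join(chunks)


def invalid_bases(sequence, alphabet="ACGTN"):
    """Return the sorted characters of the sequence not in the alphabet."""
    return sorted(set(sequence) - set(alphabet))


# Inline example data; an open file handle can be passed the same way.
example = [">seq1 demo", "ATGGCC", "TAAXX", "", ">seq2", "acgt"]
for header, sequence in parse_fasta(example):
    print(header, len(sequence), invalid_bases(sequence))
```

For a real file, the same generator can be used as `parse_fasta(open_file_handle)`, since it only iterates over lines.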

License

This project is open-source and released under the MIT License.

Sources

This project relies on several scientific papers to determine the properties of proteins. These references are listed in the Sources document.

Contact

For any questions or issues, please contact juan.salafrancamartinez@epfl.ch or marie.vanrossum@epfl.ch

Download files

Source Distribution

DNAPROT-0.1.0.tar.gz (12.2 kB)

  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.0 CPython/3.11.5
  • SHA256: 5cf2a68c7c83b890255272da7859e83a0dc2bcf66b6f7f2f8717f62605524b8a
  • MD5: 6f46bda6a5b7c380f5aa31a9afe2b764
  • BLAKE2b-256: 38829a0ac88656e540bf896497031f04b9fb13a08c25f40fe6e4da2fab42fc25

Built Distribution

DNAPROT-0.1.0-py2.py3-none-any.whl (8.3 kB)

  • Tags: Python 2, Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.0 CPython/3.11.5
  • SHA256: 1217f978ef01b3a8ad114b93650a4290abd5e39168d66a6a18ef541c1bffc467
  • MD5: f53e90ba90d8f8cfa1ce60d07a9e0029
  • BLAKE2b-256: 11192892dc1ec9f4f9f16994735c2fa1a31bcd5fa983e2a9be3e5ed1fbce9b0f
