Manages a complete workflow to analyze codon usage bias

Project description

BCAW: Automated tool for codon usage bias analysis for molecular evolution

Statement of Need

No available tool lets users run a complete automated workflow for codon usage bias analysis. BCAW Tool (Bio Codon Analysis Workflow Tool) was developed in Python 3.7 to address this problem. It manages a complete automated workflow to analyze the codon usage bias of genes and genomes of any organism, and it requires only minimal coding skills.

For more details about codon usage bias and the equations used in BCAWT, see the Documentations section.

Dependencies

1- Biopython

2- pandas

3- CAI

4- scipy

5- matplotlib

6- numpy

7- prince

Installation Instructions

Using pip

pip install BCAWT

Note: Python >=3.7 is required.
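
Before running the tool, it can help to confirm that the interpreter and the dependencies listed above are available. The following is a minimal sketch, not part of BCAWT: the helper names are hypothetical, and the import names (for example `Bio` for Biopython) are assumptions based on how these packages are usually imported.

```python
import importlib.util
import sys

# Import names assumed for the dependencies listed above
# (Biopython is imported as "Bio").
REQUIRED_MODULES = ["Bio", "pandas", "CAI", "scipy", "matplotlib", "numpy", "prince"]


def missing_modules(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def python_is_supported(version_info=sys.version_info):
    """True when the interpreter is at least Python 3.7."""
    return tuple(version_info[:2]) >= (3, 7)


if __name__ == "__main__":
    print("Python OK:", python_is_supported())
    print("Missing modules:", missing_modules(REQUIRED_MODULES))
```

An empty list of missing modules suggests that `pip install BCAWT` pulled in everything the tool needs.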

Contribution Guidelines

Contributions to the software are welcome.

For bugs and suggestions, the most effective way is to raise an issue on the GitHub issue tracker. GitHub allows you to classify your issues so that we know whether it is a bug report, a feature request, or feedback to the authors.

If you wish to contribute changes to the code, you should submit a pull request. See "How to create a Pull Request?" and the documentation on pull requests.

Usage

Auto testing

Note: this tests the results produced by the BCAW tool, not the modules. To test the modules in the package, use test.py.

First, download a FASTA file containing the coding sequences (you can download any FASTA file containing the gene sequences to be analyzed from the NCBI database).

Or just download the test file.

Then run the following (it automatically runs a test on the result files):

from BCAWT import BCAWT_auto_test
BCAWT_auto_test.auto_test(["Ecoli.fasta"])
BCAWT_auto_test.auto_check_files()
>> test is completed 'successfully'

Main Usage

from BCAWT import BCAWT
BCAWT.BCAW(['Ecoli.fasta'],'result_folder',genetic_code_=11,Auto=True)

Input


main_fasta_file (list): list of file paths (strings) or file-like objects

save_folder_name (str): name of the folder where the results will be saved

ref_fasta_file (list): list of file paths (strings) or file-like objects, default = None

Auto (bool): default = False, if ref_fasta_file is not None.

genetic_code_ (int): default = 1, the genetic code number as described by [NCBI](https://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi)
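
The parameters above can be combined in a small wrapper that checks the input paths before starting the analysis. This is only a sketch: `run_bcaw` is a hypothetical helper, the keyword names are taken from the parameter list above, and passing `Auto=True` only when no reference file is supplied is an assumption about the intended use, not documented behavior.

```python
import os


def run_bcaw(fasta_paths, save_folder, ref_paths=None, genetic_code=11):
    """Check the input paths, then call BCAWT.BCAW (hypothetical helper)."""
    for path in list(fasta_paths) + list(ref_paths or []):
        # File-like objects are allowed by the tool, so only strings are checked.
        if isinstance(path, str) and not os.path.isfile(path):
            raise FileNotFoundError(f"FASTA file not found: {path}")

    # Imported here so the path check works even without BCAWT installed.
    from BCAWT import BCAWT

    BCAWT.BCAW(
        fasta_paths,
        save_folder,
        ref_fasta_file=ref_paths,
        genetic_code_=genetic_code,
        # Assumption: Auto=True is only wanted when no reference set is given.
        Auto=ref_paths is None,
    )
```

For example, `run_bcaw(["Ecoli.fasta"], "result_folder")` would mirror the call shown under Main Usage, while failing early if `Ecoli.fasta` is not in the working directory.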

Important note: the BCAW tool expects coding sequences as input, not genes. For more information about the difference between them, you can take a look here
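
A quick way to spot records that are not coding sequences is to check the reading frame and the stop codons. The sketch below is an illustration only, not the validation BCAWT performs; `cds_problems` is a hypothetical helper, and the stop codons are those of the standard code (they are the same in table 11), so other genetic codes would need a different set.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # stops in the standard code and in table 11


def cds_problems(seq):
    """Return a list of reasons why seq does not look like a complete CDS."""
    seq = seq.upper()
    problems = []
    if len(seq) % 3 != 0:
        problems.append("length is not a multiple of 3")
    if set(seq) - set("ACGT"):
        problems.append("contains non-ACGT characters")
    # Split into complete codons, ignoring a trailing partial codon.
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    if not codons or codons[-1] not in STOP_CODONS:
        problems.append("does not end with a stop codon")
    if any(codon in STOP_CODONS for codon in codons[:-1]):
        problems.append("has an internal stop codon")
    return problems
```

An empty list means the sequence passed these basic checks; a gene sequence that still contains introns or untranslated regions will typically fail at least one of them.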

To obtain such a FASTA file for a species of interest

Say that the species of interest is Escherichia coli str. K-12 substr. MG1655:

1- Go to the NCBI database.

2- In the search bar, write (Escherichia coli str. K-12 substr. MG1655, complete genome).

3- Choose one of the results (depending on what you want in your analysis).

4- On the right of the page, you will find the Send to option. From Send to, select Coding Sequences, then FASTA nucleotides. Finally, press Create File.

For more information, see the NCBI Genomes Download (FTP) FAQ.

Output

Expected CSV output files

| CSV file name | Description |
| --- | --- |
| ATCG | gene id, GC, GC1, GC2, GC3, GC12, AT, AT3, A3, T3, C3, G3, GRAVY, AROMO, and gene length |
| CA_RSCU | RSCU value for each codon in each gene |
| CA_RSCUcodons | first 4 axes of the correspondence analysis for each codon |
| CA_RSCUgenes | first 4 axes of the correspondence analysis for each gene |
| CAI | gene id and CAI index |
| ENc | gene id and ENc index |
| P2-index | gene id and P2 index |
| optimal codons | putative optimal codons detected |
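
To make the GC columns of the ATCG file concrete, the sketch below computes GC content overall and at each codon position for a single coding sequence. It follows the usual textbook definitions and is not BCAWT's implementation; the tool may report percentages or exclude some codons, and the function names here are hypothetical.

```python
def gc_fraction(bases):
    """Fraction of G or C among the given bases (0.0 for an empty input)."""
    bases = bases.upper()
    if not bases:
        return 0.0
    return sum(base in "GC" for base in bases) / len(bases)


def gc_by_position(cds):
    """Return GC, GC1, GC2, GC3 and GC12 for a coding sequence."""
    usable = cds[: len(cds) - len(cds) % 3]  # drop a trailing partial codon
    gc1 = gc_fraction(usable[0::3])  # first codon positions
    gc2 = gc_fraction(usable[1::3])  # second codon positions
    gc3 = gc_fraction(usable[2::3])  # third codon positions
    return {
        "GC": gc_fraction(usable),
        "GC1": gc1,
        "GC2": gc2,
        "GC3": gc3,
        "GC12": (gc1 + gc2) / 2,
    }
```

For the two-codon sequence `ATGGCC`, both third positions are G or C, so GC3 is 1.0 while GC1 and GC2 are 0.5 each.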

All output plots from the BCAW tool analysis of coding sequences from Escherichia coli

Fig 1

Documentations

1- An intro to the codon usage bias >> CUB introduction

2- For more information about the equations used to analyze CUB in the BCAW tool >> Equations

3- For more information about the output >>

4- For more information about the abbreviations used >> Abbreviations table
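
As a worked illustration of one of the measures covered by the equations, the sketch below computes RSCU (relative synonymous codon usage) for a single coding sequence: the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. It uses the standard genetic code and is an independent sketch under that assumption, not the code BCAWT runs; the function name is hypothetical.

```python
from collections import Counter
from itertools import product

# Standard genetic code, with codons ordered by base T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def rscu(cds):
    """Return {codon: RSCU} for amino acids that occur in the sequence."""
    seq = cds.upper()
    counts = Counter(seq[i:i + 3] for i in range(0, len(seq) - 2, 3))

    # Group codons into synonymous families by amino acid.
    families = {}
    for codon, aa in CODON_TO_AA.items():
        families.setdefault(aa, []).append(codon)

    values = {}
    for aa, codons in families.items():
        total = sum(counts[c] for c in codons)
        # Skip stops, single-codon amino acids (Met, Trp) and unused families.
        if aa == "*" or len(codons) == 1 or total == 0:
            continue
        for c in codons:
            values[c] = counts[c] * len(codons) / total
    return values
```

A value above 1 means the codon is used more often than expected under equal usage, and the RSCU values within one amino acid family sum to the number of synonymous codons in that family.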

