biotoolkit contains GCfromDNA: Analyze GC-content of DNA sequences from fasta files

Project description

biotoolkit contains:

GCfromDNA.py


GCfromDNA is a Python module and command-line utility for determining the GC content of DNA
sequences read from standard FASTA (.fa) files.
The module formats each sequence name (replacing spaces with underscores), calculates the
percentage GC content of each sequence, and saves the results as a CSV file named after the
original file (e.g., fasta_file_1.fa --> fasta_file_1_GC.csv).
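
The three steps described above (name formatting, GC percentage, CSV output) can be illustrated with a minimal sketch. This is not the module's actual source: GCfromDNA relies on Biopython, whereas the sketch uses only the standard library to stay self-contained. The function names, the CSV header, and the two-decimal formatting are all assumptions made for illustration.

```python
import csv
from pathlib import Path


def read_fasta(path):
    """Yield (name, sequence) pairs from a FASTA file."""
    name, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # A new header closes the previous record, if any.
                if name is not None:
                    yield name, "".join(chunks)
                name, chunks = line[1:], []
            elif name is not None:
                chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def gc_percent(sequence):
    """Return the percentage of G and C bases in a sequence."""
    seq = sequence.upper()
    if not seq:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def write_gc_csv(fasta_path):
    """Write <stem>_GC.csv next to the FASTA file and return its path."""
    fasta_path = Path(fasta_path)
    out_path = fasta_path.with_name(fasta_path.stem + "_GC.csv")
    with open(out_path, "w", newline="") as out:
        writer = csv.writer(out)
        # Hypothetical header; the real module's column names may differ.
        writer.writerow(["name", "gc_percent"])
        for name, seq in read_fasta(fasta_path):
            writer.writerow([name.replace(" ", "_"), f"{gc_percent(seq):.2f}"])
    return out_path
```

Calling `write_gc_csv("fasta_file_1.fa")` in this sketch would produce `fasta_file_1_GC.csv` in the same directory, mirroring the naming scheme described above.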

GCfromDNA uses parts of the Biopython package for analysis. Biopython
can be installed with the following command:

$ pip install biopython

To install biotoolkit:

$ pip install biotoolkit

After installation, the module can be imported into Python scripts with:

from biotoolkit import GCfromDNA

For instance, the following complete Python script invokes the module's functionality:

#!/usr/bin/env python
from biotoolkit import GCfromDNA
GCfromDNA.compute()

FASTA filenames can be passed directly to the compute function.
They must be passed as a list:

GCfromDNA.compute([file1, file2, file3])  # where, for instance, file1 == 'file1.fa'
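
One way to build such a list is to collect the .fa files in a directory. The helper below is a hypothetical convenience, not part of biotoolkit; it assumes only that compute accepts a list of filename strings, as stated above.

```python
from pathlib import Path


def find_fasta_files(directory="."):
    """Return the .fa files in `directory` as a sorted list of path strings."""
    return sorted(str(path) for path in Path(directory).glob("*.fa"))
```

With biotoolkit installed, the result could then be passed along as `GCfromDNA.compute(find_fasta_files())`.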

GCfromDNA can also be used interactively. From the command line, enter:

$ python
>>> from biotoolkit import GCfromDNA
>>> GCfromDNA.compute()

The script can be invoked directly from the directory where it resides.
Optional filename arguments must be separated by spaces:

$ python GCfromDNA.py file1.fa file2.fa file3.fa

or alternatively:

$ python GCfromDNA.py *.fa

If no filename is provided, the program prompts the user to select which FASTA files in the
current directory to process.
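
The fallback from command-line arguments to an interactive prompt might look roughly like the sketch below. It is an illustration only: the function name `choose_files`, the numbered menu, and the comma-separated reply format are assumptions, and the actual prompt in GCfromDNA.py may work differently.

```python
from pathlib import Path


def choose_files(argv, directory=".", ask=input):
    """Return the FASTA files to process.

    Filenames given on the command line take priority; otherwise the .fa
    files in `directory` are listed and the user picks them by number.
    """
    if len(argv) > 1:
        return list(argv[1:])

    candidates = sorted(p.name for p in Path(directory).glob("*.fa"))
    if not candidates:
        return []
    for index, name in enumerate(candidates, start=1):
        print(f"{index}: {name}")
    # Hypothetical reply format: comma-separated numbers, blank means all.
    reply = ask("Files to process (e.g. 1,3; blank for all): ").strip()
    if not reply:
        return candidates
    picked = [int(token) for token in reply.split(",") if token.strip()]
    return [candidates[i - 1] for i in picked if 1 <= i <= len(candidates)]
```

A script would call this as `choose_files(sys.argv)` after importing `sys`. The `ask` parameter exists only so the prompt can be replaced in tests; a non-numeric reply raises `ValueError` in this sketch.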
