A comprehensive bioinformatics package with 21 essential programs
NGD - Bioinformatics Programs Package
A comprehensive Python package containing 21 essential bioinformatics programs for DNA/RNA analysis, sequence manipulation, and protein structure visualization.
Installation
You can install the package using pip:
pip install ngd
Or install from source:
git clone https://github.com/yourusername/ngd.git
cd ngd
pip install -e .
Available Programs
The package defines 21 program slots: programs 1-10 are implemented, and programs 11-21 are placeholders reserved for future tools.
1. DNA Manipulation and Translation
- DNA sequence slicing and concatenation
- DNA to RNA transcription
- RNA to protein translation
- Basic sequence operations using Biopython
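The operations listed above can be illustrated without any dependencies. The following is a minimal sketch, not the code that program 1 prints: the package itself relies on Biopython, while the `transcribe` and `translate` helpers and the deliberately partial `CODON_TABLE` here are hypothetical names introduced only for illustration.

```python
# Partial codon table: only the codons used in the example below.
# A real translation needs all 64 codons (Biopython ships complete tables).
CODON_TABLE = {
    "AUG": "M", "GCC": "A", "AUU": "I", "GUA": "V",
    "UAA": "*", "UAG": "*", "UGA": "*",
}


def transcribe(dna: str) -> str:
    """Transcribe a coding-strand DNA sequence into mRNA (T becomes U)."""
    return dna.upper().replace("T", "U")


def translate(rna: str) -> str:
    """Translate mRNA codon by codon, stopping at the first stop codon."""
    protein = []
    # Ignore a trailing partial codon, if any.
    for i in range(0, len(rna) - len(rna) % 3, 3):
        amino_acid = CODON_TABLE[rna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


dna = "ATGGCC" + "ATTGTA" + "TAA"  # concatenation
first_codon = dna[:3]              # slicing
rna = transcribe(dna)
protein = translate(rna)
print(first_codon, rna, protein)
```

With Biopython installed, the same steps are usually written with `Bio.Seq.Seq` and its `transcribe()` and `translate()` methods.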
2. Reading FASTA Files
- Parse and read FASTA format files
- Extract sequence descriptions and data
- Handle multiple sequences in a single file
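To show what reading a multi-record FASTA file involves, here is a small dependency-free sketch. It is an illustration of the format, not the package's implementation, which presumably uses Biopython's `SeqIO`; the `read_fasta` function name and the sample records are made up for this example.

```python
import io


def read_fasta(handle):
    """Yield (description, sequence) pairs from a FASTA-formatted text handle."""
    description, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if description is not None:
                yield description, "".join(chunks)
            description, chunks = line[1:], []
        elif description is not None:
            # Sequence data may be wrapped over several lines.
            chunks.append(line)
    if description is not None:
        yield description, "".join(chunks)


example = io.StringIO(
    ">seq1 first example\nATGC\nGGTA\n>seq2 second example\nTTAACC\n"
)
records = list(read_fasta(example))
for description, sequence in records:
    print(description, len(sequence))
```

Passing an open file instead of the `io.StringIO` object works the same way, because the function only iterates over lines.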
3. Writing and Reading GenBank Format
- Create GenBank records with annotations
- Write sequences to GenBank format
- Read and parse GenBank files
- Handle sequence metadata
4. Converting FASTA to GenBank with Annotations
- Convert between sequence formats
- Add annotations during conversion
- Batch processing of multiple sequences
- Format validation
5. Adding Features and Annotations to SeqRecord
- Create detailed sequence records
- Add gene features and annotations
- Modify sequence metadata
- Handle complex biological annotations
6. Fetching Sequences from NCBI using Entrez
- Access NCBI databases programmatically
- Retrieve GenBank records by ID
- Extract sequence information
- Handle NCBI API responses
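Retrieving a GenBank record by ID comes down to a request against the NCBI E-utilities `efetch` endpoint. The sketch below only builds the request URL with the standard library, so it runs offline; the package itself likely goes through Biopython's `Bio.Entrez`. The `build_efetch_url` helper, the accession, and the e-mail address are placeholders chosen for illustration.

```python
from urllib.parse import urlencode

EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(accession: str, email: str) -> str:
    """Build an efetch URL that requests a GenBank flat file as plain text."""
    params = {
        "db": "nucleotide",   # database to query
        "id": accession,      # record identifier
        "rettype": "gb",      # GenBank flat-file format
        "retmode": "text",
        "email": email,       # NCBI asks clients to identify themselves
    }
    return f"{EFETCH_URL}?{urlencode(params)}"


url = build_efetch_url("NM_000546", "you@example.org")
print(url)

# To actually download the record (requires network access):
# from urllib.request import urlopen
# with urlopen(url) as response:
#     genbank_text = response.read().decode("utf-8")
```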
7. Pairwise Sequence Alignment
- Align two DNA sequences
- Calculate alignment scores
- Visualize sequence similarities
- Handle alignment parameters
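The role of the alignment parameters is easiest to see in a small dynamic-programming example. This sketch computes a global alignment score with the Needleman-Wunsch recurrence; it is not the package's code, and the function name and default scores (`match=1`, `mismatch=-1`, `gap=-2`) are assumptions picked for the example.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Return the optimal global alignment score (Needleman-Wunsch)."""
    # Row 0: aligning a prefix of b against nothing costs one gap per base.
    previous = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        current = [i * gap]
        for j in range(1, len(b) + 1):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            current.append(max(
                previous[j - 1] + substitution,  # align a[i-1] with b[j-1]
                previous[j] + gap,               # gap in b
                current[j - 1] + gap,            # gap in a
            ))
        previous = current
    return previous[-1]


identical = global_alignment_score("ACGT", "ACGT")
one_deletion = global_alignment_score("ACGT", "AGT")
print(identical, one_deletion)
```

Only two rows of the scoring matrix are kept, which is enough for the score; recovering the alignment itself would require the full matrix and a traceback.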
8. Multiple Sequence Alignment using MUSCLE
- Perform multiple sequence alignment
- Use external MUSCLE tool
- Handle alignment output
- Process alignment results
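Because MUSCLE is an external executable, it has to be installed separately and found on `PATH`. The sketch below builds the command line for the two common major versions, whose flags differ; which version the package expects is not stated here, so the `muscle_command` helper, its default, and the file names are assumptions.

```python
import shutil


def muscle_command(input_fasta, output_fasta, major_version=5):
    """Build the MUSCLE command line for the given major version."""
    if major_version >= 5:
        # MUSCLE 5 syntax
        return ["muscle", "-align", input_fasta, "-output", output_fasta]
    # MUSCLE 3.x syntax
    return ["muscle", "-in", input_fasta, "-out", output_fasta]


command = muscle_command("sequences.fasta", "aligned.fasta")
if shutil.which("muscle") is None:
    print("MUSCLE not found on PATH; would run:", " ".join(command))
else:
    print("Ready to run:", " ".join(command))
    # import subprocess
    # subprocess.run(command, check=True)
```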
9. Constructing Phylogenetic Trees
- Build phylogenetic trees from alignments
- Calculate distance matrices
- Create UPGMA trees
- Visualize and save tree structures
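As an illustration of how a distance matrix leads to a UPGMA tree, here is a compact pure-Python sketch. It is not the package's implementation (which presumably uses Biopython's `Bio.Phylo`), it returns the topology only without branch lengths, and the function names and the four toy sequences are invented for the example.

```python
from itertools import combinations


def p_distance(a, b):
    """Fraction of aligned positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def upgma(sequences):
    """Cluster aligned sequences with UPGMA; return the topology as nested tuples."""
    sizes = {name: 1 for name in sequences}  # cluster -> number of leaves
    dist = {
        frozenset((a, b)): p_distance(sequences[a], sequences[b])
        for a, b in combinations(sequences, 2)
    }
    while len(sizes) > 1:
        # Merge the closest pair of clusters.
        pair = min(dist, key=dist.get)
        a, b = sorted(pair, key=str)
        merged = (a, b)
        for other in sizes:
            if other in pair:
                continue
            # Size-weighted mean of the distances to the two merged clusters.
            dist[frozenset((merged, other))] = (
                dist[frozenset((a, other))] * sizes[a]
                + dist[frozenset((b, other))] * sizes[b]
            ) / (sizes[a] + sizes[b])
        # Drop every distance that still refers to the old clusters.
        dist = {k: v for k, v in dist.items() if a not in k and b not in k}
        sizes[merged] = sizes.pop(a) + sizes.pop(b)
    return next(iter(sizes))


aligned = {
    "A": "ACGTACGT",
    "B": "ACGTACGA",
    "C": "TTGTACCA",
    "D": "TTGAACCT",
}
tree = upgma(aligned)
print(tree)
```

The nested tuples map directly onto Newick notation, so the result can be written out as a string for tree-drawing tools.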
10. PDB 3D Structure Visualization
- Download protein structures from PDB
- Parse mmCIF files
- Extract atomic coordinates
- Create 3D structure visualizations
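Extracting atomic coordinates from an mmCIF file means reading the `_atom_site` loop. The sketch below handles only a simplified, whitespace-separated snippet and is not a general mmCIF parser; the package itself presumably uses Biopython's `Bio.PDB`. The helper names, the tiny inline record, and the example PDB ID are assumptions made for illustration.

```python
def pdb_cif_url(pdb_id: str) -> str:
    """Return the RCSB download URL for a structure in mmCIF format."""
    return f"https://files.rcsb.org/download/{pdb_id.upper()}.cif"


def atom_coordinates(cif_text):
    """Extract (atom name, x, y, z) tuples from a simple _atom_site loop."""
    columns, atoms, in_atom_site = [], [], False
    for line in cif_text.splitlines():
        line = line.strip()
        if line.startswith("_atom_site."):
            # Column names are declared one per line before the data rows.
            columns.append(line.split(".", 1)[1])
            in_atom_site = True
        elif in_atom_site and line and not line.startswith(("#", "loop_", "_")):
            row = dict(zip(columns, line.split()))
            atoms.append((
                row["label_atom_id"],
                float(row["Cartn_x"]),
                float(row["Cartn_y"]),
                float(row["Cartn_z"]),
            ))
        elif in_atom_site and atoms:
            break  # the atom_site loop has ended
    return atoms


SNIPPET = """\
data_demo
loop_
_atom_site.group_PDB
_atom_site.id
_atom_site.label_atom_id
_atom_site.Cartn_x
_atom_site.Cartn_y
_atom_site.Cartn_z
ATOM 1 N 10.000 20.500 30.250
ATOM 2 CA 11.200 21.000 29.900
#
"""

atoms = atom_coordinates(SNIPPET)
print(pdb_cif_url("1crn"))
print(atoms)
```

The resulting coordinate tuples can be passed to a matplotlib 3D scatter plot to get a rough view of the structure.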
11-21. Additional Bioinformatics Programs
- Placeholder programs reserved for additional bioinformatics tools
- Extensible framework ready for custom implementations
Usage
To view the code for any program, use the print_program function:
from ngd.programs import print_program
# Print program 1 (DNA Manipulation)
print_program(1)
# Print program 2 (FASTA Reading)
print_program(2)
# Print program 3 (GenBank Operations)
print_program(3)
# And so on for programs 4-21...
Requirements
The package requires the following dependencies:
- biopython >= 1.79 - Core bioinformatics functionality
- matplotlib >= 3.5.0 - Plotting and visualization
- numpy >= 1.21.0 - Numerical computations
- pandas >= 1.3.0 - Data manipulation
- requests >= 2.25.0 - HTTP requests for NCBI access
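Before running the programs, it can help to check which of these dependencies are already installed. The following sketch uses only the standard library (`importlib.metadata`, Python 3.8+); the `installed_versions` helper is a hypothetical name, and the script reports versions without comparing them against the minimums listed above.

```python
from importlib.metadata import PackageNotFoundError, version

REQUIRED = ["biopython", "matplotlib", "numpy", "pandas", "requests"]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


report = installed_versions(REQUIRED)
for name, installed in report.items():
    print(f"{name}: {installed or 'not installed'}")
```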
Features
- Comprehensive Coverage: From basic DNA manipulation to advanced phylogenetic analysis
- Educational: Well-commented code suitable for learning bioinformatics
- Practical: Real-world applications using popular bioinformatics tools
- Extensible: Easy to modify and extend for specific research needs
- Cross-platform: Works on Windows, macOS, and Linux
- Expandable: Programs 11-21 are placeholder slots reserved for new tools
Examples
Basic DNA Operations
from ngd.programs import print_program
print_program(1) # DNA manipulation and translation
Sequence Analysis
from ngd.programs import print_program
print_program(7) # Pairwise sequence alignment
Database Access
from ngd.programs import print_program
print_program(6) # NCBI sequence fetching
Placeholder Programs
from ngd.programs import print_program
print_program(11) # Program 11 placeholder
print_program(21) # Program 21 placeholder
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
License
This project is licensed under the MIT License - see the LICENSE file for details.
Citation
If you use this package in your research, please cite:
NGD Bioinformatics Package (2024). A comprehensive collection of bioinformatics programs.
Available at: https://pypi.org/project/ngd/
File details
Details for the file ngd-0.3.0.tar.gz.
File metadata
- Download URL: ngd-0.3.0.tar.gz
- Upload date:
- Size: 18.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.11.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 96c7708da66c68eee26f6adb2abc378db0a7b80f829642b413a377f7b68dc317 |
| MD5 | c1c7bdb50aeda22948fb2dbac4334d95 |
| BLAKE2b-256 | 51426f410566a530846e41e1c6680ded34226952e39f38434740792a1afb8057 |
File details
Details for the file ngd-0.3.0-py3-none-any.whl.
File metadata
- Download URL: ngd-0.3.0-py3-none-any.whl
- Upload date:
- Size: 16.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.11.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 2218da7750589c8fe9ced3954f0bce83c8c824dc9b1e550993e24e560c4cb31e |
| MD5 | 471cf0e20faaee7c757b815b52e6ca97 |
| BLAKE2b-256 | f731b82a72e7d6b133c12081d76227513b351b2c82a4221d22987f99f2b8131a |