TrainSense

A package to analyze model architectures and optimize model training.
TrainSense is a Python package for in-depth analysis of deep learning model architectures, hyperparameter optimization, and system diagnostics. It automatically detects common layer types (e.g., LSTM, RNN, CNN, Transformer), provides detailed hyperparameter recommendations, profiles model performance, and reports system information (including CUDA and cuDNN versions).
Features

- Architecture Analysis: detects the number of parameters, counts layers, identifies layer types, and infers the overall architecture (e.g., CNN, LSTM, Transformer).
- Hyperparameter Evaluation & Optimization: checks batch size, learning rate, and epochs against your system configuration and model complexity, recommends adjustments, and can apply them automatically.
- Model Profiling: benchmarks the model by measuring average inference time, throughput, and (where applicable) GPU memory usage.
- System Diagnostics: retrieves detailed system information, including CPU count, total memory, GPU details, CUDA and cuDNN versions, OS information, and real-time usage statistics.
- Enhanced Logging: provides enriched, timestamped logs of all analysis steps for better traceability and debugging.
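As a rough illustration of the architecture-inference idea behind the first feature, the sketch below maps layer class names to a coarse architecture label. The function `infer_architecture` and its rules are assumptions for illustration only, not TrainSense's actual implementation:

```python
from collections import Counter

# Hypothetical sketch: infer a coarse architecture label from the class
# names of a model's layers, similar in spirit to what an architecture
# analyzer might do. Rules and names here are illustrative assumptions.
def infer_architecture(layer_types):
    counts = Counter(layer_types)
    if counts["LSTM"] or counts["GRU"] or counts["RNN"]:
        return "RNN"
    if counts["MultiheadAttention"]:
        return "Transformer"
    if counts["Conv2d"]:
        return "CNN"
    return "MLP"

print(infer_architecture(["Conv2d", "ReLU", "MaxPool2d", "Linear"]))  # CNN
```

In practice a real analyzer would walk the module tree (e.g., PyTorch's `model.named_modules()`) rather than take a precomputed list of names.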
Installation

Prerequisites

Python 3 and PyTorch (torch) are required, since TrainSense analyzes PyTorch models directly.

Install via PyPI

```
pip install trainsense
```

Install in Development Mode

Clone the repository, navigate to the root directory (which contains setup.py), and run:

```
pip install -e .
```
How It Works

TrainSense is composed of several modules that work together to provide a complete analysis of your model and system:

- System Configuration & Diagnostics
  - SystemConfig: retrieves hardware details such as CPU count, total memory, GPU info, CUDA and cuDNN versions, and OS details.
  - SystemDiagnostics: provides real-time usage statistics such as CPU usage, memory usage, disk usage, and uptime.
- Architecture Analysis
  - ArchitectureAnalyzer: inspects your model to count parameters and layers and to detect layer types. It also infers the model architecture (e.g., CNN, LSTM) and provides recommendations based on complexity.
- Hyperparameter Analysis
  - TrainingAnalyzer: evaluates your hyperparameters (batch size, learning rate, epochs) in light of your system and model architecture, provides detailed recommendations, and can automatically suggest adjustments.
- Model Profiling & Optimization
  - ModelProfiler: benchmarks the model to measure average inference time and throughput.
  - OptimizerHelper & UltraOptimizer: suggest which optimizer to use and compute optimal hyperparameters based on training data size, model complexity, and system resources.
- Deep Analysis
  - DeepAnalyzer: combines results from all modules to generate a comprehensive report with overall recommendations and key performance metrics.
- Logging
  - TrainLogger: captures detailed, timestamped logs, making each step of the analysis easier to trace and debug.
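The profiling step described above reduces to timing repeated forward passes and dividing batch size by average latency. Below is a minimal, framework-free sketch of that measurement; `profile` and `model_fn` are hypothetical names for illustration, not the ModelProfiler API:

```python
import time

# Illustrative sketch of what a model profiler measures: average latency
# per batch and throughput (samples per second). The "model" here is a
# stand-in callable so the example runs without any ML framework.
def profile(model_fn, batch, iterations=50):
    start = time.perf_counter()
    for _ in range(iterations):
        model_fn(batch)
    elapsed = time.perf_counter() - start
    avg_latency = elapsed / iterations          # seconds per batch
    throughput = len(batch) / avg_latency       # samples per second
    return {"avg_inference_time_s": avg_latency, "throughput": throughput}

stats = profile(lambda b: [x * 2 for x in b], batch=list(range(64)))
print(stats)
```

A real profiler would additionally synchronize the GPU before and after timing (e.g., `torch.cuda.synchronize()`) and record peak memory usage.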
Usage Example
Below is a complete example demonstrating how to integrate TrainSense into your deep learning workflow.
```python
import torch
import torch.nn as nn

from TrainSense.system_config import SystemConfig
from TrainSense.system_diagnostics import SystemDiagnostics
from TrainSense.analyzer import TrainingAnalyzer
from TrainSense.arch_analyzer import ArchitectureAnalyzer
from TrainSense.deep_analyzer import DeepAnalyzer
from TrainSense.logger import TrainLogger
from TrainSense.model_profiler import ModelProfiler
from TrainSense.optimizer import OptimizerHelper
from TrainSense.ultra_optimizer import UltraOptimizer
from TrainSense.gpu_monitor import GPUMonitor
from TrainSense.utils import print_section


def main():
    # Retrieve system configuration and diagnostics
    sys_config = SystemConfig()
    sys_diag = SystemDiagnostics()

    # Define initial hyperparameters
    batch_size = 64
    learning_rate = 0.05
    epochs = 30

    # Create a sample CNN model (for image classification)
    model = nn.Sequential(
        nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=1),
        nn.ReLU(),
        nn.MaxPool2d(2),
        nn.Conv2d(64, 128, kernel_size=3, stride=1, padding=1),
        nn.ReLU(),
        nn.MaxPool2d(2),
        nn.Flatten(),
        nn.Linear(128 * 8 * 8, 10)
    )

    # Analyze the model architecture
    arch_analyzer = ArchitectureAnalyzer(model)
    arch_info = arch_analyzer.analyze()

    # Analyze the hyperparameters based on system config and architecture
    analyzer = TrainingAnalyzer(batch_size, learning_rate, epochs,
                                system_config=sys_config, arch_info=arch_info)

    # Profile the model performance
    profiler = ModelProfiler(model, device="cpu")

    # Ultra-optimize hyperparameters based on training data stats and system resources
    ultra_opt = UltraOptimizer({"data_size": 2000000}, arch_info,
                               {"total_memory_gb": sys_config.total_memory})

    # Combine all analyses into a deep report
    deep_analyzer = DeepAnalyzer(analyzer, arch_analyzer, profiler, sys_diag)

    # Display configuration summary
    print_section("Configuration Summary")
    summary = analyzer.summary()
    for k, v in summary.items():
        print(f"{k}: {v}")

    # Display detailed hyperparameter recommendations
    print_section("Hyperparameter Recommendations")
    recommendations = analyzer.check_hyperparams()
    for r in recommendations:
        print(r)

    # Show automatic adjustment suggestions
    print_section("Proposed Automatic Adjustments")
    adjustments = analyzer.auto_adjust()
    for k, v in adjustments.items():
        print(f"{k}: {v}")

    # Log the start of the complete analysis
    logger = TrainLogger(log_file="logs/trainsense.log")
    logger.log_info("Starting complete and detailed analysis.")

    # Suggest an optimizer based on model complexity
    opt_adv = OptimizerHelper.suggest_optimizer(arch_info.get("total_parameters", 0),
                                                arch_info.get("layer_count", 0))
    print_section("Basic Optimizer Recommendation")
    print("Recommended Optimizer:", opt_adv)
    logger.log_info(f"Suggested Optimizer: {opt_adv}")

    # Compute ultra-optimized hyperparameters
    ultra_params = ultra_opt.compute_optimal_hyperparams()
    print_section("Ultra Optimized Hyperparameters")
    for k, v in ultra_params.items():
        print(f"{k}: {v}")

    # Display GPU status (if available)
    try:
        gpu_monitor = GPUMonitor()
        gpu_status = gpu_monitor.get_gpu_status()
        print_section("GPU Status")
        for gpu in gpu_status:
            print(gpu)
    except ImportError:
        print("GPUtil not installed. GPU status unavailable.")

    # Generate a comprehensive deep analysis report
    report = deep_analyzer.comprehensive_report()
    print_section("Comprehensive Deep Analysis Report")
    for key, value in report.items():
        print(f"{key}: {value}")

    # Adjust the learning rate based on measured throughput
    new_lr, tune_msg = OptimizerHelper.adjust_learning_rate(learning_rate,
                                                            report["profiling"]["throughput"])
    print_section("Learning Rate Adjustment Based on Performance")
    print("New Learning Rate:", new_lr, "-", tune_msg)


if __name__ == "__main__":
    main()
```
Explanation

- Configuration and Diagnostics: the SystemConfig and SystemDiagnostics modules collect your hardware and system usage data, including details about GPUs, CUDA/cuDNN versions, and the OS.
- Model Architecture Analysis: the ArchitectureAnalyzer inspects your model to count parameters and layers and to detect specific layer types. It infers the overall architecture (e.g., CNN, LSTM) and provides tailored recommendations.
- Hyperparameter Analysis: TrainingAnalyzer uses system and architecture information to verify that your chosen batch size, learning rate, and epochs are appropriate. It offers detailed recommendations and can automatically suggest adjustments.
- Performance Profiling: the ModelProfiler benchmarks your model by measuring inference speed and throughput, which helps in fine-tuning hyperparameters further.
- Advanced Optimization: OptimizerHelper and UltraOptimizer recommend an optimizer and compute optimal hyperparameters based on your model and system statistics.
- Deep Analysis Report: DeepAnalyzer compiles all of the above into a comprehensive report with overall recommendations and performance metrics.
- Logging: the TrainLogger writes detailed, timestamped logs of each step, which is useful for debugging and tracking changes.
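The final step of the example adjusts the learning rate from measured throughput. A minimal sketch of such a rule follows; the thresholds and scaling factors are illustrative assumptions, not OptimizerHelper's real logic:

```python
# Hypothetical throughput-based learning-rate adjustment, sketched in the
# spirit of OptimizerHelper.adjust_learning_rate. The cutoffs (50 and 500
# samples/s) and the scale factors are assumptions for illustration.
def adjust_learning_rate(lr, throughput, low=50.0, high=500.0):
    if throughput < low:
        return lr * 0.5, "Low throughput: halving LR to stabilize training."
    if throughput > high:
        return lr * 1.2, "High throughput: a modest LR increase may speed convergence."
    return lr, "Throughput in normal range: LR unchanged."

print(adjust_learning_rate(0.05, throughput=30.0))
```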
Integration
To integrate TrainSense into your project, simply install it via PyPI or in development mode. Then import the modules you need and incorporate them into your training pipeline. Use the example above as a guide to generate reports, adjust your hyperparameters, and monitor your system’s performance throughout model training.
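If you only need quick sanity checks before wiring in the full pipeline, a lightweight stand-in for the kind of rules a hyperparameter analyzer applies might look like the sketch below. The specific rules and thresholds are assumptions for illustration, not TrainSense's actual checks:

```python
# Illustrative hyperparameter sanity checks; every rule below is an
# assumption chosen for the example, not TrainSense's real logic.
def check_hyperparams(batch_size, learning_rate, epochs):
    notes = []
    if batch_size & (batch_size - 1):  # non-zero when not a power of two
        notes.append("Batch size is not a power of two; consider 32/64/128.")
    if not (1e-5 <= learning_rate <= 1e-1):
        notes.append("Learning rate outside the usual 1e-5..1e-1 range.")
    if epochs > 100:
        notes.append("Very high epoch count; consider early stopping.")
    return notes or ["Hyperparameters look reasonable."]

for note in check_hyperparams(64, 0.05, 30):
    print(note)
```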
Contributing
Contributions are welcome! Please fork the repository, create your feature branch, commit your changes, and submit a pull request.
License
This project is licensed under the MIT License. See the LICENSE file for details.
Project details

The current release is trainsense 0.0.1, published on PyPI as a source distribution (trainsense-0.0.1.tar.gz) and a Python 3 wheel (trainsense-0.0.1-py3-none-any.whl).