Skip to main content

A package to analyze architecture and optimize model training

Project description

TrainSense

TrainSense is an ultra robust Python package for deep analysis of deep learning model architectures, advanced hyperparameter optimization, and comprehensive system diagnostics. It can automatically detect various layer types (e.g., LSTM, RNN, CNN, Transformer, GPT, etc.), provide detailed hyperparameter recommendations, profile model performance, and retrieve system information (including CUDA and cuDNN versions).

Features

  • Architecture Analysis
    Detects the number of parameters, counts layers, identifies layer types, and infers the overall architecture (e.g., CNN, LSTM, Transformer).

  • Hyperparameter Evaluation & Optimization
    Checks and recommends adjustments for batch size, learning rate, and epochs based on your system configuration and model complexity. It even provides automatic adjustments.

  • Model Profiling
    Benchmarks the model by measuring average inference time, throughput, and (if applicable) GPU memory usage.

  • System Diagnostics
    Retrieves detailed system information including CPU count, total memory, GPU details, CUDA version, cuDNN version, OS information, and real-time usage statistics.

  • Enhanced Logging
    Provides enriched, timestamped logs of all analysis steps for better traceability and debugging.

Installation

Prerequisites

  • Python 3.7 or newer
  • PyTorch (install via pip or conda)
  • psutil
  • GPUtil (optional, for GPU monitoring)

Install via PyPI

pip install trainsense

Install in Development Mode

Clone the repository, navigate to the root directory (which contains setup.py), and run:

pip install -e .

How It Works

TrainSense is composed of several modules that work together to provide a complete analysis of your model and system:

  1. System Configuration & Diagnostics

    • SystemConfig: Retrieves hardware details such as CPU count, total memory, GPU info, CUDA and cuDNN versions, and OS details.
    • SystemDiagnostics: Provides real-time usage statistics like CPU usage, memory usage, disk usage, and uptime.
  2. Architecture Analysis

    • ArchitectureAnalyzer: Inspects your model to count parameters, layers, and detect layer types. It also infers the model architecture (e.g., CNN, LSTM) and provides recommendations based on complexity.
  3. Hyperparameter Analysis

    • TrainingAnalyzer: Evaluates your hyperparameters (batch size, learning rate, epochs) in light of your system and model architecture. It provides detailed recommendations and can automatically suggest adjustments.
  4. Model Profiling & Optimization

    • ModelProfiler: Benchmarks the model to measure average inference time and throughput.
    • OptimizerHelper & UltraOptimizer: Offer suggestions on which optimizer to use and compute optimal hyperparameters based on your training data size, model complexity, and system resources.
  5. Deep Analysis

    • DeepAnalyzer: Combines results from all modules to generate a comprehensive report with overall recommendations and key performance metrics.
  6. Logging

    • TrainLogger: Captures detailed logs with timestamps, making it easier to trace and debug each step of the analysis.

Usage Example

Below is a complete example demonstrating how to integrate TrainSense into your deep learning workflow.

import torch
import torch.nn as nn
from TrainSense.system_config import SystemConfig
from TrainSense.system_diagnostics import SystemDiagnostics
from TrainSense.analyzer import TrainingAnalyzer
from TrainSense.arch_analyzer import ArchitectureAnalyzer
from TrainSense.deep_analyzer import DeepAnalyzer
from TrainSense.logger import TrainLogger
from TrainSense.model_profiler import ModelProfiler
from TrainSense.optimizer import OptimizerHelper
from TrainSense.ultra_optimizer import UltraOptimizer
from TrainSense.gpu_monitor import GPUMonitor
from TrainSense.utils import print_section

def main():
    # Retrieve system configuration and diagnostics
    sys_config = SystemConfig()
    sys_diag = SystemDiagnostics()

    # Define initial hyperparameters
    batch_size = 64
    learning_rate = 0.05
    epochs = 30

    # Create a sample CNN model (for image classification)
    model = nn.Sequential(
        nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=1),
        nn.ReLU(),
        nn.MaxPool2d(2),
        nn.Conv2d(64, 128, kernel_size=3, stride=1, padding=1),
        nn.ReLU(),
        nn.MaxPool2d(2),
        nn.Flatten(),
        nn.Linear(128 * 8 * 8, 10)
    )

    # Analyze the model architecture
    arch_analyzer = ArchitectureAnalyzer(model)
    arch_info = arch_analyzer.analyze()

    # Analyze the hyperparameters based on system config and architecture
    analyzer = TrainingAnalyzer(batch_size, learning_rate, epochs, system_config=sys_config, arch_info=arch_info)

    # Profile the model performance
    profiler = ModelProfiler(model, device="cpu")

    # Ultra-optimize hyperparameters based on training data stats and system resources
    ultra_opt = UltraOptimizer({"data_size": 2000000}, arch_info, {"total_memory_gb": sys_config.total_memory})

    # Combine all analyses into a deep report
    deep_analyzer = DeepAnalyzer(analyzer, arch_analyzer, profiler, sys_diag)

    # Display configuration summary
    print_section("Configuration Summary")
    summary = analyzer.summary()
    for k, v in summary.items():
        print(f"{k}: {v}")

    # Display detailed hyperparameter recommendations
    print_section("Hyperparameter Recommendations")
    recommendations = analyzer.check_hyperparams()
    for r in recommendations:
        print(r)

    # Show automatic adjustments suggestions
    print_section("Proposed Automatic Adjustments")
    adjustments = analyzer.auto_adjust()
    for k, v in adjustments.items():
        print(f"{k}: {v}")

    # Log the start of the complete analysis
    logger = TrainLogger(log_file="logs/trainsense.log")
    logger.log_info("Starting complete and detailed analysis.")

    # Suggest an optimizer based on model complexity
    opt_adv = OptimizerHelper.suggest_optimizer(arch_info.get("total_parameters", 0), arch_info.get("layer_count", 0))
    print_section("Basic Optimizer Recommendation")
    print("Recommended Optimizer:", opt_adv)
    logger.log_info(f"Suggested Optimizer: {opt_adv}")

    # Compute ultra-optimized hyperparameters
    ultra_params = ultra_opt.compute_optimal_hyperparams()
    print_section("Ultra Optimized Hyperparameters")
    for k, v in ultra_params.items():
        print(f"{k}: {v}")

    # Display GPU status (if available)
    try:
        gpu_monitor = GPUMonitor()
        gpu_status = gpu_monitor.get_gpu_status()
        print_section("GPU Status")
        for gpu in gpu_status:
            print(gpu)
    except ImportError:
        print("GPUtil not installed. GPU status unavailable.")

    # Generate a comprehensive deep analysis report
    report = deep_analyzer.comprehensive_report()
    print_section("Comprehensive Deep Analysis Report")
    for key, value in report.items():
        print(f"{key}: {value}")

    # Adjust learning rate based on performance throughput
    new_lr, tune_msg = OptimizerHelper.adjust_learning_rate(learning_rate, report["profiling"]["throughput"])
    print_section("Learning Rate Adjustment Based on Performance")
    print("New Learning Rate:", new_lr, "-", tune_msg)

if __name__ == "__main__":
    main()

Explanation

  1. Configuration and Diagnostics:
    The SystemConfig and SystemDiagnostics modules collect your hardware and system usage data, including details about GPUs, CUDA/cuDNN versions, and OS information.

  2. Model Architecture Analysis:
    The ArchitectureAnalyzer inspects your model to count parameters, layers, and detect specific layer types. It infers the overall architecture (e.g., CNN, LSTM) and provides tailored recommendations.

  3. Hyperparameter Analysis:
    TrainingAnalyzer uses system and architecture info to verify that your chosen batch size, learning rate, and epochs are appropriate. It offers detailed recommendations and can automatically suggest adjustments.

  4. Performance Profiling:
    The ModelProfiler benchmarks your model by measuring inference speed and throughput. This helps in fine-tuning hyperparameters further.

  5. Advanced Optimization:
    OptimizerHelper and UltraOptimizer provide further recommendations on optimizer choice and compute optimal hyperparameters based on your model and system stats.

  6. Deep Analysis Report:
    DeepAnalyzer compiles all the above information into a comprehensive report with overall recommendations and performance metrics.

  7. Logging:
    The TrainLogger writes detailed, timestamped logs of each step, which is useful for debugging and tracking changes.

Integration

To integrate TrainSense into your project, simply install it via PyPI or in development mode. Then import the modules you need and incorporate them into your training pipeline. Use the example above as a guide to generate reports, adjust your hyperparameters, and monitor your system’s performance throughout model training.

Contributing

Contributions are welcome! Please fork the repository, create your feature branch, commit your changes, and submit a pull request.

License

This project is licensed under the MIT License. See the LICENSE file for details.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

trainsense-0.0.1.tar.gz (12.7 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

trainsense-0.0.1-py3-none-any.whl (12.7 kB view details)

Uploaded Python 3

File details

Details for the file trainsense-0.0.1.tar.gz.

File metadata

  • Download URL: trainsense-0.0.1.tar.gz
  • Upload date:
  • Size: 12.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.4

File hashes

Hashes for trainsense-0.0.1.tar.gz
Algorithm Hash digest
SHA256 f9842d6fcdf69aacd74f3d2fb6b09ca50d2e1aa790338f03371177a3eb3b9d68
MD5 0d81fd1131e5d2cf2cc33ab4bffbd616
BLAKE2b-256 73c9785cc5a730d7390bb13897c4672856c58338da0c6f2f4786a8a86f516ee8

See more details on using hashes here.

File details

Details for the file trainsense-0.0.1-py3-none-any.whl.

File metadata

  • Download URL: trainsense-0.0.1-py3-none-any.whl
  • Upload date:
  • Size: 12.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.4

File hashes

Hashes for trainsense-0.0.1-py3-none-any.whl
Algorithm Hash digest
SHA256 f0026b34bc408ba76ca989f5e9d25e7be55316090522d6689d1b6fd9b5490351
MD5 784d6671e71716630a19e788c14d8e91
BLAKE2b-256 89b42c66c6b06c757e51cd3119278bc64e6bb3b85c8324644347d657db7ceb00

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page