

TrainSense

TrainSense is a Python package for in-depth analysis of deep learning model architectures, hyperparameter optimization, and system diagnostics. It automatically detects common layer types (e.g., LSTM, RNN, CNN, Transformer, GPT-style blocks), provides detailed hyperparameter recommendations, profiles model performance, and retrieves system information (including CUDA and cuDNN versions).

Features

  • Architecture Analysis
    Detects the number of parameters, counts layers, identifies layer types, and infers the overall architecture (e.g., CNN, LSTM, Transformer).

  • Hyperparameter Evaluation & Optimization
    Checks batch size, learning rate, and epoch count against your system configuration and model complexity, recommends adjustments, and can propose them automatically.

  • Model Profiling
    Benchmarks the model by measuring average inference time, throughput, and (if applicable) GPU memory usage.

  • System Diagnostics
    Retrieves detailed system information including CPU count, total memory, GPU details, CUDA version, cuDNN version, OS information, and real-time usage statistics.

  • Enhanced Logging
    Provides enriched, timestamped logs of all analysis steps for better traceability and debugging.
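As a rough illustration of the kind of data the System Diagnostics feature gathers, here is a minimal snapshot built with only the Python standard library. This is not TrainSense's implementation (which also uses psutil and, optionally, GPUtil); it just shows the shape of the information involved.

```python
import os
import platform
import shutil

def basic_system_snapshot():
    """Collect a minimal system snapshot using only the standard library."""
    total, used, free = shutil.disk_usage("/")
    return {
        "cpu_count": os.cpu_count(),                  # logical CPU count
        "os": platform.system(),                      # e.g. "Linux", "Windows"
        "python_version": platform.python_version(),
        "disk_free_gb": round(free / 1024**3, 1),
    }

snapshot = basic_system_snapshot()
for key, value in snapshot.items():
    print(f"{key}: {value}")
```

TrainSense's SystemConfig and SystemDiagnostics return richer dictionaries along these lines, extended with memory, GPU, CUDA, and cuDNN details.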

Installation

Prerequisites

  • Python 3.7 or newer
  • PyTorch (install via pip or conda)
  • psutil
  • GPUtil (optional, for GPU monitoring)

Install via PyPI

pip install TrainSense

Install in Development Mode

Clone the repository, navigate to the root directory (which contains setup.py), and run:

pip install -e .

How It Works

TrainSense is composed of several modules that work together to provide a complete analysis of your model and system:

  1. System Configuration & Diagnostics
    • SystemConfig: Retrieves hardware details such as CPU count, total memory, GPU info, CUDA and cuDNN versions, and OS details.
    • SystemDiagnostics: Provides real-time usage statistics like CPU usage, memory usage, disk usage, and uptime.

  2. Architecture Analysis
    • ArchitectureAnalyzer: Inspects your model to count parameters, layers, and detect layer types. It also infers the model architecture (e.g., CNN, LSTM) and provides recommendations based on complexity.

  3. Hyperparameter Analysis
    • TrainingAnalyzer: Evaluates your hyperparameters (batch size, learning rate, epochs) in light of your system and model architecture. It provides detailed recommendations and can automatically suggest adjustments.

  4. Model Profiling & Optimization
    • ModelProfiler: Benchmarks the model to measure average inference time and throughput.
    • OptimizerHelper & UltraOptimizer: Offer suggestions on which optimizer to use and compute optimal hyperparameters based on your training data size, model complexity, and system resources.

  5. Deep Analysis
    • DeepAnalyzer: Combines results from all modules to generate a comprehensive report with overall recommendations and key performance metrics.

  6. Logging
    • TrainLogger: Captures detailed logs with timestamps, making it easier to trace and debug each step of the analysis.
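To make the architecture-analysis step concrete: a parameter count is just the sum of element counts over every weight and bias tensor in the model. The sketch below (a simplified illustration, not ArchitectureAnalyzer's actual code; `count_parameters` is a hypothetical helper) computes it from raw tensor shapes.

```python
from math import prod

def count_parameters(weight_shapes):
    """Sum the element counts of a list of weight/bias tensor shapes."""
    return sum(prod(shape) for shape in weight_shapes)

# Shapes for a Conv2d(3, 64, kernel_size=3) layer: weight tensor + bias vector.
conv_shapes = [(64, 3, 3, 3), (64,)]
print(count_parameters(conv_shapes))  # 64*3*3*3 + 64 = 1792
```

In PyTorch the same number falls out of `sum(p.numel() for p in model.parameters())`; ArchitectureAnalyzer additionally classifies the layers and infers the overall architecture.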

Usage Example

Below is a complete example demonstrating how to integrate TrainSense into your deep learning workflow.

import torch
import torch.nn as nn
from TrainSense.system_config import SystemConfig
from TrainSense.system_diagnostics import SystemDiagnostics
from TrainSense.analyzer import TrainingAnalyzer
from TrainSense.arch_analyzer import ArchitectureAnalyzer
from TrainSense.deep_analyzer import DeepAnalyzer
from TrainSense.logger import TrainLogger
from TrainSense.model_profiler import ModelProfiler
from TrainSense.optimizer import OptimizerHelper
from TrainSense.ultra_optimizer import UltraOptimizer
from TrainSense.gpu_monitor import GPUMonitor
from TrainSense.utils import print_section

def main():
    # Retrieve system configuration and diagnostics
    sys_config = SystemConfig()
    sys_diag = SystemDiagnostics()

    # Define initial hyperparameters
    batch_size = 64
    learning_rate = 0.05
    epochs = 30

    # Create a sample CNN model (for image classification)
    model = nn.Sequential(
        nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=1),
        nn.ReLU(),
        nn.MaxPool2d(2),
        nn.Conv2d(64, 128, kernel_size=3, stride=1, padding=1),
        nn.ReLU(),
        nn.MaxPool2d(2),
        nn.Flatten(),
        nn.Linear(128 * 8 * 8, 10)  # assumes 32x32 inputs: two 2x2 poolings -> 8x8 feature maps
    )

    # Analyze the model architecture
    arch_analyzer = ArchitectureAnalyzer(model)
    arch_info = arch_analyzer.analyze()

    # Analyze the hyperparameters based on system config and architecture
    analyzer = TrainingAnalyzer(batch_size, learning_rate, epochs, system_config=sys_config, arch_info=arch_info)

    # Profile the model performance
    profiler = ModelProfiler(model, device="cpu")

    # Ultra-optimize hyperparameters based on training data stats and system resources
    ultra_opt = UltraOptimizer({"data_size": 2000000}, arch_info, {"total_memory_gb": sys_config.total_memory})

    # Combine all analyses into a deep report
    deep_analyzer = DeepAnalyzer(analyzer, arch_analyzer, profiler, sys_diag)

    # Display configuration summary
    print_section("Configuration Summary")
    summary = analyzer.summary()
    for k, v in summary.items():
        print(f"{k}: {v}")

    # Display detailed hyperparameter recommendations
    print_section("Hyperparameter Recommendations")
    recommendations = analyzer.check_hyperparams()
    for r in recommendations:
        print(r)

    # Show automatic adjustments suggestions
    print_section("Proposed Automatic Adjustments")
    adjustments = analyzer.auto_adjust()
    for k, v in adjustments.items():
        print(f"{k}: {v}")

    # Log the start of the complete analysis
    logger = TrainLogger(log_file="logs/trainsense.log")
    logger.log_info("Starting complete and detailed analysis.")

    # Suggest an optimizer based on model complexity
    opt_adv = OptimizerHelper.suggest_optimizer(arch_info.get("total_parameters", 0), arch_info.get("layer_count", 0))
    print_section("Basic Optimizer Recommendation")
    print("Recommended Optimizer:", opt_adv)
    logger.log_info(f"Suggested Optimizer: {opt_adv}")

    # Compute ultra-optimized hyperparameters
    ultra_params = ultra_opt.compute_optimal_hyperparams()
    print_section("Ultra Optimized Hyperparameters")
    for k, v in ultra_params.items():
        print(f"{k}: {v}")

    # Display GPU status (if available)
    try:
        gpu_monitor = GPUMonitor()
        gpu_status = gpu_monitor.get_gpu_status()
        print_section("GPU Status")
        for gpu in gpu_status:
            print(gpu)
    except ImportError:
        print("GPUtil not installed. GPU status unavailable.")

    # Generate a comprehensive deep analysis report
    report = deep_analyzer.comprehensive_report()
    print_section("Comprehensive Deep Analysis Report")
    for key, value in report.items():
        print(f"{key}: {value}")

    # Adjust learning rate based on performance throughput
    new_lr, tune_msg = OptimizerHelper.adjust_learning_rate(learning_rate, report["profiling"]["throughput"])
    print_section("Learning Rate Adjustment Based on Performance")
    print("New Learning Rate:", new_lr, "-", tune_msg)

if __name__ == "__main__":
    main()

Explanation

  1. Configuration and Diagnostics:
    The SystemConfig and SystemDiagnostics modules collect your hardware and system usage data, including details about GPUs, CUDA/cuDNN versions, and OS information.

  2. Model Architecture Analysis:
    The ArchitectureAnalyzer inspects your model to count parameters, layers, and detect specific layer types. It infers the overall architecture (e.g., CNN, LSTM) and provides tailored recommendations.

  3. Hyperparameter Analysis:
    TrainingAnalyzer uses system and architecture info to verify that your chosen batch size, learning rate, and epochs are appropriate. It offers detailed recommendations and can automatically suggest adjustments.

  4. Performance Profiling:
    The ModelProfiler benchmarks your model by measuring inference speed and throughput. This helps in fine-tuning hyperparameters further.

  5. Advanced Optimization:
    OptimizerHelper and UltraOptimizer provide further recommendations on optimizer choice and compute optimal hyperparameters based on your model and system stats.

  6. Deep Analysis Report:
    DeepAnalyzer compiles all the above information into a comprehensive report with overall recommendations and performance metrics.

  7. Logging:
    The TrainLogger writes detailed, timestamped logs of each step, which is useful for debugging and tracking changes.
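The profiling step above boils down to repeated, timed forward passes. Here is a minimal standard-library sketch of that measurement loop, using a stand-in inference callable instead of a real model (this is an illustration of the idea, not ModelProfiler's actual code):

```python
import time

def profile(infer, batch_size, iterations=50):
    """Time repeated calls to `infer` and derive average latency and throughput."""
    start = time.perf_counter()
    for _ in range(iterations):
        infer()  # stand-in for model(input_batch)
    elapsed = time.perf_counter() - start
    avg_latency = elapsed / iterations      # seconds per batch
    throughput = batch_size / avg_latency   # samples per second
    return avg_latency, throughput

# Simulate a ~1 ms inference step.
latency, throughput = profile(lambda: time.sleep(0.001), batch_size=64)
print(f"avg latency: {latency * 1000:.2f} ms, throughput: {throughput:.0f} samples/s")
```

A real profiler also needs warm-up iterations and, on GPU, explicit synchronization (e.g., `torch.cuda.synchronize()`) so that asynchronous kernel launches are not mistaken for finished work.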

Integration

To integrate TrainSense into your project, simply install it via PyPI or in development mode. Then import the modules you need and incorporate them into your training pipeline. Use the example above as a guide to generate reports, adjust your hyperparameters, and monitor your system’s performance throughout model training.

Contributing

Contributions are welcome! Please fork the repository, create your feature branch, commit your changes, and submit a pull request.

License

This project is licensed under the MIT License. See the LICENSE file for details.
