GPU and TPU compatibility toolkit for AI frameworks

ai-compat

A GPU and TPU compatibility toolkit that inspects, tests, and auto-fixes CUDA/driver mismatches and TPU configuration problems for major AI frameworks.

Features

GPU Support

  • GPU + CUDA detection (nvidia-smi, CUDA paths, cuDNN)
  • Framework scanner for PyTorch, TensorFlow, ONNX Runtime, diffusers, transformers
  • Compatibility checker with JSON rules
  • Auto-fix suggestions + optional pip installs
  • GPU diagnostics (PyTorch/TensorFlow/ONNX/VRAM tests)
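As a rough illustration of the nvidia-smi part of GPU detection, the sketch below queries nvidia-smi in CSV mode and parses the rows. It is not taken from the package source: the function names `parse_gpu_csv` and `detect_gpus` and the output keys are assumptions chosen to resemble the scan output shown later in this README.

```python
import csv
import io
import shutil
import subprocess

# Standard nvidia-smi query flags; memory.total is reported in MiB with nounits.
QUERY = [
    "nvidia-smi",
    "--query-gpu=name,memory.total,driver_version",
    "--format=csv,noheader,nounits",
]


def parse_gpu_csv(text):
    """Parse nvidia-smi CSV rows into dicts (hypothetical output shape)."""
    gpus = []
    for row in csv.reader(io.StringIO(text), skipinitialspace=True):
        if len(row) != 3:
            continue  # skip blank or malformed lines
        name, mem_mib, driver = (field.strip() for field in row)
        try:
            memory_gb = round(int(mem_mib) / 1024, 1)
        except ValueError:
            memory_gb = None  # e.g. "[N/A]" on some devices
        gpus.append(
            {"name": name, "memory_total_gb": memory_gb, "driver_version": driver}
        )
    return gpus


def detect_gpus():
    """Return a list of GPUs, or an empty list when nvidia-smi is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(QUERY, capture_output=True, text=True, check=False)
    return parse_gpu_csv(result.stdout) if result.returncode == 0 else []
```

Keeping the parser separate from the subprocess call makes it testable on machines without a GPU.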

TPU Support

  • Cloud TPU detection via gcloud CLI
  • Edge TPU detection (USB/PCIe devices, pycoral)
  • TensorFlow TPU compatibility checking
  • TPU diagnostics (Cloud TPU, Edge TPU, TensorFlow TPU)
  • Auto-fix suggestions for TPU setup

System Resources

  • RAM detection: Total and available system memory
  • Disk usage: Total, used, and free disk space
  • Automatic resource monitoring for AI workload planning
  • Uses psutil when available (falls back to system calls)
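The psutil-with-fallback behaviour described above could look roughly like the following. This is a minimal sketch, not the package's implementation: `read_ram_bytes` and `resource_snapshot` are hypothetical names, and the fallback uses `os.sysconf`, which is assumed here to be running on Linux (`SC_AVPHYS_PAGES` counts free pages, which is a stricter figure than psutil's "available").

```python
import os
import shutil

GIB = 1024 ** 3


def read_ram_bytes():
    """Return (total, available) RAM in bytes, preferring psutil when installed."""
    try:
        import psutil  # optional dependency

        vm = psutil.virtual_memory()
        return vm.total, vm.available
    except ImportError:
        # Assumed Linux fallback: page size times physical / free page counts.
        page = os.sysconf("SC_PAGE_SIZE")
        return (
            page * os.sysconf("SC_PHYS_PAGES"),
            page * os.sysconf("SC_AVPHYS_PAGES"),
        )


def resource_snapshot(path="/"):
    """Collect RAM and disk figures in GB, using the key names from the scan output."""
    total, available = read_ram_bytes()
    disk = shutil.disk_usage(path)
    return {
        "ram_total_gb": round(total / GIB, 1),
        "ram_available_gb": round(available / GIB, 1),
        "disk_total_gb": round(disk.total / GIB, 1),
        "disk_used_gb": round(disk.used / GIB, 1),
        "disk_free_gb": round(disk.free / GIB, 1),
    }
```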

General

  • Environment file exporter (gpu-env.txt or tpu-env.txt)
  • CLI entry point: ai-compat
  • Works with both GPU and TPU simultaneously

Quickstart

pip install ai-compat
ai-compat scan          # Scan system (GPU, TPU, RAM, disk)
ai-compat check         # Check compatibility issues
ai-compat fix --apply   # Apply suggested fixes (runs pip installs)
ai-compat test          # Run all tests (GPU + TPU)
ai-compat test --gpu-only  # Run only GPU tests
ai-compat test --tpu-only  # Run only TPU tests
ai-compat export --output env.txt

System Resources (RAM & Disk Usage)

View System Resources

The scan command automatically includes RAM and disk information:

ai-compat scan

To extract just the resources section:

# On Linux/macOS
ai-compat scan | grep -A 6 '"resources"'

# Or use jq (if installed)
ai-compat scan | jq '.resources'

RAM and Disk Usage

Check your system's RAM and disk capacity. The output reports total and available RAM plus total, used, and free disk space, all in GB:

$ ai-compat scan | jq '.resources'
{
  "ram_total_gb": 32.0,
  "ram_available_gb": 24.5,
  "disk_total_gb": 500.0,
  "disk_used_gb": 150.0,
  "disk_free_gb": 350.0
}

Python API for RAM Usage

from ai_compat import scan_system

snapshot = scan_system()
resources = snapshot.resources

print(f"Total RAM: {resources.ram_total_gb} GB")
print(f"Available RAM: {resources.ram_available_gb} GB")
print(f"RAM Usage: {((resources.ram_total_gb - resources.ram_available_gb) / resources.ram_total_gb * 100):.1f}%")
print(f"Free Disk: {resources.disk_free_gb} GB")

Use Cases

  • Model Loading: Check if you have enough RAM before loading large models
  • Batch Size Planning: Determine optimal batch sizes based on available memory
  • Disk Space: Verify sufficient space for model downloads and checkpoints
  • Resource Monitoring: Track system resources in CI/CD pipelines
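The model-loading and batch-size use cases above can be sketched with two small helpers that take the `ram_available_gb` value from a scan. Both functions, their parameters, and the 20% default headroom are illustrative assumptions, not part of ai-compat.

```python
def fits_in_ram(model_size_gb, ram_available_gb, headroom=0.2):
    """True if the model fits while keeping `headroom` (a fraction) of RAM free."""
    if model_size_gb < 0 or not 0 <= headroom < 1:
        raise ValueError("model_size_gb must be >= 0 and headroom in [0, 1)")
    return model_size_gb <= ram_available_gb * (1 - headroom)


def max_batch_size(ram_available_gb, model_size_gb, per_sample_gb, headroom=0.2):
    """Largest batch whose samples fit in the RAM budget left after the model."""
    if per_sample_gb <= 0:
        raise ValueError("per_sample_gb must be positive")
    budget = ram_available_gb * (1 - headroom) - model_size_gb
    return max(0, int(budget // per_sample_gb))


# With 24.5 GB available, a 13 GB model fits and a 20 GB model does not.
print(fits_in_ram(13.0, 24.5))
print(fits_in_ram(20.0, 24.5))
```

The per-sample memory cost has to be measured for the workload at hand; the helpers only do the arithmetic.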

Example Output

$ ai-compat check
{
  "issues": [
    {
      "framework": "PyTorch",
      "message": "PyTorch 2.2.1 requires CUDA ['12.1', '12.2'] but system has 11.8",
      "severity": "error",
      "suggestion": "Install CUDA 12.1/12.2 or install PyTorch wheel matching CUDA 11.8"
    }
  ],
  "summary": "Detected 1 issue(s)",
  "metadata": {
    "gpu_count": 1,
    "cuda_version": "11.8",
    "driver_version": "535.104"
  }
}
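For the CI/CD use case, a report like the one above can be consumed from Python. The sketch below assumes only the JSON shape shown in this example (`issues`, `severity`); `error_count` is a hypothetical helper, not an ai-compat API.

```python
import json


def error_count(report_json):
    """Count issues with severity "error" in a check report (shape as shown above)."""
    report = json.loads(report_json)
    return sum(
        1 for issue in report.get("issues", []) if issue.get("severity") == "error"
    )


# Possible use in a pipeline step, assuming the report is printed to stdout:
#   import subprocess, sys
#   out = subprocess.run(["ai-compat", "check"], capture_output=True, text=True).stdout
#   sys.exit(1 if error_count(out) else 0)
```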

Architecture

ai_compat/
  cli.py        # command-line interface
  scanner.py    # system + framework inspection
  gpu.py        # low-level GPU detection
  tpu.py        # TPU detection (Cloud + Edge)
  checker.py    # rules-based compatibility engine
  fixer.py      # auto-fix planner
  tester.py     # GPU + TPU diagnostics
  exporter.py   # environment generator
  rules/
    cuda_rules.json
    pytorch_rules.json
    tensorflow_rules.json
    tpu_rules.json
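To give a feel for how a rules-based check might work, here is a minimal sketch that maps a framework's major.minor version to supported CUDA versions. The rule schema, the `RULES` contents, and the `check_cuda` function are assumptions for illustration; the actual layout of the JSON files in rules/ is not documented here.

```python
# Hypothetical rule table: framework major.minor -> supported CUDA versions.
RULES = {"2.2": ["12.1", "12.2"]}


def check_cuda(framework, version, system_cuda, rules):
    """Return an issue dict when the system CUDA version is unsupported, else None."""
    major_minor = ".".join(version.split(".")[:2])
    supported = rules.get(major_minor)
    if supported is None or system_cuda in supported:
        return None  # no rule for this version, or the versions are compatible
    return {
        "framework": framework,
        "message": (
            f"{framework} {version} requires CUDA {supported} "
            f"but system has {system_cuda}"
        ),
        "severity": "error",
    }


issue = check_cuda("PyTorch", "2.2.1", "11.8", RULES)
print(issue["message"])
```

Treating an unknown framework version as "no issue" is a conservative design choice in this sketch; a stricter checker could emit a warning instead.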

TPU Detection

Cloud TPU

  • Requires gcloud CLI installed and configured
  • Detects TPUs via gcloud compute tpus list
  • Checks connectivity and TensorFlow TPUClusterResolver access

Edge TPU

  • Detects USB/PCIe Edge TPU devices
  • Checks for /dev/apex_0 device
  • Requires pycoral for full functionality
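The prerequisites listed above can be probed without touching any hardware. The following is a sketch under stated assumptions: `probe_tpu_prerequisites` and its result keys are hypothetical, and it only checks for the gcloud executable, the /dev/apex_0 device node, and an importable pycoral package.

```python
import importlib.util
import os
import shutil


def probe_tpu_prerequisites(apex_path="/dev/apex_0"):
    """Report which TPU prerequisites are present on this machine."""
    return {
        # Cloud TPU detection needs the gcloud CLI on PATH.
        "gcloud_cli_found": shutil.which("gcloud") is not None,
        # PCIe Edge TPU devices expose this device node.
        "apex_device_found": os.path.exists(apex_path),
        # pycoral is needed for full Edge TPU functionality.
        "pycoral_installed": importlib.util.find_spec("pycoral") is not None,
    }


print(probe_tpu_prerequisites())
```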

Limitations

  • Requires nvidia-smi for NVIDIA GPU detection
  • Cloud TPU detection requires gcloud CLI
  • Edge TPU detection requires pycoral for full functionality
  • System resource snapshot uses psutil when available (falls back to system calls such as /proc or sysconf)
  • Auto-fix commands run via pip; --apply executes them (use with caution)
  • VRAM stress test relies on PyTorch
  • Rules JSON provides conservative reference mappings; update as needed

Example: Full System Scan (GPU + TPU + RAM + Disk)

$ ai-compat scan
{
  "platform": "Linux 5.15.0",
  "python_version": "3.10.12",
  "resources": {
    "ram_total_gb": 32.0,
    "ram_available_gb": 24.5,
    "disk_total_gb": 500.0,
    "disk_used_gb": 150.0,
    "disk_free_gb": 350.0
  },
  "gpu": {
    "gpu_count": 1,
    "gpus": [{"name": "NVIDIA RTX 4090", "memory_total_gb": 24.0}],
    "cuda": {"version": "12.1", "cudnn_version": "8.9"}
  },
  "tpu": {
    "tpu_count": 1,
    "has_cloud_tpu": true,
    "has_edge_tpu": false,
    "cloud_tpu_available": true,
    "tpus": [{"type": "cloud", "accelerator_type": "v2-8"}]
  },
  "frameworks": {
    "tensorflow": {
      "version": "2.16.0",
      "gpu_available": true,
      "tpu_available": true
    }
  }
}
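A scan result like the one above is plain JSON, so downstream scripts can pick an accelerator from it. This sketch relies only on the fields shown in the example (`gpu.gpu_count`, `tpu.has_cloud_tpu`, `tpu.has_edge_tpu`); `available_accelerators` is a hypothetical helper.

```python
def available_accelerators(snapshot):
    """List accelerator kinds present in a parsed scan result."""
    found = []
    if snapshot.get("gpu", {}).get("gpu_count", 0) > 0:
        found.append("gpu")
    tpu = snapshot.get("tpu", {})
    if tpu.get("has_cloud_tpu"):
        found.append("cloud_tpu")
    if tpu.get("has_edge_tpu"):
        found.append("edge_tpu")
    return found


# Trimmed version of the example scan output above.
snapshot = {
    "gpu": {"gpu_count": 1},
    "tpu": {"has_cloud_tpu": True, "has_edge_tpu": False},
}
print(available_accelerators(snapshot))  # ['gpu', 'cloud_tpu']
```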

Quick RAM Check Command

For a quick RAM check, you can use:

# View only RAM information
ai-compat scan | jq '.resources | {ram_total_gb, ram_available_gb, ram_usage_percent: ((.ram_total_gb - .ram_available_gb) / .ram_total_gb * 100)}'

# Or on systems without jq
ai-compat scan | python3 -c "import sys, json; d=json.load(sys.stdin); r=d['resources']; print(f\"RAM: {r['ram_available_gb']:.1f}GB / {r['ram_total_gb']:.1f}GB available\")"

Contributions welcome!
