Protected PCIE Verifier
Overview
In a multi-GPU confidential computing (CC) setup, NVLink interconnects and NVSwitches carry GPU-to-GPU data traffic. Because NVLink interconnects and NVSwitches are outside the trust boundary, they must not have access to plain-text data: all data that flows over NVLink must be encrypted before transfer and decrypted at the destination GPU. On the GPU, encryption and decryption are performed by the GPU copy engine (CE).
Bouncing through a CE adds constraints and latency to the data path, which may reduce performance for some workloads. To minimize the performance impact, NVIDIA's 'PPCIE' (Protected PCIe) mode adjusts the security model to trust NVLink data, enabling plain-text traffic without CEs while preserving a Confidential Virtual Machine.
Note: There are only two supported GPU usage configurations:
- ALL GPUs are in CC mode. Each GPU can be assigned to one Confidential VM. In this scenario, use the CC verifier.
- ALL GPUs are in PPCIE mode. All GPUs must be assigned to one Confidential VM. In this scenario, use the PPCIE verifier.
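The rule in the note can be written down as a small check. The sketch below is illustrative only: the function name `select_verifier` and the mode labels "CC" and "PPCIE" are assumptions made for this example and are not part of the tool's API.

```python
from typing import Iterable


def select_verifier(gpu_modes: Iterable[str]) -> str:
    """Return which verifier applies, given the mode of every GPU.

    The labels "CC" and "PPCIE" are hypothetical names used by this sketch.
    """
    modes = {mode.upper() for mode in gpu_modes}
    if not modes:
        raise ValueError("no GPUs found")
    if modes == {"CC"}:
        return "CC verifier"
    if modes == {"PPCIE"}:
        return "PPCIE verifier"
    # Mixed or unknown modes are not one of the two supported configurations.
    raise ValueError(f"unsupported GPU mode combination: {sorted(modes)}")
```

A mixed system, for example `select_verifier(["CC", "PPCIE"])`, raises `ValueError` because it matches neither supported configuration.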
High-Level Architecture Diagram
The PPCIE verifier is a tool designed to verify the security of the multi-GPU system by attesting to the integrity of its GPUs and NVSwitches. The attestation SDK is used to gather evidence for each device, with further attestation performed either locally or remotely, as specified by the user when running the PPCIE Verifier tool.
After collecting attestation results for each device, the PPCIE verifier validates these results against a policy file to confirm that all claims are legitimate. Following the attestation process, the tool conducts a final topology check to verify that the devices are securely connected in the expected configuration. The final attestation results are then presented to the user, detailing the checks performed.
Detailed Architecture Flow
- The PPCIE Verifier tool is initiated by the user, who specifies the attestation mode for both GPUs and NvSwitches.
- The system components are enumerated (number of GPUs and NvSwitches).
- Pre-checks are performed on each GPU to ensure it is configured for confidential computing.
- Pre-checks are performed on each NvSwitch to ensure it is configured for confidential computing.
- The required GPU evidence for attestation is collected from the Attestation SDK for each GPU.
- Once the evidence is collected, the PPCIE Verifier tool initiates attestation verification based on the mode specified by the user.
- GPU attestation is initiated by the Attestation SDK: the local-gpu-verifier is used for local attestation, while NRAS (NVIDIA's Remote Attestation Service) is used for remote attestation.
- The Attestation SDK provides GPU attestation results to the PPCIE Verifier.
- If the GPU attestation is successful, the PPCIE Verifier proceeds to collect evidence for the NvSwitches from the Attestation SDK.
- Once all NvSwitch evidence is collected, attestation is initiated by the PPCIE Verifier.
- NvSwitch attestation is performed by the Attestation SDK: the local-switch-verifier is used for local attestation, while NRAS is used for remote attestation.
- The Attestation SDK provides NvSwitch attestation results to the PPCIE Verifier.
- If the NvSwitch attestation is successful, the PPCIE Verifier performs a topology check to ensure the devices are securely connected in the expected configuration.
- The PPCIE Verifier determines the overall results and updates the status for each check it performs.
- The GPU ready state is set.
- The final attestation results are presented to the user, detailing the checks performed and the status of each device in the system.
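The gating in the steps above (each stage runs only if the previous one succeeded) can be sketched as follows. This is not the tool's implementation: `run_flow`, the step names, and the stand-in lambdas are hypothetical, and the sketch assumes that every step after a failure is skipped, which the steps above state explicitly only for GPU attestation, NvSwitch attestation, and the topology check.

```python
from typing import Callable, Dict, Sequence, Tuple

Step = Tuple[str, Callable[[], bool]]


def run_flow(steps: Sequence[Step]) -> Dict[str, str]:
    """Run checks in order; once one fails, mark the rest as skipped."""
    results: Dict[str, str] = {}
    failed = False
    for name, check in steps:
        if failed:
            results[name] = "skipped"
            continue
        ok = check()
        results[name] = "passed" if ok else "failed"
        failed = not ok
    return results


# Hypothetical stand-ins for the real checks; each lambda fakes an outcome.
EXAMPLE_STEPS: Sequence[Step] = [
    ("GPU pre-checks", lambda: True),
    ("NvSwitch pre-checks", lambda: True),
    ("GPU attestation", lambda: True),
    ("NvSwitch attestation", lambda: False),  # simulate a failure here
    ("Topology check", lambda: True),
    ("Set GPU ready state", lambda: True),
]

if __name__ == "__main__":
    for name, status in run_flow(EXAMPLE_STEPS).items():
        print(f"{name}: {status}")
```

Keeping a per-step status, rather than a single pass/fail flag, mirrors the final report, which details each check performed.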
Getting started
Prerequisites
- HGX system with 8 GPUs and 4 switches assigned to a single tenant
- Python >= 3.9
- git installed
- NVIDIA GPU driver installed
- NVIDIA Switch driver installed
- NVIDIA Fabric Manager installed
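Some of these prerequisites can be checked from Python before installing. The sketch below is a rough pre-flight check, not part of the PPCIE Verifier: the function name `check_prerequisites` is made up for this example, and finding `nvidia-smi` on the PATH is used only as a proxy for an installed GPU driver. The switch driver and Fabric Manager are not checked.

```python
import shutil
import sys
from typing import Dict


def check_prerequisites() -> Dict[str, bool]:
    """Report which of the listed prerequisites can be detected from Python."""
    return {
        "python >= 3.9": sys.version_info >= (3, 9),
        "git": shutil.which("git") is not None,
        # nvidia-smi ships with the NVIDIA GPU driver; its presence on PATH is
        # only a rough hint that the driver is installed.
        "nvidia-smi": shutil.which("nvidia-smi") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'found' if ok else 'MISSING'}")
```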
Installation/Dependencies
PPCIE Verifier has the following dependencies:
- nv-attestation-sdk (Attestation SDK)
- nv-local-gpu-verifier (Local GPU Verifier)
- nv-switch-verifier (Local Switch Verifier). Note: this is a module inside the Attestation SDK and does not require separate installation.
Installation Instructions:
Please elevate to root user privileges before installing the packages (this is necessary to set the GPU ready state):
sudo -i
Method 1: Using installer script
1. git clone https://github.com/NVIDIA/nvtrust.git
2. cd nvtrust/guest_tools/ppcie-verifier/install
3. source ppcie-installer.sh (this installs the required dependencies)
Method 2: Using PyPI (Requires python virtual environment creation)
1. python3 -m venv venv
2. source venv/bin/activate
3. pip3 install nv-ppcie-verifier (this automatically installs nv-attestation-sdk, nv-local-gpu-verifier and nv-switch-verifier)
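After either installation method, a quick way to confirm that the packages are importable is to ask Python whether it can locate them. This is a minimal sketch: the helper name `find_installed` is hypothetical, and the module names are taken from this README (`ppcie` from the usage command, `nv_attestation_sdk` from the troubleshooting section).

```python
import importlib.util
from typing import Dict, Iterable

# Top-level module names as they appear elsewhere in this README.
EXPECTED_MODULES = ("ppcie", "nv_attestation_sdk")


def find_installed(modules: Iterable[str] = EXPECTED_MODULES) -> Dict[str, bool]:
    """Map each top-level module name to whether Python can locate it."""
    return {name: importlib.util.find_spec(name) is not None for name in modules}


if __name__ == "__main__":
    for name, found in find_installed().items():
        print(f"{name}: {'installed' if found else 'not found'}")
```

Run it with the same interpreter (and virtual environment, if any) that was used for the installation.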
Usage
python3 -m ppcie.verifier.verification --gpu-attestation-mode=LOCAL --switch-attestation-mode=LOCAL (Example arguments provided)
Options
| Option | Description | Value Options |
|---|---|---|
| --gpu-attestation-mode | Type of GPU Attestation | LOCAL, REMOTE |
| --switch-attestation-mode | Type of nvSwitch Attestation | LOCAL, REMOTE |
| --log | Configure log level | DEBUG, INFO, WARNING, ERROR, TRACE, CRITICAL |
| --allow-hold-cert | Enable attestation when OCSP status of certificate is cert hold | N/A |
| --rim-url RIM_SERVICE_URL | The URL to be used for fetching driver and VBIOS RIM files (e.g., https://rim.nvidia.com/rims/) | |
| --ocsp-url OCSP_SERVICE_URL | The URL to be used for checking the revocation status of a certificate (e.g., https://ocsp.ndis.nvidia.com/) | |
| --ocsp-nonce-disabled | Flag which indicates whether to include a nonce when calling OCSP. Only applicable for local GPU attestation. False by default | |
| --service-key | Service key which is used to auth remote service calls to attestation services. None by default. Note: No valid service keys have been created by admins yet - using any key will result in attestation failure. | |
| --claims-version | Specify the claims version to retrieve version-specific attestation claims (e.g., 2.0). Please refer to the Attestation Troubleshooting documentation for the claims. If the claims version is not set, it defaults to 2.0. | "2.0" or "3.0" |
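When the verifier is launched from another script, it can help to validate option values before starting it. The helper below is a sketch only: `build_command` is a hypothetical name, it covers just four of the options above, and it assumes the `--option=value` form shown in the usage example also applies to `--log` and `--claims-version`.

```python
import shlex
import sys
from typing import List, Optional

ATTESTATION_MODES = ("LOCAL", "REMOTE")
LOG_LEVELS = ("DEBUG", "INFO", "WARNING", "ERROR", "TRACE", "CRITICAL")
CLAIMS_VERSIONS = ("2.0", "3.0")


def build_command(gpu_mode: str, switch_mode: str,
                  log: Optional[str] = None,
                  claims_version: Optional[str] = None) -> List[str]:
    """Build the argument list for the verifier from validated option values."""
    for label, mode in (("GPU", gpu_mode), ("switch", switch_mode)):
        if mode not in ATTESTATION_MODES:
            raise ValueError(f"{label} attestation mode must be one of {ATTESTATION_MODES}")
    cmd = [sys.executable, "-m", "ppcie.verifier.verification",
           f"--gpu-attestation-mode={gpu_mode}",
           f"--switch-attestation-mode={switch_mode}"]
    if log is not None:
        if log not in LOG_LEVELS:
            raise ValueError(f"log level must be one of {LOG_LEVELS}")
        cmd.append(f"--log={log}")
    if claims_version is not None:
        if claims_version not in CLAIMS_VERSIONS:
            raise ValueError(f"claims version must be one of {CLAIMS_VERSIONS}")
        cmd.append(f"--claims-version={claims_version}")
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(shlex.join(build_command("LOCAL", "LOCAL", log="INFO")))
```

The returned list can be passed directly to `subprocess.run`, which avoids shell quoting issues.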
Troubleshooting
Below are some of the common issues that have been encountered:
Installation Issues:
- ModuleNotFoundError: No module named 'nv_attestation_sdk' while installing the packages using the installer script (ppcie-installer.sh)
  Solution: Delete the venv that was created and install the packages using the script again.
- WARNING: Ignoring invalid distribution ~v-attestation-sdk <site-package-directory> (or a similar warning or installation issue) while installing the package
  Solution: Execute the following command to clean up packages that were not installed properly, and then retry the installation. Run it from inside <site-packages-directory>, because the names it lists are removed relative to the current directory.
  rm -rf $(ls -l <site-packages-directory> | grep '~' | awk '{print $9}')
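The same cleanup can be done from Python with an explicit directory and a dry run first. This is a sketch under stated assumptions: `remove_broken_distributions` is a hypothetical helper, and it matches only entries whose name starts with `~`, which is narrower than the `grep '~'` in the command above.

```python
import shutil
from pathlib import Path
from typing import List


def remove_broken_distributions(site_packages: Path, dry_run: bool = True) -> List[Path]:
    """Find entries whose name starts with '~' and optionally delete them."""
    leftovers = sorted(p for p in site_packages.iterdir() if p.name.startswith("~"))
    if not dry_run:
        for path in leftovers:
            if path.is_dir() and not path.is_symlink():
                shutil.rmtree(path)  # leftover package directory
            else:
                path.unlink()  # stray file or symlink
    return leftovers
```

Call it once with the default `dry_run=True` to review what would be deleted, then again with `dry_run=False` to remove the entries.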
Configuration Issues:
- The nvmlInit call timed out. or Error in Initializing NVML library. Please install the drivers again and re-try
  Solution: This requires re-installing the Nvidia GPU driver and fabric manager
- NSCQWarning: NSCQ_RC_WARNING_RDT_INIT_FAILURE
  Solution: This requires installing the correct version of the Nvidia Switch driver compatible with the GPU driver
License
This repository is licensed under Apache License v2.0 except where otherwise noted.
Support
For issues or questions, please file a bug. For additional support, contact us at attestation-support@nvidia.com