NVIDIA GPU examples

This page covers: a single-GPU example for Hopper, Blackwell, or RTX Pro 6000 BSE; multi-GPU snippets for Blackwell and Hopper; and the Hopper PPCIE node label.

These examples show how to request NVIDIA passthrough devices in pod specs.

They require that you have deployed CoCo following the NVIDIA Confidential Containers Reference Architecture, which documents supported component versions and the passthrough modes used below.

In brief: NVIDIA Hopper, NVIDIA Blackwell, and NVIDIA RTX Pro 6000 all support single-GPU passthrough (SPT). Hopper and Blackwell additionally support multi-GPU passthrough (MPT). Protected PCIe (PPCIE) mode is specific to multi-GPU use on Hopper. The following sections give example pod fragments for each case.

1. Hopper, Blackwell, or RTX Pro 6000 BSE: single-GPU passthrough (SPT)

No change to the nvidia.com/cc.mode node label is required: the default Confidential Containers / GPU Operator setting (on) already enables single-GPU passthrough.

Example pod requesting one GPU on Hopper, Blackwell, or RTX Pro 6000 BSE:

apiVersion: v1
kind: Pod
metadata:
  name: cuda-vectoradd-kata
  namespace: default
spec:
  runtimeClassName: kata-qemu-nvidia-gpu-tdx
  restartPolicy: Never
  containers:
  - name: cuda-vectoradd
    image: "nvcr.io/nvidia/k8s/cuda-sample:vectoradd-cuda12.5.0-ubuntu22.04"
    resources:
      limits:
        nvidia.com/pgpu: "1"
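Once the pod spec above is saved to a file (the file name below is an assumption), it can be applied and verified the usual way; the vectoradd sample prints "Test PASSED" when the GPU is usable inside the confidential guest. This sketch assumes a recent kubectl that supports --for=jsonpath:

```shell
# Apply the pod spec (file name assumed) and wait for the sample to finish.
kubectl apply -f cuda-vectoradd-kata.yaml
kubectl wait --for=jsonpath='{.status.phase}'=Succeeded pod/cuda-vectoradd-kata --timeout=10m
# The vectoradd sample prints "Test PASSED" on success.
kubectl logs cuda-vectoradd-kata
```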

2. Blackwell: multi-GPU passthrough (MPT)

Use the same pod as above, but change the resource section:

    resources:
      limits:
        nvidia.com/pgpu: "8"
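If the pod stays Pending, it is worth confirming the node actually advertises eight passthrough GPUs (the node name is a placeholder):

```shell
# Should print 8 on a fully populated Blackwell node.
kubectl get node <node-name> -o jsonpath='{.status.allocatable.nvidia\.com/pgpu}'
```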

3. Hopper: multi-GPU passthrough with Protected PCIe (PPCIE) and NVSwitch

On Hopper, multi-GPU confidential passthrough uses Protected PCIe: the pod must request both the GPUs and the NVSwitch devices, and the node's confidential GPU mode must be set to ppcie:

kubectl label node <node-name> nvidia.com/cc.mode=ppcie --overwrite

Use the same pod as above, but change the resource section to request all of the node's GPUs along with their NVSwitch links:

    resources:
      limits:
        nvidia.com/pgpu: "8"
        nvidia.com/nvswitch: "4"
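As a sanity check before scheduling the PPCIE pod, the node's allocatable resources should show both device types (the node name is a placeholder):

```shell
# Expect 8 GPUs and 4 NVSwitch devices, matching the resource limits above.
kubectl get node <node-name> \
  -o jsonpath='pgpu: {.status.allocatable.nvidia\.com/pgpu}{"\n"}nvswitch: {.status.allocatable.nvidia\.com/nvswitch}{"\n"}'
```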

After changing nvidia.com/cc.mode, wait for GPU Operator operands to settle and confirm pods are healthy (kubectl get pods -A), as in the Kata QEMU GPU guide.
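One way to spot-check that the label took effect and the operands came back healthy (the gpu-operator namespace and node name are assumptions for your deployment):

```shell
# Confirm the node label now reads ppcie.
kubectl get node <node-name> -o jsonpath='{.metadata.labels.nvidia\.com/cc\.mode}'
# All GPU Operator pods should be Running or Completed.
kubectl get pods -n gpu-operator
```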

To reset the node for single-GPU passthrough, change the label back to on:

kubectl label node <node-name> nvidia.com/cc.mode=on --overwrite