Display card: TU102 [TITAN RTX]
OS: RHEL 9.4
nvidia-smi output: NVIDIA-SMI 550.90.12 Driver Version: 550.90.12 CUDA Version: 12.4
GPU Operator: v24.6.1
If I run a container that has CUDA v12.4, it works fine.
If I run a container that has CUDA v12.5 or v12.6, I get a CUDA_ERROR_COMPAT_NOT_SUPPORTED_ON_DEVICE error.
If I configure the node to have CUDA v12.6, then containers with CUDA v12.6 work, but no other version does.
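To help narrow this down, here is a minimal sketch that could be run inside each container to report what the CUDA driver library visible to that container returns. It assumes Python 3 is available in the image and that the driver library is loadable as `libcuda.so.1`; the helper names `decode_cuda_version` and `probe_driver` are made up for illustration. It only calls `cuInit`, `cuDriverGetVersion`, and `cuGetErrorName` from the CUDA driver API.

```python
import ctypes
import os


def decode_cuda_version(raw):
    """cuDriverGetVersion encodes the version as 1000*major + 10*minor."""
    return raw // 1000, (raw % 1000) // 10


def probe_driver(libname="libcuda.so.1"):
    """Report what the CUDA driver library seen by this process returns.

    Hypothetical helper: returns a dict and never raises if the library
    is missing, so it can be run in any container.
    """
    report = {"LD_LIBRARY_PATH": os.environ.get("LD_LIBRARY_PATH", "")}
    try:
        lib = ctypes.CDLL(libname)
    except OSError as exc:
        report["error"] = f"cannot load {libname}: {exc}"
        return report

    # cuInit is where a forward-compatibility failure would be expected
    # to surface, so record its result code and symbolic name.
    init_rc = lib.cuInit(0)
    name = ctypes.c_char_p()
    lib.cuGetErrorName(init_rc, ctypes.byref(name))
    report["cuInit"] = name.value.decode() if name.value else str(init_rc)

    # Version of the CUDA driver API exposed by the loaded library.
    version = ctypes.c_int(0)
    if lib.cuDriverGetVersion(ctypes.byref(version)) == 0:
        report["driver_cuda_version"] = decode_cuda_version(version.value)
    return report


if __name__ == "__main__":
    for key, value in probe_driver().items():
        print(f"{key}: {value}")
```

Comparing the output from the same image under plain Docker and in a Kubernetes pod should show whether the two environments load the same driver library and whether `cuInit` fails in both.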
A colleague of mine runs plain Docker and can run containers with different CUDA versions at the same time without issues.
So, how can I get around this limitation while using the GPU Operator and Kubernetes? What am I doing wrong?
Thanks,
Ranga