I am using the 22.08 image with Apptainer. We are running RHEL 7.9 with Apptainer version 1.2.3-1.el7 and four TU104GL [Tesla T4] cards.
Running inside the container:
/opt/tritonserver/bin/tritonserver --model-repository=/triton.repos.d --disable-auto-complete-config --repository-poll-secs=120
We previously used the --strict-model-config=false option, but the server complained about it, so I changed it to --disable-auto-complete-config.
The runtime driver is
Apptainer> cat /proc/driver/nvidia/version
NVRM version: NVIDIA UNIX x86_64 Kernel Module 515.86.01 Wed Oct 26 09:12:38 UTC 2022
GCC version: gcc version 9.3.1 20200408 (Red Hat 9.3.1-2) (GCC)
Kernel module is
Apptainer> cat /sys/module/nvidia/version
515.86.01
NVCC version
Apptainer> nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Jun__8_16:49:14_PDT_2022
Cuda compilation tools, release 11.7, V11.7.99
Build cuda_11.7.r11.7/compiler.31442593_0
I am not sure whether V11.7.99 refers to the 11.7 Update 1 Preview (good for 22.07/22.06) or to 11.7.0 (good for 22.05).
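To compare the toolkit version between images, the nvcc output above can be parsed into numbers. This is only a rough sketch: the function name parse_nvcc_release is made up for illustration, the regular expression assumes the "release X.Y, VX.Y.Z" layout shown in my output, and it does not by itself say which CUDA update V11.7.99 corresponds to.

```python
import re

# Output of `nvcc -V` copied from the container (see above).
NVCC_OUTPUT = """nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Jun__8_16:49:14_PDT_2022
Cuda compilation tools, release 11.7, V11.7.99
Build cuda_11.7.r11.7/compiler.31442593_0"""


def parse_nvcc_release(output):
    """Extract the release and full version numbers from `nvcc -V` text.

    Hypothetical helper: assumes the 'release X.Y, VX.Y.Z' layout seen above.
    """
    match = re.search(r"release (\d+)\.(\d+), V(\d+)\.(\d+)\.(\d+)", output)
    if match is None:
        raise ValueError("no 'release X.Y, VX.Y.Z' line found in nvcc output")
    major, minor, v_major, v_minor, v_build = (int(g) for g in match.groups())
    return {"release": (major, minor), "version": (v_major, v_minor, v_build)}


if __name__ == "__main__":
    print(parse_nvcc_release(NVCC_OUTPUT))
```

Running the same helper against each image (22.05 through 22.08) would show whether the bundled toolkit actually differs between them.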
The Framework Support Matrix states:
‘Release 22.08 is based on CUDA 11.7.1, which requires NVIDIA Driver release 515 or later. However, if you are running on a data center GPU (for example, T4 or any other data center GPU), you can use NVIDIA driver release 450.51 (or later R450), 470.57 (or later R470), or 510.47 (or later R510).’
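As a sanity check, the rule quoted above can be written out as a small sketch. The names driver_ok and DATA_CENTER_MINIMUMS are hypothetical, and the thresholds are taken only from the quoted sentence for release 22.08, so this is my reading of the matrix rather than anything official.

```python
# Minimum driver per branch for data center GPUs, from the quoted 22.08 text.
# Keys are the driver branch (R450, R470, R510); values are (major, minor).
DATA_CENTER_MINIMUMS = {450: (450, 51), 470: (470, 57), 510: (510, 47)}


def parse_version(text):
    """Turn a driver string such as '515.86.01' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def driver_ok(version, data_center_gpu):
    """Return True if the driver meets the quoted 22.08 requirement.

    Assumption: 'later R450' etc. means a later driver within that branch.
    """
    parsed = parse_version(version)
    if parsed[0] >= 515:
        return True  # release 515 or later is accepted for any GPU
    if not data_center_gpu:
        return False
    minimum = DATA_CENTER_MINIMUMS.get(parsed[0])
    return minimum is not None and parsed >= minimum


if __name__ == "__main__":
    # 515.86.01 is the version reported by /sys/module/nvidia/version above.
    print(driver_ok("515.86.01", data_center_gpu=True))
```

By this reading our 515.86.01 driver on the T4 cards satisfies the stated requirement, so the driver version alone should not be the problem.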
I have tried using 22.05, 22.06, 22.07 and 22.08 and they all give the same error.