MLPerf benchmark fails with pycuda error "cuInit failed: initialization error"

I’m trying to run the MLPerf benchmark with the following command:

sudo make run RUN_ARGS="--benchmarks=bert --scenarios=offline --config_ver=default,high_accuracy,triton,high_accuracy_triton"

However, I’m encountering the following error:

pycuda._driver.LogicError: cuInit failed: initialization error

Environment:

  • NVIDIA-SMI: 550.54.15
  • CUDA Version (reported by nvidia-smi): 12.4
  • GPU: NVIDIA A100 80GB PCIe
  • CUDA compiler version (nvcc): 12.0

Additional information:

  • I have verified that the NVIDIA driver and CUDA are properly installed.
  • The GPU is detected correctly by nvidia-smi.
  • nvcc --version shows the CUDA compiler version.

I’m unsure why cuInit is failing. Any insights or suggestions would be greatly appreciated. Thank you!

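You have MIG mode enabled but no MIG devices defined. In that state the GPU is unusable for CUDA work, which is why cuInit fails even though nvidia-smi still detects the card. Either define MIG devices or disable MIG mode on the GPU.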
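
To confirm this state before changing anything, here is a minimal Python sketch that asks nvidia-smi for the MIG mode and counts the MIG devices it lists. The helper names (count_mig_devices, diagnose, check_gpu) are made up for illustration. It also assumes that `nvidia-smi -L` prints each MIG device on its own indented line starting with "MIG", so treat it as a rough check rather than an official tool.

```python
import subprocess


def count_mig_devices(listing: str) -> int:
    """Count MIG device lines in the text printed by `nvidia-smi -L`.

    Assumption: each MIG device appears on its own indented line that
    starts with "MIG", e.g. "  MIG 7g.80gb  Device  0: (UUID: MIG-...)".
    """
    return sum(1 for line in listing.splitlines() if line.strip().startswith("MIG "))


def diagnose(mig_mode: str, listing: str) -> str:
    """Classify the GPU state from the MIG mode string and the device listing."""
    if mig_mode.strip().lower() != "enabled":
        return "MIG is not enabled; the cuInit failure has another cause."
    devices = count_mig_devices(listing)
    if devices == 0:
        return "MIG is enabled but no MIG devices are defined; CUDA has nothing to run on."
    return f"MIG is enabled with {devices} MIG device(s)."


def check_gpu(index: int = 0) -> str:
    """Query nvidia-smi and return the diagnosis (needs the NVIDIA driver installed)."""
    def run(*args: str) -> str:
        result = subprocess.run(
            ["nvidia-smi", *args], capture_output=True, text=True, check=True
        )
        return result.stdout

    # MIG mode of the selected GPU: "Enabled", "Disabled", or "[N/A]".
    mode = run("-i", str(index), "--query-gpu=mig.mode.current", "--format=csv,noheader")
    # The listing covers every GPU in the host, so on a multi-GPU machine
    # this sketch only gives a coarse answer.
    listing = run("-L")
    return diagnose(mode, listing)
```

Running print(check_gpu(0)) on the affected machine should report whether this is the problem. If it is, one option is to turn MIG mode off with `sudo nvidia-smi -i 0 -mig 0` (the GPU may need a reset, with nothing else using it, before the change takes effect). The other is to keep MIG and create instances: list the available profiles with `sudo nvidia-smi mig -lgip`, then create one with `sudo nvidia-smi mig -cgi <profile> -C`.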