Mismatch between CUDA 12.6 and driver 535 on Ubuntu, and upgrading to driver 565 fails

Simple Thrust programs fail under CUDA 12.6 (cudaErrorUnsupportedPtxVersion: the provided PTX was compiled with an unsupported toolchain.), but when I update the NVIDIA driver from 535 to 565, I can no longer use my GPU (no CUDA-capable device is detected).

Details:

I installed Ubuntu 24.04 on a machine equipped with an NVIDIA GeForce RTX 3060 12 GB.
I let Ubuntu install the proprietary drivers, so I have the 535 NVIDIA driver.

nvidia-smi shows:

NVIDIA-SMI 535.183.01             
Driver Version: 535.183.01   
CUDA Version: 12.2

Then I installed CUDA using the Network Repo Installation described here: CUDA Installation Guide for Linux.

It worked and I now have CUDA 12.6.

nvcc -V shows:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2024 NVIDIA Corporation
Built on Tue_Oct_29_23:50:19_PDT_2024
Cuda compilation tools, release 12.6, V12.6.85
Build cuda_12.6.r12.6/compiler.35059454_0

Problem 1: when I run a simple Thrust program, I get this error:

terminate called after throwing an instance of 'thrust::THRUST_200500_520_NS::system::system_error'
  what():  parallel_for failed: cudaErrorUnsupportedPtxVersion: the provided PTX was compiled with an unsupported toolchain.

From what I see on the Internet, this seems to indicate a mismatch between CUDA and the driver.
However, this documentation states that CUDA 12.x needs at least driver 525, so there should be no mismatch?
https://docs.nvidia.com/deploy/cuda-compatibility/#cuda-11-and-later-defaults-to-minor-version-compatibility
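A minimal sketch for checking the two versions shown above against each other. The "CUDA Version" field in the nvidia-smi header is the highest CUDA version the installed driver supports (12.2 here), while nvcc reports the toolkit release (12.6 here). As far as I understand, minor-version compatibility does not cover PTX that the driver has to JIT-compile, which would be consistent with the cudaErrorUnsupportedPtxVersion error. The helper names below are made up for illustration, and `sort -V` assumes GNU coreutils.

```shell
#!/usr/bin/env bash
# Sketch: compare the CUDA version the driver supports with the toolkit release.
# All function names are hypothetical helpers, not part of any NVIDIA tool.

# Read nvidia-smi text on stdin and print e.g. "12.2" from "CUDA Version: 12.2".
driver_cuda_version() {
  sed -n 's/.*CUDA Version: *\([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' | head -n 1
}

# Read `nvcc -V` text on stdin and print e.g. "12.6" from "... release 12.6, V12.6.85".
toolkit_release() {
  sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\),.*/\1/p' | head -n 1
}

# Succeed (exit 0) when version $1 is strictly older than version $2.
version_lt() {
  [ "$1" != "$2" ] && [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Print a one-line verdict for a driver-supported version and a toolkit release.
check_versions() {
  local drv="$1" tk="$2"
  if version_lt "$drv" "$tk"; then
    echo "driver supports CUDA $drv, toolkit is $tk: PTX JIT may fail"
  else
    echo "driver supports CUDA $drv, toolkit is $tk: OK"
  fi
}

# Live check, only attempted when both tools are on the PATH.
if command -v nvidia-smi >/dev/null 2>&1 && command -v nvcc >/dev/null 2>&1; then
  check_versions "$(nvidia-smi | driver_cuda_version)" "$(nvcc -V | toolkit_release)"
fi
```

With the values from this post, `check_versions 12.2 12.6` reports that PTX JIT may fail.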

Anyway, I decided to upgrade my NVIDIA drivers to the latest 565.

Problem 2, nvidia-smi now shows:

NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

and executing my simple Thrust program gives:

terminate called after throwing an instance of 'thrust::THRUST_200500_520_NS::system::detail::bad_alloc'
  what():  std::bad_alloc: cudaErrorNoDevice: no CUDA-capable device is detected

I tried to upgrade the drivers using the Additional Drivers GUI, and also from the command line with apt.
I also tried other drivers like 560, 555, and 550, but I got the same issue.

What should I do to have a working version of CUDA with the appropriate drivers?

Since nvidia-smi is complaining, it would seem you don’t have a working driver. Are you running kernel 6.12.n by any chance?

No, I am running kernel 6.8.0-41.
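For anyone checking the same two things (which kernel is running, and whether the NVIDIA kernel module is actually loaded), here is a small sketch. The helper names are hypothetical. It reads /proc/modules, which uses the same leading module-name column as lsmod.

```shell
#!/usr/bin/env bash
# Sketch: report the running kernel series and whether the nvidia module is loaded.
# Function names are hypothetical helpers.

# Print the major.minor series of a kernel release string,
# e.g. "6.8" for "6.8.0-41-generic".
kernel_series() {
  printf '%s\n' "$1" | cut -d. -f1,2
}

# Succeed when lsmod-style text on stdin lists a module named exactly "nvidia".
nvidia_module_loaded() {
  grep -q '^nvidia '
}

echo "running kernel series: $(kernel_series "$(uname -r)")"

if [ -r /proc/modules ]; then
  if nvidia_module_loaded < /proc/modules; then
    echo "nvidia kernel module is loaded"
  else
    echo "nvidia kernel module is NOT loaded"
  fi
fi
```

If the module is not loaded and the driver was installed through DKMS, `dkms status` lists which kernels a module was built for.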

I solved my issue by reinstalling Ubuntu from scratch, without letting Ubuntu install the drivers, and doing things the other way around:

  • I first installed driver 565 using the Network Repo Installation method
  • and then I installed the CUDA toolkit, also using the Network Repo Installation method

I now have driver 565 + CUDA 12.6.
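The order above can be sketched as a dry-run script that only prints the commands. It assumes the NVIDIA network repository is already configured as described in the installation guide. The package names `cuda-drivers-565` and `cuda-toolkit-12-6` are my assumption about how that repository names the driver branch and the toolkit, so check them with `apt-cache search` before running anything. `install_plan` and the two variables are hypothetical names.

```shell
#!/usr/bin/env bash
# Sketch: print the install steps in the order that worked (driver, then toolkit).
# This is a dry run: nothing is installed, the commands are only echoed.

DRIVER_BRANCH="${DRIVER_BRANCH:-565}"        # driver branch from the post
TOOLKIT_VERSION="${TOOLKIT_VERSION:-12-6}"   # CUDA 12.6 in apt package-name form

# Emit one command per line; package names are assumptions, verify before use.
install_plan() {
  echo "sudo apt-get update"
  echo "sudo apt-get install -y cuda-drivers-${DRIVER_BRANCH}"
  echo "sudo apt-get install -y cuda-toolkit-${TOOLKIT_VERSION}"
}

install_plan
```

Afterwards, nvidia-smi and nvcc -V can be compared again to confirm the driver and toolkit agree.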
