ncu does not detect kernels on RTX 5060 Ti: ==ERROR== The application returned an error code (11)

I tested the NVIDIA tools on a machine with an RTX 5060 Ti, but ncu does not work. The code I used is the vectorAddDrv sample from NVIDIA's GitHub repository: /cuda-samples/Samples/0_Introduction/vectorAddDrv.

../cuda-samples/Samples/0_Introduction/vectorAddDrv# ls
Makefile   data          vectorAddDrv.cpp  vectorAddDrv_vs2017.sln      vectorAddDrv_vs2019.sln      vectorAddDrv_vs2022.sln      vectorAdd_kernel.cu
README.md  vectorAddDrv  vectorAddDrv.o    vectorAddDrv_vs2017.vcxproj  vectorAddDrv_vs2019.vcxproj  vectorAddDrv_vs2022.vcxproj  vectorAdd_kernel64.fatbin
../cuda-samples/Samples/0_Introduction/vectorAddDrv# ncu vectorAddDrv
Vector Addition (Driver API)
==PROF== Connected to process 9499 (../cuda-samples/Samples/0_Introduction/vectorAddDrv/vectorAddDrv)
MapSMtoCores for SM 12.0 is undefined.  Default to use 128 Cores/SM
> Using CUDA Device [0]: NVIDIA GeForce RTX 5060 Ti
==ERROR== The application returned an error code (11).
==WARNING== No kernels were profiled.

I also tried several other samples, and they failed with the same error and warning.

Ubuntu version (running on Windows under WSL2):

../cuda-samples/Samples/0_Introduction/vectorAddDrv# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 22.04.5 LTS
Release:        22.04
Codename:       jammy

ncu version:

../cuda-samples/Samples/0_Introduction/vectorAddDrv# ncu --version
NVIDIA (R) Nsight Compute Command Line Profiler
Copyright (c) 2018-2024 NVIDIA Corporation
Version 2024.1.1.0 (build 33998838) (public-release)

nvcc version:

../cuda-samples/Samples/0_Introduction/vectorAddDrv# nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2024 NVIDIA Corporation
Built on Thu_Mar_28_02:18:24_PDT_2024
Cuda compilation tools, release 12.4, V12.4.131
Build cuda_12.4.r12.4/compiler.34097967_0

ncu 2024.1.1 does not support Blackwell GPUs. 2024.4 is the first release with Blackwell support (GB100), and 2025.3 is the first to support GB20x. Run ncu --list-chips to see which chips your installed version supports.
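
In case it helps, here is a minimal sketch of that --list-chips check as a shell snippet. It assumes the flag prints chip names as a comma-separated list, which may differ between ncu versions. The helper name chip_supported is hypothetical, and gb206 is only my guess at the chip name for an RTX 5060 Ti, so substitute the name that matches your GPU.

```shell
#!/bin/sh
# chip_supported CHIP
# Reads a comma-separated chip list on stdin (assumed output format of
# `ncu --list-chips`) and succeeds only if CHIP appears as a whole entry.
chip_supported() {
    tr ',' '\n' | tr -d ' \t\r' | grep -qixF -- "$1"
}

# Chip to look for. gb206 is an assumed name; pass another as the first argument.
chip="${1:-gb206}"

if command -v ncu >/dev/null 2>&1; then
    if ncu --list-chips | chip_supported "$chip"; then
        echo "This ncu lists $chip as supported."
    else
        echo "This ncu does not list $chip; a newer Nsight Compute is probably needed."
    fi
else
    echo "ncu was not found on PATH."
fi
```

The match is on whole entries and ignores case, so a shorter name cannot accidentally match a longer one. If the chip is missing from the list, installing a newer toolkit (or a standalone Nsight Compute) and making sure that ncu comes first on PATH is the thing to try.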

I switched to CUDA Toolkit 12.9 and retested the sample I posted; it works now. Thanks!