I have installed the following NVIDIA devices in one system:
- NVIDIA GeForce 9800 GT graphics card
- Tesla K40m
- Tesla M2050
In the Windows 10 Device Manager I can see that all three devices are properly initialized, and the 9800 GT works as expected as the graphics card (Driver Date: 17/08/2015, Driver Version: 9.18.13.4181). Both Teslas use the same driver (Driver Date: 31/01/2018, Driver Version: 23.21.13.9085).
I would like to run some of the CUDA Samples v10.1, namely deviceQuery and deviceQueryDrv from 1_Utilities.
When I run deviceQuery, I get the following output (an error):
C:\ProgramData\NVIDIA Corporation\CUDA Samples\v10.1\1_Utilities\deviceQuery../…/bin/win64/Debug/deviceQuery.exe Starting…
CUDA Device Query (Runtime API) version (CUDART static linking)
cudaGetDeviceCount returned 35
→ CUDA driver version is insufficient for CUDA runtime version
Result = FAIL
C:\ProgramData\NVIDIA Corporation\CUDA Samples\v10.1\1_Utilities\deviceQuery../…/bin/win64/Debug/deviceQuery.exe (process 7280) exited with code 1.
To automatically close the console when debugging stops, enable Tools->Options->Debugging->Automatically close the console when debugging stops.
Press any key to close this window . . .
When I run deviceQueryDrv, I get the following output:
C:\ProgramData\NVIDIA Corporation\CUDA Samples\v10.1\1_Utilities\deviceQueryDrv../…/bin/win64/Debug/deviceQueryDrv.exe Starting…
CUDA Device Query (Driver API) statically linked version
Detected 3 CUDA Capable device(s)
Device 0: “Tesla K40m”
CUDA Driver Version: 6.5
CUDA Capability Major/Minor version number: 3.5
Total amount of global memory: 11520 MBytes (12079398912 bytes)
(15) Multiprocessors, (192) CUDA Cores/MP: 2880 CUDA Cores
GPU Max Clock rate: 745 MHz (0.75 GHz)
Memory Clock rate: 3004 Mhz
Memory Bus Width: 384-bit
L2 Cache Size: 1572864 bytes
Max Texture Dimension Sizes 1D=(65536) 2D=(65536, 65536) 3D=(4096, 4096, 4096)
Maximum Layered 1D Texture Size, (num) layers 1D=(16384), 2048 layers
Maximum Layered 2D Texture Size, (num) layers 2D=(16384, 16384), 2048 layers
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per multiprocessor: 2048
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
Texture alignment: 512 bytes
Maximum memory pitch: 2147483647 bytes
Concurrent copy and kernel execution: Yes with 2 copy engine(s)
Run time limit on kernels: No
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Concurrent kernel execution: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Enabled
CUDA Device Driver Mode (TCC or WDDM): TCC (Tesla Compute Cluster Driver)
Device supports Unified Addressing (UVA): Yes
cuDeviceGetAttribute returned 1
→ CUDA_ERROR_INVALID_VALUE
C:\ProgramData\NVIDIA Corporation\CUDA Samples\v10.1\1_Utilities\deviceQueryDrv../…/bin/win64/Debug/deviceQueryDrv.exe (process 7904) exited with code 0.
To automatically close the console when debugging stops, enable Tools->Options->Debugging->Automatically close the console when debugging stops.
Press any key to close this window . . .
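To make the apparent mismatch explicit, here is a minimal Python sketch of the comparison that seems to be failing. It assumes CUDA's usual integer encoding of versions (1000 × major + 10 × minor), so the values 6050 and 10010 are my reading of the "CUDA Driver Version: 6.5" line above and of the v10.1 samples, not numbers printed by the tools. The function names are hypothetical helpers, not part of any NVIDIA API.

```python
def decode_cuda_version(encoded):
    """Split a CUDA version integer (1000*major + 10*minor) into (major, minor)."""
    return encoded // 1000, (encoded % 1000) // 10


def driver_supports_runtime(driver_encoded, runtime_encoded):
    """A runtime needs a driver whose supported CUDA version is at least as new."""
    return driver_encoded >= runtime_encoded


# Assumed encodings: 6.5 as reported by deviceQueryDrv, 10.1 for the samples.
driver_encoded = 6050
runtime_encoded = 10010

print("driver supports CUDA", decode_cuda_version(driver_encoded))
print("runtime built for CUDA", decode_cuda_version(runtime_encoded))
print("compatible:", driver_supports_runtime(driver_encoded, runtime_encoded))
```

Under these assumptions the check comes out false, which would match the "CUDA driver version is insufficient for CUDA runtime version" message from deviceQuery.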
My question: what do I need to do to make the CUDA samples work? Which drivers should I uninstall or install? This is probably a compatibility issue between the installed drivers and the CUDA 10.1 runtime.
Best regards,
Simon