I am evaluating the trial version of the CUDA Fortran compiler and trying to run the matrix multiplication sample (matmul_drv.F90 and matmul1.cuf). It compiles successfully, but when I run it I get this error message:
Starting host calculation.
0: ALLOCATE: 1600 bytes requested; status = 35(CUDA driver version is insufficient for CUDA runtime version)
The arrays are only 20 by 20, so the requested allocation is small.
What could be the problem?
My system configuration is:
Device 0: "Tesla M1060"
CUDA Driver Version: 3.10
CUDA Runtime Version: 3.10
Total amount of global memory: 4294770688 bytes
Number of multiprocessors: 30
Number of cores: 240
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 16384 bytes