CUDA driver version is insufficient for CUDA runtime version

I am evaluating the trial version of the CUDA Fortran compiler and trying to run the matrix multiplication sample (matmul_drv.F90 and matmul1.cuf). It compiles successfully, but at run time I get this error message:

Starting host calculation.
0: ALLOCATE: 1600 bytes requested; status = 35(CUDA driver version is insufficient for CUDA runtime version)

The array size is only 20 by 20.
What could be the problem?

My system configuration is:

Device 0: “Tesla M1060”
CUDA Driver Version: 3.10
CUDA Runtime Version: 3.10
Total amount of global memory: 4294770688 bytes
Number of multiprocessors: 30
Number of cores: 240
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 16384 bytes

You should upgrade your driver.
What is the output of “cat /proc/driver/nvidia/version”?

Which version of CUDA Fortran and which flags are you using to compile?

The version is here:

avallark@crunch:~$ cat /proc/driver/nvidia/version
NVRM version: NVIDIA UNIX x86_64 Kernel Module  280.13  Wed Jul 27 16:53:56 PDT 2011
GCC version:  gcc version 4.6.1 (Ubuntu/Linaro 4.6.1-9ubuntu3)

avallark@crunch:~$ lsmod | grep nvidia
nvidia              11713772  44

Any help would be highly appreciated. I am trying to run gputools in R and I am getting this error:

> gpuCor(A, B, method="pearson")
Error in gpuCor(A, B, method = "pearson") :
  CUDA driver version is insufficient for CUDA runtime version
In addition: Warning message:
In gpuCor(A, B, method = "pearson") : PMCC function : malloc and memcpy

What is the toolkit version?
If you have 4.1 installed, you will need a 285 driver, and your 280.13 driver is older than that. You can run an old toolkit on a new driver, but not vice versa.
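
For anyone who wants to script this check, here is a rough sketch in Python. It pulls the driver version out of the NVRM line printed by /proc/driver/nvidia/version and compares its major number against the minimum a toolkit needs. The function names and the lookup table are made up for illustration. The only entry in the table (toolkit 4.1 needs a 285 driver) is the one mentioned in this thread, so check the toolkit release notes before adding others.

```python
import re

# Hypothetical lookup table: minimum driver major version per toolkit.
# Only the "4.1" -> 285 entry comes from this thread; extend it from the
# release notes of the toolkit you actually have installed.
MIN_DRIVER_FOR_TOOLKIT = {"4.1": 285}


def parse_nvrm_version(text):
    """Return the driver version as (major, minor), or None if no NVRM line matches."""
    match = re.search(r"Kernel Module\s+(\d+)\.(\d+)", text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def driver_is_sufficient(text, toolkit):
    """Compare the driver major version against the table entry for `toolkit`."""
    driver = parse_nvrm_version(text)
    if driver is None:
        raise ValueError("no NVRM version line found")
    return driver[0] >= MIN_DRIVER_FOR_TOOLKIT[toolkit]


# The NVRM line posted above; on a live system you would read the file instead:
#   text = open("/proc/driver/nvidia/version").read()
sample = "NVRM version: NVIDIA UNIX x86_64 Kernel Module  280.13  Wed Jul 27 16:53:56 PDT 2011"

print(parse_nvrm_version(sample))           # (280, 13)
print(driver_is_sufficient(sample, "4.1"))  # False
```

With the 280.13 driver shown above, the check fails for toolkit 4.1, which matches the error you are seeing.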