GPUmat issue

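I am having difficulty installing GPUmat. Since the documentation is limited, I was wondering whether anyone has encountered the errors shown in the GPUstart output below, in particular the undefined symbol cublasCgemm and the failure to load the kernels in cudalib20.cubin.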

Any suggestions would be greatly appreciated.

GPUstart

GPUmat, Copyright (C) 2012 GP-you Group (http://gp-you.org)

By using GPUmat, you accept all the terms and conditions
specified in the license.txt file.

Please send any suggestion or bug report to gp-you@gp-you.org.

Starting GPU

  • GPUmat version: 0.280
  • Required CUDA version: 4.1
    There are 2 devices supporting CUDA
    CUDA Driver Version: 4.10
    CUDA Runtime Version: 4.0

Device 0: “Tesla C2075”
CUDA Capability Major revision number: 2
CUDA Capability Minor revision number: 0
Total amount of global memory: 1341587456 bytes

Device 1: “Tesla C2075”
CUDA Capability Major revision number: 2
CUDA Capability Minor revision number: 0
Total amount of global memory: 1341587456 bytes

  • Your system has multiple GPUs installed
    → Please specify the GPU device number to use [0-1]:
    Invalid MEX-file ‘/N/u/glnxa64/bin/GPUmanagerCreate.mexa64’: /N/u/glnxa64/bin/GPUmanagerCreate.mexa64: undefined symbol: cublasCgemm
    Unable to load the kernels in file /N/u/glnxa64/cuda/cudalib20.cubin. Running system diagnostics.

    *** GPUmat system diagnostics
  • Running on → “glnxa64”
  • Matlab ver. → “7.14.0.739 (R2012a)”
  • GPUmat version → 0.280
  • GPUmat build → 07-Feb-2012
  • GPUmat architecture → “glnxa64”

*** ARCHITECTURE TEST
*** GPUmat architecture test → passed.

*** CUDA TEST
*** GPUmatSystemCheck INTERNAL ERROR. PLEASE REPORT TO gp-you@gp-you.org.

*** GPUmat device check
There are 2 devices supporting CUDA
CUDA Driver Version: 4.10
CUDA Runtime Version: 4.0

Device 0: “Tesla C2075”
CUDA Capability Major revision number: 2
CUDA Capability Minor revision number: 0
Total amount of global memory: 1341587456 bytes
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 32768
Warp size: 32
Maximum number of threads per block: 1024
Maximum sizes of each dimension of a block: 1024 x 1024 x 64
Maximum sizes of each dimension of a grid: 65535 x 65535 x 65535
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Clock rate: 1.15 GHz
Concurrent copy and execution: No
Run time limit on kernels: Yes
Integrated: No
Support host page-locked memory mapping: Yes
Compute mode: Unknown

Device 1: “Tesla C2075”
CUDA Capability Major revision number: 2
CUDA Capability Minor revision number: 0
Total amount of global memory: 1341587456 bytes
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 32768
Warp size: 32
Maximum number of threads per block: 1024
Maximum sizes of each dimension of a block: 1024 x 1024 x 64
Maximum sizes of each dimension of a grid: 65535 x 65535 x 65535
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Clock rate: 1.15 GHz
Concurrent copy and execution: No
Run time limit on kernels: Yes
Integrated: No
Support host page-locked memory mapping: Yes
Compute mode: Unknown
Error using GPUstart (line 160)
Unable to load the kernels in file /N/u/glnxa64/cuda/cudalib20.cubin.

While I have no experience with GPUmat (or with using Matlab with GPUs, for that matter), it stands out that the output says GPUmat requires CUDA 4.1, while the installed CUDA runtime is only 4.0.
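To make that comparison concrete, here is a minimal Python sketch that pulls the two version lines out of the pasted GPUstart output and compares them. The helper names (`parse_version`, `runtime_satisfies_requirement`) are hypothetical and not part of GPUmat. Treating "Required CUDA version" as a minimum is an assumption; GPUmat may in fact need an exact match.

```python
import re


def parse_version(text, label):
    """Return (major, minor) from the first 'label: X.Y' found in text, or None."""
    match = re.search(re.escape(label) + r":\s*(\d+)\.(\d+)", text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def runtime_satisfies_requirement(log_text):
    """True if the reported CUDA runtime is at least the required version.

    Assumption: 'required' is read as a minimum version, not an exact match.
    """
    required = parse_version(log_text, "Required CUDA version")
    runtime = parse_version(log_text, "CUDA Runtime Version")
    if required is None or runtime is None:
        raise ValueError("version lines not found in log text")
    # Tuples compare element by element: (4, 0) < (4, 1).
    return runtime >= required


# Excerpt copied from the GPUstart output in the question.
sample_log = """\
  GPUmat version: 0.280
  Required CUDA version: 4.1
  CUDA Driver Version: 4.10
  CUDA Runtime Version: 4.0
"""

print(runtime_satisfies_requirement(sample_log))
```

For the log in the question this reports that the runtime (4.0) does not meet the requirement (4.1), which would be the first thing to rule out before looking further into the cublasCgemm error.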