Help needed to get CUDA to work on my Mac Pro

Hi,

I contacted NVIDIA support about my problem and they sent me here, so I hope you guys can help me get CUDA to work on my Mac Pro. Here is my message to support:

I have a Mac Pro 4.1, equipped as follows:

  • Quad-Core Intel Xeon @ 2.93 GHz
  • 8GB RAM
  • NVIDIA GeForce GT 120 (drivers: GPU: 1.6.37.0 (256.02.25f01), CUDA: 5.0.61)
  • Mac OS X 10.6.8 Snow Leopard

I recently purchased a video editing plugin, which gave me the idea of using CUDA to speed up rendering. My problem is that I can’t get CUDA to work.
I tried the latest driver (the one I currently have installed), an older one that was suggested for Snow Leopard, and installing the CUDA Toolkit in addition (as suggested here: T32752 Cuda on OSX Mac Book Pro Retina GT 650M), but unfortunately none of these worked.

I checked the CUDA status in my editing software and in CUDA-Z, and both tell me that there is no CUDA device installed in my computer. The CUDA pane in System Preferences, however, shows both the installed GPU driver and the CUDA driver.

Any thoughts on how I can get this setup to work? I know the GT 120 is not the best bet performance-wise, but it should work, shouldn’t it?

Thanks a lot, any help is appreciated!

Malte

Can you please compile and run the CUDA sample named “deviceQueryDrv”? The 256.xx driver is only compatible with CUDA 4.2. If you want to use CUDA 5.x, I suggest updating to OS X 10.7.5 or later.

Thanks for your reply and sorry for the delay. This is all new to me, can you point me to a guide on how to compile that sample?

I have no preference about the CUDA version, I just downloaded the suggested stuff from the NVIDIA website to try to use it for rendering. Unfortunately I can’t upgrade that machine right now since there are apps running on it that are not available for Lion yet.

Thanks a lot!

Install the CUDA toolkit package with the default options, then run:
cd /Developer/NVIDIA/CUDA-5.5/Samples
make -C 1_Utilities/deviceQueryDrv
cd bin/x86_64/darwin/release
./deviceQueryDrv
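
If the samples are not at that exact path (the directory layout may differ between toolkit versions), a small search can help locate an already-built binary. This is only a minimal sketch: the function name find_device_query and the default root /Developer are assumptions chosen for illustration, not something the toolkit provides.

```shell
#!/bin/sh
# find_device_query [ROOT]
# Hypothetical helper: print every executable regular file under ROOT
# whose name starts with "deviceQuery" (deviceQuery, deviceQueryDrv).
# ROOT defaults to /Developer, which is an assumption about the install.
find_device_query() {
    root="${1:-/Developer}"
    # -perm -u+x keeps only files the owner may execute, which skips
    # source files such as deviceQuery.cpp.
    find "$root" -type f -name 'deviceQuery*' -perm -u+x 2>/dev/null
}

# Usage (prints nothing if no binary has been built yet):
#   find_device_query /Developer
```

Each printed path can then be run directly from the terminal.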

So, since I couldn’t get what you posted to work, I went ahead and removed everything CUDA-related, installed the 4.2 packages again, and re-installed Xcode. After locating all the paths in the old install I could run the query! =)
CUDA-Z still doesn’t recognise my GPU but the plugin does. Unfortunately the plugin is horribly slow with CPU and GPU compared to CPU only (a new graphics card is probably due soon), which makes it kinda useless for now. Anyway, thanks a lot for your assistance, having someone reply kept me going on the problem… =)

To complete this thread, this is my output for “deviceQuery”:

CUDA Device Query (Runtime API) version (CUDART static linking)

Found 1 CUDA Capable device(s)

Device 0: "GeForce GT 120"
CUDA Driver Version / Runtime Version 4.2 / 4.2
CUDA Capability Major/Minor version number: 1.1
Total amount of global memory: 512 MBytes (536543232 bytes)
( 4) Multiprocessors x ( 8) CUDA Cores/MP: 32 CUDA Cores
GPU Clock rate: 1400 MHz (1.40 GHz)
Memory Clock rate: 800 Mhz
Memory Bus Width: 128-bit
Max Texture Dimension Size (x,y,z) 1D=(8192), 2D=(65536,32768), 3D=(2048,2048,2048)
Max Layered Texture Size (dim) x layers 1D=(8192) x 512, 2D=(8192,8192) x 512
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 16384 bytes
Total number of registers available per block: 8192
Warp size: 32
Maximum number of threads per multiprocessor: 768
Maximum number of threads per block: 512
Maximum sizes of each dimension of a block: 512 x 512 x 64
Maximum sizes of each dimension of a grid: 65535 x 65535 x 1
Maximum memory pitch: 2147483647 bytes
Texture alignment: 256 bytes
Concurrent copy and execution: Yes with 1 copy engine(s)
Run time limit on kernels: Yes
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Concurrent kernel execution: No
Alignment requirement for Surfaces: Yes
Device has ECC support enabled: No
Device is using TCC driver mode: No
Device supports Unified Addressing (UVA): No
Device PCI Bus ID / PCI location ID: 5 / 0
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 4.2, CUDA Runtime Version = 4.2, NumDevs = 1, Device = GeForce GT 120
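
As a sanity check on output like the above, the driver version should be at least as new as the runtime version. The sketch below pulls both numbers out of the "CUDA Driver Version / Runtime Version" line and compares them. The function names versions_from_device_query and version_ge are made up for this example, and the parsing assumes the line looks exactly like the one in this thread.

```shell
#!/bin/sh
# versions_from_device_query: read deviceQuery output on stdin and print
# "DRIVER RUNTIME", e.g. "4.2 4.2". Assumes the line format shown above.
versions_from_device_query() {
    awk '/CUDA Driver Version \/ Runtime Version/ { print $(NF - 2), $NF; exit }'
}

# version_ge A B: succeed if major.minor version A is >= B.
# Major and minor are compared as integers, not as one decimal number.
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        split(a, x, "."); split(b, y, ".")
        if (x[1] + 0 != y[1] + 0) exit !(x[1] + 0 > y[1] + 0)
        exit !(x[2] + 0 >= y[2] + 0)
    }'
}

# Usage, with the output saved to a file named query.txt (hypothetical):
#   set -- $(versions_from_device_query < query.txt)
#   version_ge "$1" "$2" && echo "driver $1 covers runtime $2"
```

For the output in this thread both values are 4.2, so the check passes.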

Yeah, compared to your quad-core Xeon CPU, a GT 120 is not particularly fast.
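
A rough back-of-the-envelope comparison, using the core counts and clocks posted in this thread. The per-cycle figures are assumptions on my part: 2 single-precision FLOPs per CUDA core per cycle (one multiply-add) and 8 per Xeon core per cycle (a 4-wide SSE add plus a 4-wide SSE multiply). These are theoretical peaks only; real rendering speed also depends on memory bandwidth and on copying data to and from the card. The helper name peak_gflops is made up for this sketch.

```shell
#!/bin/sh
# peak_gflops UNITS CLOCK_GHZ FLOPS_PER_CYCLE
# Theoretical peak = units * clock (GHz) * FLOPs per unit per cycle.
peak_gflops() {
    awk -v n="$1" -v ghz="$2" -v f="$3" 'BEGIN { printf "%.1f\n", n * ghz * f }'
}

# GT 120: 32 CUDA cores at 1.40 GHz, assuming 2 FLOPs per core per cycle
peak_gflops 32 1.40 2   # prints 89.6

# Quad-core Xeon at 2.93 GHz, assuming 8 FLOPs per core per cycle
peak_gflops 4 2.93 8    # prints 93.8
```

Under these assumptions the two are about even on paper, so there is little headroom left for the GPU once transfer overhead is added.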