After installing the CUDA driver, everything went fine.
Here are some results for those interested, from deviceQuery and bandwidthTest
% ./deviceQuery
./deviceQuery Starting…
CUDA Device Query (Runtime API) version (CUDART static linking)
There is 1 device supporting CUDA
Device 0: “GeForce GT 330M”
CUDA Driver Version: 3.0
CUDA Runtime Version: 3.0
CUDA Capability Major revision number: 1
CUDA Capability Minor revision number: 2
Total amount of global memory: 536543232 bytes
Number of multiprocessors: 6
Number of cores: 48
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 16384 bytes
Total number of registers available per block: 16384
Warp size: 32
Maximum number of threads per block: 512
Maximum sizes of each dimension of a block: 512 x 512 x 64
Maximum sizes of each dimension of a grid: 65535 x 65535 x 1
Maximum memory pitch: 2147483647 bytes
Texture alignment: 256 bytes
Clock rate: 1.10 GHz
Concurrent copy and execution: Yes
Run time limit on kernels: Yes
Integrated: No
Support host page-locked memory mapping: Yes
Compute mode: Default (multiple host threads can use this device simultaneously)
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 53931, CUDA Runtime Version = 3.0, NumDevs = 1, Device = GeForce GT 330M
PASSED
% ./bandwidthTest --memory=pinned
[bandwidthTest]
./bandwidthTest Starting…
Running on…
Device 0: GeForce GT 330M
Quick Mode
Host to Device Bandwidth, 1 Device(s), Pinned memory, Write-Combined Memory Enabled
Transfer Size (Bytes) Bandwidth(MB/s)
33554432 2640.5
Device to Host Bandwidth, 1 Device(s), Pinned memory, Write-Combined Memory Enabled
Transfer Size (Bytes) Bandwidth(MB/s)
33554432 3206.5
Device to Device Bandwidth, 1 Device(s)
Transfer Size (Bytes) Bandwidth(MB/s)
33554432 8534.0
[bandwidthTest] - Test results:
PASSED