Dear Forums,
I am trying to run CUDA on a headless RHEL 5.2 server box. It has a built-in ATI video card, and I have added a Quadro FX 4600 PCI Express board for CUDA. There is no X server running (and I prefer it that way).
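Since there is no X server to create the NVIDIA device nodes on demand, I set them up at boot with a script along the lines of the one in the CUDA release notes. A sketch of what I use (not verbatim; the lspci-based count and paths are adapted from the release notes, character major 195 is the nvidia driver):

```shell
#!/bin/sh
# Sketch: load the nvidia kernel module and create the /dev/nvidia* nodes
# that the CUDA runtime needs when no X server is running to create them.
/sbin/modprobe nvidia || exit 1

# One /dev/nvidiaN node per NVIDIA controller found on the PCI bus.
N=$(/sbin/lspci | grep -ic nvidia)
i=0
while [ "$i" -lt "$N" ]; do
    [ -c "/dev/nvidia$i" ] || mknod -m 666 "/dev/nvidia$i" c 195 "$i"
    i=$((i + 1))
done
[ -c /dev/nvidiactl ] || mknod -m 666 /dev/nvidiactl c 195 255
```

deviceQuery passing suggests the nodes themselves are fine, but I mention it in case the setup matters.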
I have downloaded and installed the driver, toolkit, and SDK that form the CUDA 2.0 release. When I run deviceQuery, things look good:
[codebox][root@ca3-1 release]# ./deviceQuery
There is 1 device supporting CUDA
Device 0: "Quadro FX 4600"
Major revision number: 1
Minor revision number: 0
Total amount of global memory: 805044224 bytes
Number of multiprocessors: 12
Number of cores: 96
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 16384 bytes
Total number of registers available per block: 8192
Warp size: 32
Maximum number of threads per block: 512
Maximum sizes of each dimension of a block: 512 x 512 x 64
Maximum sizes of each dimension of a grid: 65535 x 65535 x 1
Maximum memory pitch: 262144 bytes
Texture alignment: 256 bytes
Clock rate: 1.19 GHz
Concurrent copy and execution: No
Test PASSED
Press ENTER to exit...
[/codebox]
But when I run the other examples, they either fail or segfault, e.g.:
[codebox][root@ca3-1 release]# ./eigenvalues
Using device 0: Quadro FX 4600
Matrix size: 2048 x 2048
Precision: 0.000010
Iterations to be timed: 100
Result filename: 'eigenvalues.dat'
Gerschgorin interval: -2.894310 / 2.923303
Average time step 1: 224.800690 ms
Average time step 2, one intervals: 224.823013 ms
Average time step 2, mult intervals: 112.424240 ms
Average time TOTAL: 786.905884 ms
Segmentation fault
[root@ca3-1 release]# ./BlackScholes
Using device 0: Quadro FX 4600
Initializing data...
...allocating CPU memory for options.
...allocating GPU memory for options.
...generating input data in CPU mem.
...copying input data to GPU mem.
Data init done.
Executing Black-Scholes GPU kernel (512 iterations)...
Options count : 8000000
BlackScholesGPU() time : 111.513420 msec
Effective memory bandwidth: 0.717402 GB/s
Gigaoptions per second : 0.071740
Reading back GPU results...
Checking the results...
...running CPU calculations.
Comparing the results...
L1 norm: 1.000000E+00
Max absolute error: 9.574021E+01
TEST FAILED
Shutting down...
...releasing GPU memory.
...releasing CPU memory.
Shutdown done.
Press ENTER to exit...
[/codebox]
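If it helps, I can grab a backtrace from the eigenvalues crash; I was planning something like the following (a sketch, assuming the SDK samples are rebuilt with "make dbg=1" so host-side symbols are available):

```shell
# Sketch: re-run the segfaulting sample under gdb and print a backtrace
# non-interactively once it crashes.
gdb -batch -ex run -ex backtrace ./eigenvalues
```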
One thing that caught my eye: when I run glxinfo, it reports the ATI OpenGL implementation as the one in use. Shouldn't it be NVIDIA's?
[codebox][root@ca3-1 release]# glxinfo | egrep -e '(client|server|OpenGL)'
server glx vendor string: SGI
server glx version string: 1.2
server glx extensions:
client glx vendor string: NVIDIA Corporation
client glx version string: 1.4
client glx extensions:
OpenGL vendor string: ATI Technologies Inc.
OpenGL renderer string: ATI Radeon 9200 OpenGL Engine
OpenGL version string: 1.3 ATI-1.5.36
OpenGL extensions:
[/codebox]
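For what it's worth, here is how I have been checking which libGL the dynamic linker actually resolves (a sketch; the library directories are guesses for this distro). If an ATI/Mesa libGL.so shadows the NVIDIA one, glxinfo would report the ATI vendor string even with the NVIDIA client-side GLX libraries installed:

```shell
# Sketch: list every libGL copy visible to the dynamic linker. An ATI/Mesa
# libGL earlier in the search path would explain the ATI vendor string.
for d in /usr/lib /usr/lib64; do
    ls -l "$d"/libGL.so* 2>/dev/null
done
# Also ask the linker cache directly (path to ldconfig may vary);
# grep exits non-zero when nothing matches, which is not a failure here.
/sbin/ldconfig -p 2>/dev/null | grep libGL || true
```

I don't think the failing samples above use OpenGL at all, but the mixed ATI/NVIDIA libraries seemed worth flagging.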
Any help debugging from here would be hugely appreciated.