I’m trying to get a few machines set up for CUDA development. They are running CentOS 5.3 64-bit with the latest graphics drivers.
I installed the SDK and toolkit just fine, as far as I know, and managed to get the examples built with the makefile.
On two different machines I’m getting different problems, however…
On one machine with a CUDA-enabled card (Quadro FX 5800), I get the following error: libcudart.so is found but not in a recognized format.
On another machine, whose card is not CUDA-enabled (Quadro FX 4500), I get a segfault whenever the program reaches a CUDA runtime call (cudaMalloc, cudaMemcpy, etc.).
I can run the deviceQuery program on both, and they work as they should (the first machine reports the card’s stats, and the second reports a non-CUDA enabled card).
Any ideas as to what I’m missing? Or are there any instructions that I may have missed, specific to the OS or 64-bit architecture?
The libcudart.so error probably means you have unwittingly installed the 32-bit version of the toolkit, or the installation is corrupted somehow. I would reinstall the toolkit as a first step. 64-bit CentOS 5.3 works out of the box with CUDA. I have a whole cluster of compute nodes running on it, so I very much doubt your problem is related to the choice of OS or architecture.
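One quick way to test the 32-bit theory is to read the ELF header of the library itself. The following is only a rough sketch in Python 3: the helper name elf_class is made up for illustration, and the default path is an assumption about where the toolkit was installed, so pass the real location of libcudart.so as an argument.

```python
import sys

ELF_MAGIC = b"\x7fELF"


def elf_class(path):
    """Return 32 or 64 for an ELF file, or None if the file is not recognizable ELF."""
    with open(path, "rb") as f:
        header = f.read(5)
    if len(header) < 5 or header[:4] != ELF_MAGIC:
        return None  # truncated or corrupted file, or not an ELF object at all
    # Byte 4 of the ELF header (EI_CLASS): 1 = 32-bit object, 2 = 64-bit object
    return {1: 32, 2: 64}.get(header[4])


if __name__ == "__main__":
    # Assumed default install location; override it on the command line.
    default = "/usr/local/cuda/lib64/libcudart.so"
    path = sys.argv[1] if len(sys.argv) > 1 else default
    print(path, "->", elf_class(path))
```

A result of 32 on a 64-bit build would point to the wrong toolkit package or the wrong library directory being picked up, while None would suggest a damaged file.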
I installed the RHEL 5.3 64-bit toolkit (cudatoolkit_2.3_linux_64_rhel5.3.run). I’ve tried including /usr/local/cuda/lib64 in my LD_LIBRARY_PATH before I compile, but the problem persists.
CentOS 5 is at 5.4 now. I would do a “yum upgrade” to CentOS 5.4 first, then reinstall CUDA. We also have a GPU cluster using CentOS 5 with CUDA 2.3 and it works great.
Setting LD_LIBRARY_PATH has no effect on compilation; it is only a runtime setting. Check the makefile you are using to make sure it is providing the correct library path to the compiler/linker.
Upgrading to CentOS 5.4 will have no effect on this and I would not recommend it.