CUDA HELP

rchan101 · August 16, 2010, 4:49pm

Hi guys,

I am trying to run a simple CUDA Hello World program on an NVIDIA Tesla. I am certain the Tesla is connected, because /bin/lspci shows a long list of devices ending with the Tesla units, the last of which is:
30:00.0 3D controller: nVidia Corporation G80 [Tesla C870] (rev a2)

However, when I run the basic Hello World program, which is supposed to mangle the string on the device (GPU) and then print the mangled result on the host, the string is never mangled. The Hello World I am trying to get working is the first example found on The Official NVIDIA Forums.
I know that the code for the Hello World is correct.

In short, nvcc blah.cu compiles without error, but when I execute ./a.out the output is never mangled, implying that the GPU code is never executed. Only the CPU code works.

Am I supposed to specify the port to which the Tesla is connected, or does the computer detect it automatically? If I do have to specify it, how? All the examples I see online seem to assume that the system knows which port the GPU is connected to.

Also, wrapper functions such as CUDA_SAFE_CALL do not seem to work. Is that because wrapper functions need a special header file to be included in the code?

Thanks in advance for any help!
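
On the CUDA_SAFE_CALL question: as far as I know, that macro comes from cutil.h in the SDK samples rather than from the toolkit itself, so it is only available when that header is included. A minimal sketch of a self-written replacement that needs nothing but the runtime header is below. The names cudaOk and CHECK_CUDA are made up for this example and are not part of CUDA.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper (not part of CUDA): prints a message when a runtime
// call fails and returns true only when the call succeeded.
static bool cudaOk(cudaError_t err, const char *what, const char *file, int line)
{
    if (err == cudaSuccess)
        return true;
    std::fprintf(stderr, "%s:%d: %s failed: %s\n",
                 file, line, what, cudaGetErrorString(err));
    return false;
}

// Hypothetical macro name, chosen so it cannot clash with the SDK's CUDA_SAFE_CALL.
#define CHECK_CUDA(call) cudaOk((call), #call, __FILE__, __LINE__)

int main()
{
    char *d_buf = NULL;

    // Every runtime call returns a cudaError_t; wrapping it makes failures visible.
    if (!CHECK_CUDA(cudaMalloc((void **)&d_buf, 16)))
        return 1;
    if (!CHECK_CUDA(cudaFree(d_buf)))
        return 1;

    std::printf("runtime calls succeeded\n");
    return 0;
}
```

If the runtime cannot reach a device, the first wrapped call should print the reason and the program exits with a non-zero status instead of carrying on silently.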

avidday · August 16, 2010, 5:52pm

Do you have the driver installed, and if so what versions of the toolkit and driver are you using?

rchan101 · August 16, 2010, 6:19pm

The toolkit version is 3.1
nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2010 NVIDIA Corporation
Built on Mon_Jun__7_18:56:31_PDT_2010
Cuda compilation tools, release 3.1, V0.2.1221

When I tried the nvidia-settings command to find the driver version, I got:
nvidia-settings: error while loading shared libraries: libgtk-x11-2.0.so.0: cannot open shared object file: No such file or directory

I think that may be because the SDK is not installed.

avidday · August 16, 2010, 7:34pm

That has nothing to do with the SDK; it means GTK isn’t installed. You can get the driver version by executing

cat /proc/driver/nvidia/version

from any shell/command line.

rchan101 · August 16, 2010, 8:05pm

The output is:
cat /proc/driver/nvidia/version
NVRM version: NVIDIA UNIX x86_64 Kernel Module 256.44 Thu Jul 29 01:22:44 PDT 2010
GCC version: gcc version 4.1.2 20080704 (Red Hat 4.1.2-46)

avidday · August 16, 2010, 8:35pm

That looks OK. If you have /dev/nvidia* entries with appropriate permissions, it should work. At this point it might be worth getting the SDK, then building and running the deviceQuery sample.
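
If getting the SDK is a hassle, a rough stand-in for deviceQuery can be written against the runtime API alone. This is only a sketch, not the SDK sample: the function name listDevices is made up here, and the exact error reported when no device is reachable depends on the toolkit and driver versions.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: prints every device the runtime can see.
// Returns the device count, or -1 if the runtime reported an error.
static int listDevices()
{
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaGetDeviceCount failed: %s\n",
                     cudaGetErrorString(err));
        return -1;
    }

    for (int i = 0; i < count; ++i) {
        cudaDeviceProp prop;
        err = cudaGetDeviceProperties(&prop, i);
        if (err != cudaSuccess) {
            std::fprintf(stderr, "cudaGetDeviceProperties(%d) failed: %s\n",
                         i, cudaGetErrorString(err));
            return -1;
        }
        // totalGlobalMem is in bytes; shift by 20 to report whole megabytes.
        std::printf("Device %d: %s, compute capability %d.%d, %lu MB\n",
                    i, prop.name, prop.major, prop.minor,
                    (unsigned long)(prop.totalGlobalMem >> 20));
    }
    return count;
}

int main()
{
    // Exit status 0 only when at least one device was listed.
    return listDevices() > 0 ? 0 : 1;
}
```

Build it with nvcc like any other .cu file. An error or a count of zero here means the problem is in the driver setup, not in the Hello World code.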

rchan101 · August 16, 2010, 8:42pm

I ran the deviceQuery from http://gpucoder.livejournal.com/1064.html
and it reported 0 CUDA devices, even though the Tesla is clearly connected!
What is the /dev/nvidia folder you mention? I don’t think I have that.

avidday · August 16, 2010, 8:53pm

This:

avidday@cuda:~$ ls -l /dev/nvidia*
crw-rw-rw- 1 root root 195,   0 2010-08-15 09:24 /dev/nvidia0
crw-rw-rw- 1 root root 195,   1 2010-08-15 09:24 /dev/nvidia1
crw-rw-rw- 1 root root 195, 255 2010-08-15 09:24 /dev/nvidiactl

If you don’t have those, something is wrong. Are you running X11? If not, you need to (re)read the toolkit release notes. There is information about setting up an init script which will create those device file entries at boot time.
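
For anyone who wants to check those entries from a program rather than with ls, here is a small host-only sketch. It makes no CUDA calls, so it builds with nvcc or any C++ compiler on Linux. The helper name nodeUsable is made up for this example, and the two paths are simply the ones from the listing above, assuming a single GPU at /dev/nvidia0.

```cuda
#include <cstdio>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical helper: true if path is a character device that the
// current user can both read and write.
static bool nodeUsable(const char *path)
{
    struct stat st;
    if (stat(path, &st) != 0) {
        std::perror(path);              // typically "No such file or directory"
        return false;
    }
    if (!S_ISCHR(st.st_mode)) {
        std::fprintf(stderr, "%s: not a character device\n", path);
        return false;
    }
    if (access(path, R_OK | W_OK) != 0) {
        std::perror(path);              // typically "Permission denied"
        return false;
    }
    return true;
}

int main()
{
    // Assumed paths: the control node plus the first GPU.
    const char *nodes[] = { "/dev/nvidiactl", "/dev/nvidia0" };
    int unusable = 0;
    for (int i = 0; i < 2; ++i)
        if (!nodeUsable(nodes[i]))
            ++unusable;

    if (unusable == 0)
        std::printf("device nodes look usable\n");
    return unusable;
}
```

This needs no root access to run; it only reports what is missing or unreadable, it does not create anything.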

rchan101 · August 17, 2010, 5:08pm

Thanks, that helped a lot to identify the problem. The instructions I followed, which add the script to the .profile file in order to start X11 automatically, didn’t work. I wonder why CUDA never shows an error when I run nvcc blah.cu and then ./a.out; it just doesn’t do anything on the CUDA device. Is there an easier way to launch the application without admin rights? Admin rights take a while to get approved.
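
On why nothing reports an error: nvcc only compiles the code, and at run time the CUDA runtime reports failures through return codes rather than by stopping the program. A kernel launch returns nothing at all, so unless the program asks for the error, a failed GPU step just leaves the buffer untouched. The sketch below shows one way to surface those errors. The kernel and the function name mangleOnDevice are made up for illustration and are not the Hello World from the first post. It uses cudaDeviceSynchronize; the 3.1 toolkit in this thread predates that call, and cudaThreadSynchronize played the same role there.

```cuda
#include <cstdio>
#include <cstring>
#include <cuda_runtime.h>

// Hypothetical kernel: shifts each character by one so a change is visible.
__global__ void mangle(char *s, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        s[i] += 1;
}

// Hypothetical helper: mangles a host buffer in place on the GPU and
// returns the first error encountered (cudaSuccess if none).
static cudaError_t mangleOnDevice(char *host, int n)
{
    if (n <= 0)
        return cudaSuccess;             // nothing to do

    char *dev = NULL;
    cudaError_t err = cudaMalloc((void **)&dev, n);
    if (err != cudaSuccess)
        return err;

    err = cudaMemcpy(dev, host, n, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        const int threads = 64;
        const int blocks = (n + threads - 1) / threads;
        mangle<<<blocks, threads>>>(dev, n);
        err = cudaGetLastError();       // errors from the launch itself
    }
    if (err == cudaSuccess)
        err = cudaDeviceSynchronize();  // errors raised while the kernel ran
    if (err == cudaSuccess)
        err = cudaMemcpy(host, dev, n, cudaMemcpyDeviceToHost);

    cudaFree(dev);
    return err;
}

int main()
{
    char text[] = "Hello World!";
    cudaError_t err = mangleOnDevice(text, (int)std::strlen(text));
    if (err != cudaSuccess) {
        std::fprintf(stderr, "GPU step failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    std::printf("%s\n", text);
    return 0;
}
```

With checks like these, a machine where the runtime cannot reach the device should fail at the first call with a readable message instead of printing the unmangled string.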

rchan101 · August 17, 2010, 6:34pm

Thanks so so much avidday, my code is connected to the CUDA device all thanks to you! You’re a genius :]
