Cannot run SDK samples ('kernel execution failed')

I’m having problems getting the SDK 2.1 samples to compile and run on Fedora 10.

My setup:


Intel D915GAG mobo

Pentium 4 2.8GHz

3GB RAM (DDR400, not running in dualchannel mode)

GeForce 6200 (which of course doesn’t support CUDA but I’ll list anyway for completeness)

Tesla C870


Fedora 10

gcc 4.3

Nvidia driver 177.82 installed from the RPMFusion repository

Toolkit ‘cuda-linux-rel-nightly-2.1.1635-3065709’

SDK ‘cuda-sdk-linux-2.10.1126.1520-3141441’

packages freeglut and freeglut-devel 2.4

My compiling problems:

The projects threadMigration, matrixMulDrv and simpleTextureDrv will not compile. I get this error:

make[1]: Entering directory `/home/frank/NVIDIA_CUDA_SDK/projects/simpleTextureDrv'

/usr/bin/ld: cannot find -lcuda

collect2: ld returned 1 exit status

make[1]: *** [../../bin/linux/release/simpleTextureDrv] Error 1

make[1]: Leaving directory `/home/frank/NVIDIA_CUDA_SDK/projects/simpleTextureDrv'

make: *** [projects/simpleTextureDrv/Makefile.ph_build] Error 2

I have looked at the topic “error: cannot find -lcuda”, but it doesn’t solve my problem. I think I have all the files, links and paths correct, as shown by the following printouts:

[frank@localhost ~]$ ls /usr/local/cuda/lib

[frank@localhost ~]$ ls /usr/lib/nvidia

tls

[frank@localhost ~]$ set | grep PATH



[frank@localhost ~]$ ldconfig -p | grep cuda

(ELF) => /usr/lib/

(libc6) => /usr/local/cuda/lib/

(libc6) => /usr/local/cuda/lib/

(libc6) => /usr/local/cuda/lib/

(libc6) => /usr/local/cuda/lib/

(libc6) => /usr/local/cuda/lib/

(libc6) => /usr/local/cuda/lib/

(libc6) => /usr/lib/nvidia/

(libc6) => /usr/lib/nvidia/

(libc6) => /usr/local/cuda/lib/

(libc6) => /usr/local/cuda/lib/

(libc6) => /usr/local/cuda/lib/

(libc6) => /usr/local/cuda/lib/
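One thing worth checking here: `ld` resolves `-lcuda` by looking for a file named exactly `libcuda.so` (or `libcuda.a`) in its search directories, while the `ldconfig` cache only matters at run time. A minimal sketch of that check, assuming the directories from the printouts above; `find_libcuda` is a hypothetical helper, not part of the toolkit or SDK:

```shell
# Hypothetical helper: print every given directory that contains the
# unversioned name libcuda.so, which is what `ld -lcuda` searches for.
# Returns 0 if at least one match was found, 1 otherwise.
find_libcuda() {
    _rc=1
    for dir in "$@"; do
        if [ -e "$dir/libcuda.so" ]; then
            echo "$dir/libcuda.so"
            _rc=0
        fi
    done
    return $_rc
}

# Assumption: these are the relevant directories (taken from the printouts
# above); adjust the list for your own system.
find_libcuda /usr/lib /usr/lib/nvidia /usr/local/cuda/lib \
    || echo "no unversioned libcuda.so found in the checked directories"
```

If nothing is found but a versioned library is present, a commonly suggested workaround is to create a `libcuda.so` symlink next to it and pass that directory to the linker with `-L`. Whether that applies here is an assumption, since the listings above do not show the file names.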

After removing the three projects that don’t compile, the rest compile fine (barring a few compiler warnings). However, when I run the programs, I get the following error messages (I have run all of them and pasted each distinct message I get):


CUDA FFT Ocean Simulation

Left mouse button		  - rotate

Middle mouse button		- pan

Left + middle mouse button - zoom

'w' key					- toggle wireframe

cudaSafeCall() Runtime API error in file <oceanFFT.cpp>, line 273 : unknown error.


Allocating memory...

Generating host input data array...

Uploading input data to GPU memory...

Testing misaligned types...


cutilCheckMsg() CUTIL CUDA error: testKernel() execution failed

 in file <>, line 223 : invalid device function .


time spent executing by the GPU: 166.93

time spent by CPU in CUDA calls: 0.08

CPU executed 14899 iterations while waiting for GPU to finish




Loaded 'lena_bw.pgm', 512 x 512 pixels

cudaSafeCall() Runtime API error in file <>, line 500 : invalid texture reference.


Using single precision...

Using device 0: Tesla C870

Generating input data...

Running GPU binomial tree...

cudaSafeCall() Runtime API error in file <binomialOptions_kernel.cuh>, line 187 : invalid device symbol.


Allocating host and CUDA memory and loading image file...

Loading ./../../../projects/imageDenoising/data/portrait_noise.bmp...

BMP width: 320

BMP height: 408

BMP file loaded successfully!

Data init done.

Initializing GLUT...

Loading extensions: No error

OpenGL window created.

Creating GL texture...

Texture created.

Creating PBO...

cudaSafeCall() Runtime API error in file <imageDenoisingGL.cpp>, line 407 : unknown error.


This sample needs a card capable of OpenGL and display.

Please choose a different device with the -device=x argument.


simpleCUBLAS test running..

!!!! kernel execution error.


cufft: ERROR: /root/cuda-stuff/sw/rel/gpgpu/toolkit/r2.1/cufft/src/, line 1070


cufft: ERROR: /root/cuda-stuff/sw/rel/gpgpu/toolkit/r2.1/cufft/src/, line 151


cufftSafeCall() CUFFT error in file <>, line 127.

deviceQuery and bandwidthTest run fine:

There is 1 device supporting CUDA

Device 0: "Tesla C870"

  Major revision number:						 1

  Minor revision number:						 0

  Total amount of global memory:				 1610350592 bytes

  Number of multiprocessors:					 16

  Number of cores:							   128

  Total amount of constant memory:			   65536 bytes

  Total amount of shared memory per block:	   16384 bytes

  Total number of registers available per block: 8192

  Warp size:									 32

  Maximum number of threads per block:		   512

  Maximum sizes of each dimension of a block:	512 x 512 x 64

  Maximum sizes of each dimension of a grid:	 65535 x 65535 x 1

  Maximum memory pitch:						  262144 bytes

  Texture alignment:							 256 bytes

  Clock rate:									1.35 GHz

  Concurrent copy and execution:				 No


I do know that Fedora 10 is not officially supported, but it did run on my Fedora 8 installation (with Toolkit and SDK 1.1). I really have no idea how to proceed from this point. Anyone?

Install Fedora Core 9?

Well, I’d like to try it on 10 first. Just because 10 is not officially supported doesn’t mean that it absolutely won’t work.

Well, your post seems to indicate it does not work ;)
One thing I noticed is that all the projects that fail to compile use the driver API.
Projects using OpenGL will likely not run for you, as there was some change in 2.1. For me they only run when using a GPU that can do OpenGL itself (in my case an 8800GTX). Running these samples on a Tesla will not work at this time (I think the 2.1 SDK will fix that).

One way to make it work in FC10 could be to install the gcc from FC9. But I am afraid that will open up a whole can of worms with libraries and stuff, so if you do not need new features from FC10…

Good to see more people from NL doing CUDA. Are you doing medical imaging? (I was doing MRI stuff when I was at uni and have seen lots of CUDA related papers about MRI the last year and a half)

I am also running Fedora 10, and I have had success compiling the files as well as running the executables, so it does work to a degree. What I noticed in your setup that is different from mine is that you are declaring your paths differently.


export PATH

is how I do it. Then go to the project you want to compile and make sure you make a release version using make release=1. Note that you must be in the directory where the Makefile for your project is. Sometimes when you try to make it you will get Error 1 or Error 2. Set your paths again and you should be fine.
Now the only thing is that when I get the API projects to compile, I only get one frame per second lol. I will let you know if I make progress. So I think the previous post might be right… but I won’t give up on Fedora 10 yet and neither should YOU!
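The path setup described above can be sketched roughly as follows. This assumes the toolkit lives under /usr/local/cuda (the location that appears in the ldconfig output earlier in the thread); the exact values used in this post are not shown, and `CUDA_HOME` is just an illustrative variable name:

```shell
# Assumption: toolkit installed under /usr/local/cuda, with nvcc in bin/
# and the runtime libraries in lib/.
CUDA_HOME=/usr/local/cuda

# Prepend the toolkit directories; the ${VAR:+...} form avoids leaving a
# stray ':' when the variable was previously empty or unset.
export PATH="$CUDA_HOME/bin${PATH:+:$PATH}"
export LD_LIBRARY_PATH="$CUDA_HOME/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
```

Exports like these only last for the current shell session, which may be why the paths sometimes have to be set again before make succeeds.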