Small CPU algo doesn't work with CUDA

Hi everybody,

I have a problem with a cubic spline interpolation algorithm.

I have to interpolate some points in a CUDA simulation, so I am reusing a small piece of code that I already use in my CPU version. On the CPU, it works great.

With CUDA, the interpolation brings my GPU driver down.

Here is the code that doesn't work with CUDA:

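[codebox]__device__ double cubicspline_inter(double *x,
                                    double *y,
                                    double *der2,
                                    double intx,
                                    int dimX)
{
    int klo, khi, k;
    double h, b, a;
    int n = dimX;

    // Bracket intx between x[klo] and x[khi] by bisection.
    // The initial bounds were missing from the posted code; these are the
    // usual choices for 0-based arrays.
    klo = 0;
    khi = n - 1;
    while (khi - klo > 1) {
        k = (khi + klo) >> 1;
        if (x[k] > intx) khi = k;
        else klo = k;
    }

    // Interval width and linear weights (also missing from the posted code;
    // restored from the standard cubic spline evaluation formula).
    h = x[khi] - x[klo];
    a = (x[khi] - intx) / h;
    b = (intx - x[klo]) / h;

    // Clamp to the end values outside the tabulated range.
    if (intx < x[0])
        return y[0];
    else if (intx > x[dimX-1])
        return y[dimX-1];

    return a*y[klo]+b*y[khi]+((a*a*a-a)*der2[klo]+(b*b*b-b)*der2[khi])*(h*h)/6.0;
}[/codebox]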




If you have any idea about it…

Too many registers? Too many operations?

My arrays are correctly allocated within a __global__ function that calls this __device__ function, so I do not have any out-of-range index.



“Doesn’t work” means slow? Or it returns the wrong answers?

Your code will be horribly inefficient on the GPU due to the random, uncoalesced, and even repeated memory accesses.
The most efficient way to search such an array likely depends on the size of the array, whether multiple threads are searching the same range, and whether the data can be rearranged from a flat list into a hierarchical table.

“Doesn’t work” means “I have to reboot my computer, because the graphics driver freezes”.

I know it is not efficient, but my first objective was to port it naively, in order to have a reference.

Besides, I have already found an optimized version of cubic spline interpolation with CUDA, provided by Danny Ruijters.

I found my bug.
It has nothing to do with CUDA itself, but with the class that was provided with it.
An array was not allocated (it was empty), so CUDA crashed.
I was too quick to blame CUDA instead of the programmer who gave me the class… My mistake.
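
For anyone hitting the same kind of freeze, here is a minimal sketch of a sanity check that could run before the interpolation is called. It is only an illustration: `spline_inputs_valid` is a hypothetical helper, not part of the original code or of the class mentioned above, and it assumes that an unallocated array shows up as a null pointer or a size below 2. It cannot detect a dangling or wrongly sized pointer.

```cpp
// Let the same source build as plain host C++ or with nvcc.
#ifndef __CUDACC__
#define __host__
#define __device__
#endif

// Hypothetical helper (assumption, not from the original post): returns true
// only if the three spline tables are non-null and hold at least two knots.
// It never dereferences the pointers, so it can be called on the host with
// device pointers right before a kernel launch, or inside device code.
__host__ __device__ inline bool spline_inputs_valid(const double *x,
                                                    const double *y,
                                                    const double *der2,
                                                    int dimX)
{
    if (x == nullptr || y == nullptr || der2 == nullptr)
        return false;   // at least one table was never allocated
    if (dimX < 2)
        return false;   // no interval to interpolate in
    return true;
}
```

One possible use is to call it on the host just before the kernel launch and skip the launch (with an error message) when it returns false, so that an empty array is reported instead of freezing the driver.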