CUDA pointer dereferencing issue: integer pointer dereferencing in a __global__ function halts the PC

I am developing a program using CUDA SDK 3.1 on a GT 240 NVIDIA card with 1 GB of memory. In this program:

1. A kernel receives a pointer to a 2D int array of size 3000x6 as an input argument. This input array has already been allocated with cudaMalloc and filled with cudaMemcpy.
2. The kernel has to sort it on up to 3 levels (1st, 2nd and 3rd columns).
3. For this purpose, the kernel declares an array of 3000 integer pointers.
4. The kernel then populates the pointer array with pointers to the locations in the input array, in sorted order.
5. Finally, the kernel copies the input array into an output array by dereferencing the pointer array.

This last step fails and halts the PC.
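For readers following along, the steps above might look roughly like the sketch below. It is not the poster's code, which was never shown. It assumes a single thread does all the work, uses a plain insertion sort, and caps the row count at a hypothetical MAX_ROWS so the per-thread pointer array stays small. The names rowLess, sortRowsKernel and sortRowsOnDevice are made up for illustration.

```cuda
#include <cuda_runtime.h>

#define COLS 6        // columns per row, as in the question
#define MAX_ROWS 32   // hypothetical small cap, not the 3000 rows from the question

// True if row a sorts before row b, comparing columns 0, 1 and 2 in turn.
__device__ bool rowLess(const int *a, const int *b)
{
    for (int c = 0; c < 3; ++c) {
        if (a[c] != b[c]) return a[c] < b[c];
    }
    return false;
}

// Single-thread sketch: build an array of row pointers, sort the pointers,
// then copy the rows to the output by dereferencing them.
__global__ void sortRowsKernel(const int *in, int *out, int rows)
{
    if (blockIdx.x != 0 || threadIdx.x != 0 || rows > MAX_ROWS) return;

    const int *rowPtr[MAX_ROWS];   // each entry is a device address inside 'in'
    for (int r = 0; r < rows; ++r) rowPtr[r] = in + r * COLS;

    // Insertion sort on the pointers; the input rows themselves are not moved.
    for (int i = 1; i < rows; ++i) {
        const int *key = rowPtr[i];
        int j = i - 1;
        while (j >= 0 && rowLess(key, rowPtr[j])) {
            rowPtr[j + 1] = rowPtr[j];
            --j;
        }
        rowPtr[j + 1] = key;
    }

    // Copy in sorted order by dereferencing the pointer array.
    for (int r = 0; r < rows; ++r)
        for (int c = 0; c < COLS; ++c)
            out[r * COLS + c] = rowPtr[r][c];
}

// Hypothetical host helper: returns 0 on success, -1 on bad input or CUDA error.
int sortRowsOnDevice(const int *hostIn, int *hostOut, int rows)
{
    if (rows <= 0 || rows > MAX_ROWS) return -1;
    size_t bytes = (size_t)rows * COLS * sizeof(int);
    int *dIn = NULL, *dOut = NULL;
    if (cudaMalloc((void **)&dIn, bytes) != cudaSuccess) return -1;
    if (cudaMalloc((void **)&dOut, bytes) != cudaSuccess) { cudaFree(dIn); return -1; }

    cudaError_t err = cudaMemcpy(dIn, hostIn, bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        sortRowsKernel<<<1, 1>>>(dIn, dOut, rows);
        // The blocking copy waits for the kernel and reports its errors.
        err = cudaMemcpy(hostOut, dOut, bytes, cudaMemcpyDeviceToHost);
    }
    cudaFree(dIn);
    cudaFree(dOut);
    return err == cudaSuccess ? 0 : -1;
}
```

Every pointer stored in rowPtr is derived from the device pointer in, so the final dereference reads device memory. Checking the return value of each CUDA call, as the helper does, is one way to see where a failing version goes wrong.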

Q1) What are the guidelines for pointer dereferencing inside a kernel (__global__ function)?

Even a small array of 20x2 does not work correctly this way. The same code works well outside the kernel, i.e. in a standard C program.

Q2) Isn't it supposed to work the same way as in standard C, using the '*' operator, or is there some CUDA API that has to be used for it?

The rules are the same as in standard C, apart from the fact that there are separate address spaces for host and device, and different types of memory on the device.

Can you post the kernel code?
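To illustrate the address-space point, here is a minimal sketch, assuming nothing about the original kernel. The names readThrough and roundTrip are hypothetical. The kernel uses the ordinary '*' operator, and the only CUDA-specific requirement shown is that the pointer it receives holds a device address obtained from cudaMalloc.

```cuda
#include <cuda_runtime.h>

// Reads one int through a pointer. The pointer must hold a device address.
__global__ void readThrough(const int *p, int *result)
{
    *result = *p;   // same '*' operator as in standard C
}

// Hypothetical helper: copies hostValue to the device, lets the kernel
// dereference the device pointer, and copies the result back.
cudaError_t roundTrip(int hostValue, int *hostResult)
{
    int *dValue = NULL, *dResult = NULL;
    cudaError_t err = cudaMalloc((void **)&dValue, sizeof(int));
    if (err != cudaSuccess) return err;
    err = cudaMalloc((void **)&dResult, sizeof(int));
    if (err != cudaSuccess) { cudaFree(dValue); return err; }

    err = cudaMemcpy(dValue, &hostValue, sizeof(int), cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        // Passing &hostValue here instead of dValue would hand the kernel
        // a host address, which it cannot dereference.
        readThrough<<<1, 1>>>(dValue, dResult);
        err = cudaGetLastError();   // reports launch failures
    }
    if (err == cudaSuccess) {
        // The blocking copy waits for the kernel and reports its errors.
        err = cudaMemcpy(hostResult, dResult, sizeof(int), cudaMemcpyDeviceToHost);
    }
    cudaFree(dValue);
    cudaFree(dResult);
    return err;
}
```

A call such as roundTrip(42, &r) should leave 42 in r when every step succeeds. A returned error code other than cudaSuccess points to the step that failed.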
