Kernel Function

I'm using Windows 7, a dual-core Core 2 CPU, and a GTX 9800 as my platform. Somehow my function below outputs 0, which should not happen (I tested the same computation with host functions, which gave non-zero results).
Here is my code:

    if (x < XDIM) {
        for (int y = 0; y < YDIM; y++) {
            int i = int(dev_resamp[y*2]);
            d_in[y].x = dev_b[dev_frame*YDIM*XDIM + x*YDIM + i]
                      - dev_resamp[y*2+1] * (dev_b[dev_frame*YDIM*XDIM + x*YDIM + i + 1]
                                             - dev_b[dev_frame*YDIM*XDIM + x*YDIM + i]) / k_resampledspacing;
            d_in[y].y = 0;
        }
    }
    else {
        for (int y = 0; y < YDIM; y++) {
            d_in[y].x = dev_b[dev_frame*YDIM*XDIM + x*YDIM + y];
            d_in[y].y = 0;
        }
    }

x is set to threadIdx.x. Can somebody spot the bug in this? Thank you.

Ah… another blinding piece of code… I’m sorry, I’d like to help, but my eyes really refuse to read that.

Please check the error code after your kernel launch using cudaGetLastError and cudaGetErrorString. If there’s no error, you have Parallel Nsight and cuda-gdb to help you debug. Of course, you could even use printf to help you debug if you are too lazy to use any of the debugging tools.

Windows -> Parallel Nsight, Linux/Mac -> cuda-gdb
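
For reference, the error check suggested above could look roughly like the sketch below. This is only an illustration, not the original poster's code: fill_kernel, check, and launch_and_check are hypothetical names, and the kernel is a trivial stand-in. It also assumes a toolkit that provides cudaDeviceSynchronize (older toolkits used cudaThreadSynchronize for the same purpose).

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in kernel; the real kernel from the question would go here.
__global__ void fill_kernel(float *out, int n)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    if (x < n) out[x] = 1.0f;
}

// Returns true when err is cudaSuccess; otherwise prints the error string.
bool check(cudaError_t err, const char *what)
{
    if (err != cudaSuccess) {
        fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
        return false;
    }
    return true;
}

// Launches the stand-in kernel and reports whether every step succeeded.
// threads_per_block must be positive.
bool launch_and_check(int threads_per_block)
{
    const int n = 256;
    float *d_out = NULL;
    if (!check(cudaMalloc((void **)&d_out, n * sizeof(float)), "cudaMalloc"))
        return false;

    int blocks = (n + threads_per_block - 1) / threads_per_block;
    fill_kernel<<<blocks, threads_per_block>>>(d_out, n);

    // Launch-time errors (for example an invalid configuration) show up here.
    bool ok = check(cudaGetLastError(), "kernel launch");
    // Errors raised while the kernel runs show up after synchronizing.
    ok = check(cudaDeviceSynchronize(), "kernel execution") && ok;

    cudaFree(d_out);
    return ok;
}
```

If the launch itself is rejected, the kernel never runs and the output buffer keeps whatever it held before, which can look like a result of 0.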

Thank you for the reply. I actually don't know how to use Parallel Nsight for debugging. Whenever the code stops, it automatically asks to use the Visual Studio debugging tool.
Another thing that bothers me is that when I change the number of threads to below 1000, the code stops. Are there any errors related to this kind of thing? Thank you.
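
On the thread-count question: if the number mentioned above is the number of threads per block, it may be worth checking it against the device limit. Devices of compute capability 1.x (which, as far as I know, includes the 9800 GTX) allow at most 512 threads per block, and a launch that exceeds the limit fails without running the kernel. A minimal sketch for querying the limit and sizing the grid follows; max_threads_per_block and blocks_for are hypothetical helper names, not part of the CUDA API.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Returns the per-block thread limit of device dev, or -1 on failure.
int max_threads_per_block(int dev)
{
    cudaDeviceProp prop;
    cudaError_t err = cudaGetDeviceProperties(&prop, dev);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaGetDeviceProperties failed: %s\n",
                cudaGetErrorString(err));
        return -1;
    }
    return prop.maxThreadsPerBlock;
}

// Hypothetical helper: number of blocks needed to cover n elements
// when each block has threads_per_block threads (rounds up).
int blocks_for(int n, int threads_per_block)
{
    return (n + threads_per_block - 1) / threads_per_block;
}
```

With more than one block, the kernel would need to compute its index as blockIdx.x * blockDim.x + threadIdx.x rather than threadIdx.x alone.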