Memory access violation to global memory.

I have a problem with the following code. I launch the kernel with dimGrid(1,1,1) and dimBlock(21,21,0). The code below is where the memory access violations occur. I am debugging with Nsight in VS2012, and the debugger breaks at
static __forceinline__ int __float2int_rz(float in)
return __nv_float2int_rz(in);
in device_functions.h. So weird. Could anyone help me with this? Many thanks in advance.

The arrays and sizes are: fInput_dR[254*254], fInput_dRx[254*254], fInput_dRy[254*254], iSubsetW = iSubsetH = 33, iNumberX = iNumberY = 21, fRDescent[33*33*6], fdPXY[21*21*2].

//Fill the gray intensity value to subset R
for(int l=0; l<iSubsetH; l++){
	for(int m=0; m<iSubsetW; m++){
		fSubsetR[l*iSubsetW+m] = fInput_dR[int(fdPXY[0] - iSubsetY+l)*width+int(fdPXY[1] - iSubsetX+m)];
		fSubAveR += (fSubsetR[l*iSubsetW+m]/float(iSubsetH * iSubsetW));
		//Evaluate the Jacobian dW/dp at(x,0)
		fJacobian[0][0] = 1.0, fJacobian[0][1] = float(m-iSubsetX), fJacobian[0][2] = float(l-iSubsetY), fJacobian[0][3] = 0.0, fJacobian[0][4] = 0.0, fJacobian[0][5] = 0.0;
		fJacobian[1][0] = 0.0, fJacobian[1][1] = 0.0, fJacobian[1][2] = 0.0, fJacobian[1][3] = 1.0, fJacobian[1][4] = float(m-iSubsetX), fJacobian[1][5] = float(l-iSubsetY);
		for(int k=0; k<6; k++){
			fRDescent[(l*iSubsetW+m)*6+k] = fInput_dRx[int(fdPXY[0] - iSubsetY+l)*width+int(fdPXY[1] - iSubsetX+m)]*fJacobian[0][k]
												   +fInput_dRy[int(fdPXY[0] - iSubsetY+l)*width+int(fdPXY[1] - iSubsetX+m)]*fJacobian[1][k];
		}
		for(int k=0; k<6; k++){
			for(int n=0; n<6; n++){
				fHessianXY[k][n] = fRDescent[(l*iSubsetW+m)*6+k] * fRDescent[(l*iSubsetW+m)*6+n];//Hessian matrix at each point
				fHessian[k][n] += fHessianXY[k][n];
			}
		}
	}
}

Comment out lines until the violation goes away.

Sorry, but I’m not going to mentally parse your code to ponder memory boundaries.

Expand your code to separately compute those complex array indices and store them in an index variable.
That will allow you to see if they’re going out of range more easily in the debugger. You can also set some asserts on the index value to make sure it’s not negative or beyond your array allocation size.

Something like:

int indexX=int(fdPXY[0] - iSubsetY+l)*width+int(fdPXY[1] - iSubsetX+m);
assert(indexX>=0 && indexX< INPUT_ARRAY_SIZE);

... fInput_dRx[indexX] ...

You could just examine the expression in the debugger, but seeing the subexpressions for the indices explicitly will make it easier to spot.
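
A slightly fuller sketch of the same idea, in case it helps. It assumes iSubsetY and iSubsetX are the half-sizes of the subset (16 for a 33-pixel subset) and that the image is 254 x 254 as in the question; the names subsetIndex and subsetInBounds are made up for illustration and do not come from the original code. It is plain C++ so it can be checked on the host without a GPU; prefixing both functions with __host__ __device__ should let the kernel call them too. As far as I know, the int(...) cast on a float in device code is what ends up in __float2int_rz, which may be why the debugger stops there.

```cpp
#include <cassert>

// Hypothetical helper: flattened index of subset pixel (l, m) in an image
// that is `width` pixels wide. centreY/centreX play the role of fdPXY[0] and
// fdPXY[1]; halfH/halfW are ASSUMED to be what iSubsetY/iSubsetX hold.
// int(...) truncates toward zero, like the cast in the kernel.
inline int subsetIndex(float centreY, float centreX,
                       int halfH, int halfW,
                       int l, int m, int width)
{
    int row = int(centreY - halfH + l);
    int col = int(centreX - halfW + m);
    return row * width + col;
}

// Hypothetical helper: true when every pixel of the subsetH x subsetW window
// lies inside a width x height image. Row and column are checked separately,
// because a column that runs off the right edge can still give a flattened
// index that is inside the allocation.
inline bool subsetInBounds(float centreY, float centreX,
                           int halfH, int halfW,
                           int subsetH, int subsetW,
                           int width, int height)
{
    for (int l = 0; l < subsetH; ++l) {
        for (int m = 0; m < subsetW; ++m) {
            int row = int(centreY - halfH + l);
            int col = int(centreX - halfW + m);
            if (row < 0 || row >= height || col < 0 || col >= width) {
                return false;
            }
        }
    }
    return true;
}
```

With these, calling subsetInBounds(fdPXY[0], fdPXY[1], 16, 16, 33, 33, 254, 254) before the loops (or asserting on it) tells you right away whether the subset centre is too close to the image border. A centre at (5, 5), for example, gives a first row of 5 - 16 = -11, which is already out of range.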