NSight 2.2.012313 not stopping on breakpoints in kernel code? VS 2010 Ultimate SP1, Win 7 Ultimate 6

Hello, all
Can you help me? I am out of ideas.
Here is my kernel:

__global__ void rotatePicture90Degress(unsigned char *original, unsigned char *rotated)
{
	// Find the column (x coordinate) of the data item
	int x = threadIdx.x + blockDim.x*blockIdx.x;
	// Find the row (y coordinate) of the data item
	int y = threadIdx.y + blockDim.y*blockIdx.y;
	// Calculate exact linear offset based on row and column from previous two lines for original matrix
	int offset = x + y*blockDim.x * gridDim.x;
	// Calculate offset for second, rotated matrix 
	int offset2 = y+x*blockDim.x * gridDim.x;

	/*if(offset < 1600*1200*3 && offset2 < 1600*1200*3)*/
		rotated[offset2]=original[offset];

}

Here is the call from main():

rotatePicture90Degress<<<BlockPerGrid,ThreadsPerGrid>>>(dev_cuda_image,dev_cuda_90_degress);

and the dimensions used for the kernel call:

dim3 BlockPerGrid(450,400,0);
dim3 ThreadsPerGrid(8,4,0);

Can you tell me why CUDA NSight is not stopping on this line in the kernel:

rotated[offset2]=original[offset];

Is it stopping on any other lines in the kernel? If not, make sure GPU debug symbols are being generated (the nvcc -G flag).

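I guess the z dimension of your kernel configuration parameters should be 1, not 0. A zero in any dimension makes the launch configuration invalid, so the launch fails, the kernel never runs, and the breakpoint is never reached.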
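To check whether the launch itself is failing, it may help to look at the error status right after the kernel call. The following is only a minimal sketch, not the poster's program: the `noop` kernel and the `launch` helper are hypothetical stand-ins, and the grid and block sizes are copied from the question. It launches once with z = 0 and once with z = 1 and prints the status each time.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in kernel; it exists only so there is something to launch.
__global__ void noop(unsigned char *data)
{
    // A single thread writes one byte, so the kernel has an observable effect.
    if (blockIdx.x == 0 && blockIdx.y == 0 && threadIdx.x == 0 && threadIdx.y == 0)
        data[0] = 1;
}

// Hypothetical helper: launches the kernel and returns the resulting status.
static cudaError_t launch(dim3 grid, dim3 block, unsigned char *data)
{
    noop<<<grid, block>>>(data);
    // Reports problems with the launch itself, such as an invalid configuration.
    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess)
        // Waits for the kernel and reports errors raised while it was running.
        err = cudaDeviceSynchronize();
    return err;
}

int main()
{
    unsigned char *dev = NULL;
    if (cudaMalloc(&dev, 1) != cudaSuccess) {
        printf("cudaMalloc failed\n");
        return 1;
    }

    // Dimensions as posted in the question: z = 0 in both grid and block.
    cudaError_t bad = launch(dim3(450, 400, 0), dim3(8, 4, 0), dev);
    printf("z = 0: %s\n", cudaGetErrorString(bad));

    // Same x and y sizes with z = 1.
    cudaError_t good = launch(dim3(450, 400, 1), dim3(8, 4, 1), dev);
    printf("z = 1: %s\n", cudaGetErrorString(good));

    cudaFree(dev);
    return 0;
}
```

If the first line reports an error and the second does not, the breakpoint was never hit simply because the kernel was never launched. Note that the dim3 constructor defaults any unspecified dimension to 1, so writing dim3 BlockPerGrid(450,400); is enough for a 2D launch.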