No result after increasing the size of Grid or Block

Dear all,

I have a problem: the output becomes incorrect after I increase the size of the grid or the block.
My settings are as follows:

const int BLOCK = 512;

int THREAD = static_cast<int>(ceilf(sqrt(static_cast<float>(VOLEX_RESOLUTION * VOLEX_RESOLUTION * VOLEX_RESOLUTION) / BLOCK)));

dim3 grid(THREAD, THREAD, 1);
dim3 threads(BLOCK, 1, 1);

gpu_depthGridCalculation<<<grid, threads>>>(dev_srcImgs, dev_cameraParameters, dev_volexGrid, imgRows, imgHeight, paraRows, paraCols);

int tid = threadIdx.x + threadIdx.y * blockDim.x + threadIdx.z * blockDim.x * blockDim.y + (blockIdx.x + blockIdx.y * gridDim.x + blockIdx.z * gridDim.x * gridDim.y) * blockDim.x * blockDim.y * blockDim.z;

When I set "BLOCK = 512, VOLEX_RESOLUTION = 200", the result is correct.
However, when I set either "BLOCK = 1024, VOLEX_RESOLUTION = 200" or "BLOCK = 512, VOLEX_RESOLUTION = 300", there is no output at all. It seems the GPU does not do any processing.

Can anyone give me some clues about this problem?
Thank you very much for the help.

What happens if you run your code with cuda-memcheck in the failing case?

Are you doing proper cuda error checking in your code?
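
For reference, "proper CUDA error checking" usually means checking the return value of every runtime call, and checking a kernel launch twice: once right after the launch (for configuration errors) and once after synchronizing (for errors during execution). Below is a minimal sketch of that pattern. The names `report`, `dummyKernel`, and `launchChecked` are hypothetical and not from the original code; the placeholder kernel only stands in for the real one.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: prints a message when a CUDA call fails,
// then passes the error code on unchanged.
static cudaError_t report(cudaError_t err, const char *what) {
    if (err != cudaSuccess)
        fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
    return err;
}

// Placeholder kernel standing in for the real one.
__global__ void dummyKernel(int *out, int n) {
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (tid < n) out[tid] = tid;  // guard threads past the end of the buffer
}

// Launches the placeholder kernel and returns the first error seen.
cudaError_t launchChecked(int blocks, int threadsPerBlock) {
    const int n = blocks * threadsPerBlock;
    int *dev_out = nullptr;

    cudaError_t err = report(cudaMalloc(&dev_out, n * sizeof(int)), "cudaMalloc");
    if (err != cudaSuccess) return err;

    dummyKernel<<<blocks, threadsPerBlock>>>(dev_out, n);

    // A bad launch configuration is reported here, before the kernel runs.
    err = report(cudaGetLastError(), "kernel launch");

    // Errors that occur while the kernel runs surface at the next sync.
    if (err == cudaSuccess)
        err = report(cudaDeviceSynchronize(), "kernel execution");

    cudaFree(dev_out);
    return err;
}
```

Calling `launchChecked(1, 256)` from `main` should succeed on a working device, while `launchChecked(1, 4096)` should print an invalid-configuration message, since 4096 threads per block is above the per-block limit of current GPUs. Without such checks, a failed launch is silent and the output buffer simply stays unchanged.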

It looks like you are on Windows, so you may also be running into a WDDM TDR event if your kernel execution takes too long.
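
Both possibilities (a launch shape the device rejects, or a watchdog timeout) can be narrowed down by querying the device. The sketch below reads a few limits with `cudaDeviceGetAttribute` and compares a launch shape against them. `LaunchLimits`, `queryLaunchLimits`, and `shapeFits` are hypothetical names introduced here, and the check is deliberately partial: it ignores the z dimension of the grid and per-kernel resource limits such as registers, which can also make a large block fail to launch.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical container for the limits this sketch looks at.
struct LaunchLimits {
    int maxThreadsPerBlock;
    int maxGridX;
    int maxGridY;
    int kernelTimeout;  // nonzero if kernels on this device have a run-time limit
};

// Fills `lim` for the given device; returns the first error encountered.
cudaError_t queryLaunchLimits(int device, LaunchLimits *lim) {
    cudaError_t err;
    err = cudaDeviceGetAttribute(&lim->maxThreadsPerBlock,
                                 cudaDevAttrMaxThreadsPerBlock, device);
    if (err != cudaSuccess) return err;
    err = cudaDeviceGetAttribute(&lim->maxGridX, cudaDevAttrMaxGridDimX, device);
    if (err != cudaSuccess) return err;
    err = cudaDeviceGetAttribute(&lim->maxGridY, cudaDevAttrMaxGridDimY, device);
    if (err != cudaSuccess) return err;
    return cudaDeviceGetAttribute(&lim->kernelTimeout,
                                  cudaDevAttrKernelExecTimeout, device);
}

// Partial check: total threads per block and the x/y grid dimensions only.
bool shapeFits(const LaunchLimits &lim, dim3 grid, dim3 block) {
    unsigned long long perBlock =
        (unsigned long long)block.x * block.y * block.z;
    return perBlock <= (unsigned long long)lim.maxThreadsPerBlock &&
           grid.x <= (unsigned)lim.maxGridX &&
           grid.y <= (unsigned)lim.maxGridY;
}
```

A caller would run `queryLaunchLimits(0, &lim)` once, print `lim.kernelTimeout` to see whether a watchdog applies, and test the intended `dim3 grid` and `dim3 threads` with `shapeFits` before launching. If the shape fits and the kernel still fails, the error code from the launch and the following synchronization is the next thing to look at.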

Dear txbob,

Thank you very much for your reply.

Your comment was very helpful to me.
The reason became clear once I added CUDA error checking.
Thank you for your help.