Grid size problems: incorrect computation results

Hi!

I am generating fractals using CUDA.

In the kernel I determine the offsets as follows:

    int x = blockIdx.x*blockDim.x + threadIdx.x;
    int y = blockIdx.y*blockDim.y + threadIdx.y;
    int offset = x + y*blockDim.x*gridDim.x; // index into the bitmap

and I launch the kernel:

    const int tpb = 64; // threads per block
    dim3 grid(w/tpb, h); // w and h are the dimensions of the fractal
    dim3 block(tpb, 1);

...

kernel<<<grid,block>>>(...)

The problem is that if I divide the h variable as well, let's say by 64 (which is what I wanted to do in the first place!), then I get a completely black picture instead of a beautiful fractal.

On the other hand, if I launch the kernel as stated above (w/tpb, h), then I get a correct picture.

What is the matter?

Is it possibly the MAX_THREADS_PER_BLOCK limit (which is 1024 on my card, while 64x64 is 4096)?

Regards

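You have correctly identified the cause of the problem: a 64x64 block is 4096 threads, which exceeds the 1024-threads-per-block limit of your card, so the launch fails and the kernel never runs. I wonder what makes you ask in the forum first instead of just fixing it?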
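For reference, here is a minimal sketch of a 2D launch that stays under the limit and reports a failed launch instead of silently producing a black image. The kernel body, the helper names ceil_div and launch_fractal, and the one-byte-per-pixel bitmap layout are assumptions made for illustration, not the original poster's code. Because the grid is rounded up here, the offset uses the real image width w rather than blockDim.x*gridDim.x.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in for the fractal kernel: writes one byte per pixel.
__global__ void kernel(unsigned char *bitmap, int w, int h)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= w || y >= h) return;  // the grid is rounded up, so guard the edges
    int offset = x + y * w;        // row-major index into the w*h bitmap
    bitmap[offset] = (unsigned char)((x ^ y) & 0xFF);  // placeholder pattern, not a fractal
}

// Round-up integer division, so partial tiles at the image edges are covered.
static int ceil_div(int a, int b) { return (a + b - 1) / b; }

// Launches the kernel on a device buffer of w*h bytes and returns the first error seen.
static cudaError_t launch_fractal(unsigned char *d_bitmap, int w, int h)
{
    dim3 block(16, 16);  // 256 threads per block, well under a 1024 limit
    dim3 grid(ceil_div(w, (int)block.x), ceil_div(h, (int)block.y));
    kernel<<<grid, block>>>(d_bitmap, w, h);

    // An oversized block (for example 64x64) is reported here as a launch error.
    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess) {
        fprintf(stderr, "kernel launch failed: %s\n", cudaGetErrorString(err));
        return err;
    }
    // Errors raised while the kernel runs surface at the next synchronization.
    err = cudaDeviceSynchronize();
    if (err != cudaSuccess)
        fprintf(stderr, "kernel execution failed: %s\n", cudaGetErrorString(err));
    return err;
}
```

Calling launch_fractal(d_bitmap, w, h) with a buffer obtained from cudaMalloc should fill the bitmap for any w and h, not only multiples of the block size. If the block is changed to dim3 block(64, 64), the cudaGetLastError check is where the invalid launch configuration shows up.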