Converting for loops to CUDA


I have some C++ code that I want to port to CUDA. It looks like this:

for (*ihour = 0; *ihour < 24; (*ihour)++) {
    for (*i = 0; *i < 15000; (*i)++) {
        // code omitted
    }
}

I want to parallelize it, as efficiently as possible of course.

I am not familiar with blocks and grids, but this is how I have set up the indices so far:

*ihour = blockIdx.x * blockDim.x + threadIdx.x;

*i = blockIdx.x * blockDim.x + threadIdx.x;

and my launch configuration:

dim3 dimGrid(1500, 1), dimBlock(10, 24);


Is this correct at all?
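
For comparison, here is a minimal sketch of one way a 2D mapping could look, with the x dimension covering the 15000 values of i and the y dimension covering the 24 hours. It assumes every (ihour, i) iteration is independent of the others, which cannot be checked from the omitted loop body. The names hourKernel, divUp, launch and out are hypothetical, and the 64 x 4 block shape is only an assumed starting point, not a tuned value.

```cuda
#include <cuda_runtime.h>

constexpr int NUM_HOURS = 24;     // outer loop bound from the post
constexpr int NUM_ITEMS = 15000;  // inner loop bound from the post

// Hypothetical kernel: one thread per (ihour, i) pair.
// `out` is a placeholder buffer standing in for whatever the omitted body writes.
__global__ void hourKernel(float *out)
{
    int i     = blockIdx.x * blockDim.x + threadIdx.x;  // x dimension -> i
    int ihour = blockIdx.y * blockDim.y + threadIdx.y;  // y dimension -> ihour

    // The grid is rounded up, so some threads can land outside the loop ranges.
    if (i >= NUM_ITEMS || ihour >= NUM_HOURS) return;

    out[ihour * NUM_ITEMS + i] = 0.0f;  // placeholder for the omitted loop body
}

// Round-up integer division so the grid covers every index.
inline int divUp(int n, int d) { return (n + d - 1) / d; }

// Hypothetical host-side launch; d_out must be a device pointer holding
// NUM_HOURS * NUM_ITEMS floats.
void launch(float *d_out)
{
    dim3 dimBlock(64, 4);  // 256 threads per block (assumed, not tuned)
    dim3 dimGrid(divUp(NUM_ITEMS, dimBlock.x), divUp(NUM_HOURS, dimBlock.y));
    hourKernel<<<dimGrid, dimBlock>>>(d_out);
}
```

The point of the sketch is that the two loop variables are derived from different dimensions (x and y), so each thread gets a distinct (ihour, i) pair, and the bounds check makes the rounded-up grid safe.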

Thanks, CUDA people.