Converting for loops to CUDA

Hello,

I have some C++ code that I want to port to CUDA. It looks like this:

for (*ihour = 0; *ihour < 24; (*ihour)++) {
    for (*i = 0; *i < 15000; (*i)++) {
        // ... code omitted ...
    }
}
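
In case it helps frame the question: assuming the omitted body for one (ihour, i) pair does not depend on any other pair (I have not checked this for every line), the nest is just 24 * 15000 = 360000 independent work items. Here is a minimal host-side C++ sketch of that flattening. The names kHours, kCells, Item, unflatten and flatten are made up for illustration and are not from the real code:

```cpp
#include <cassert>

// Loop bounds taken from the loop nest above.
constexpr int kHours = 24;
constexpr int kCells = 15000;
constexpr int kItems = kHours * kCells;  // total number of (ihour, i) pairs

// One (ihour, i) pair, held by value so no two work items share a counter.
struct Item {
    int ihour;
    int i;
};

// Map a flat index in [0, kItems) back to the pair the nested loops would visit.
inline Item unflatten(int idx) {
    assert(idx >= 0 && idx < kItems);
    return Item{idx / kCells, idx % kCells};
}

// Inverse mapping: the position of (ihour, i) in the original loop order.
inline int flatten(int ihour, int i) {
    return ihour * kCells + i;
}
```

If that independence assumption holds, each flat index could be handed to one thread. If it does not hold, the dependent part would have to stay serial.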

I want to parallelize it as efficiently as possible.

I am not familiar with blocks and grids, but this is how I have set up the indices so far:

*ihour = blockIdx.x * blockDim.x + threadIdx.x;

*i = blockIdx.x * blockDim.x + threadIdx.x;

and my kernel launch:

dim3 dimGrid(1500, 1), dimBlock(10, 24);

prog<<<dimGrid,dimBlock>>>

Is this correct at all?
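
To check my own understanding without a GPU, here is a host-side C++ sketch that walks every block and thread of the launch above and counts how often each (ihour, i) pair is produced. Dim3, Coverage and count_coverage are hypothetical stand-ins written for this test (they are not CUDA API), and the second mapping, which takes ihour from the y dimension, is only my assumption about what the setup should be:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for CUDA's dim3, just enough for this host-side check.
struct Dim3 {
    int x;
    int y;
};

struct Coverage {
    int covered;   // distinct (ihour, i) pairs hit at least once
    int max_hits;  // largest number of threads that landed on the same pair
};

// Enumerate every (block, thread) of a launch and record which (ihour, i)
// pair each thread would compute. If hour_from_y is false, both indices come
// from the x dimension, as in the post; if true, ihour uses the y dimension.
Coverage count_coverage(Dim3 grid, Dim3 block, bool hour_from_y,
                        int hours = 24, int cells = 15000) {
    std::vector<int> hits(static_cast<std::size_t>(hours) * cells, 0);
    for (int by = 0; by < grid.y; ++by)
    for (int bx = 0; bx < grid.x; ++bx)
    for (int ty = 0; ty < block.y; ++ty)
    for (int tx = 0; tx < block.x; ++tx) {
        int x = bx * block.x + tx;  // blockIdx.x * blockDim.x + threadIdx.x
        int y = by * block.y + ty;  // blockIdx.y * blockDim.y + threadIdx.y
        int i = x;
        int ihour = hour_from_y ? y : x;
        // Bounds guard: threads outside the loop ranges do no work.
        if (ihour < hours && i < cells)
            ++hits[static_cast<std::size_t>(ihour) * cells + i];
    }
    Coverage c{0, 0};
    for (int h : hits) {
        if (h > 0) ++c.covered;
        if (h > c.max_hits) c.max_hits = h;
    }
    return c;
}
```

Calling count_coverage(Dim3{1500, 1}, Dim3{10, 24}, false) reports only 24 distinct pairs (the ones where ihour equals i), each produced by 24 threads. Passing true instead reports all 360000 pairs, each produced exactly once. So if I have this right, my x-only indexing misses almost all of the work.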

Thanks, CUDA people.