Problem about Grid-Block-Thread Dimension

Hi all, I have a problem with choosing and managing the dimensions of the grid, blocks and threads.

I have a GeForce 9300M GS. This device reports:

Grid Dimension: 65535x 65535y 1z
Threads Dimension: 512x 512y 64z
Threads per Block: 512

Why doesn't CUDA allow me to allocate the maximum grid dimension or block dimension? And how can I find out at runtime the maximum possible allocation?

Also, how can I get gridIdx.x? There are gridDim, blockDim, blockIdx.x/y/z and threadIdx.x/y/z, but what about gridIdx.x/y/z?

Sorry if this is a stupid question, and sorry for my bad English.

Thanks for any answers.

Hi, I only started with CUDA today, but I am reading the Programming Guide and just got past the thread hierarchy chapter, which fits your questions.

  • I don't know. Are you sure about the numbers? Maybe posting your code would help here.

  • I don't follow this one, sorry.

  • I don't think you can address the grid itself, but you can address each block within the grid, and through that each thread.

Hm, I hope that gives you a hint.

Hi,

If you are writing an application that needs to work on a range of NVIDIA cards, then call cudaGetDeviceProperties from your host code. This will return the maximum grid size and the other limits you need.

cudaGetDeviceProperties is described in the CUDA reference manual, which you can download from the NVIDIA web site.

Note that the maximum of blockDim.x * blockDim.y * blockDim.z is 512. So a 512 x 512 x 64 block is not allowed, even though each dimension is within its own limit.
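The block-size rule can be checked on the host before launching. The sketch below is plain C++ and does not call the CUDA runtime: DeviceLimits, limits_from_question, block_fits and grid_fits are hypothetical names, and the numbers are the ones quoted in the question. In a real program the same three limits would be read from the maxThreadsDim, maxThreadsPerBlock and maxGridSize fields of the cudaDeviceProp struct that cudaGetDeviceProperties fills in.

```cpp
#include <cassert>

// Hypothetical stand-in for the limits that cudaGetDeviceProperties reports
// (maxThreadsDim, maxThreadsPerBlock and maxGridSize in cudaDeviceProp).
struct DeviceLimits {
    int maxThreadsDim[3];
    int maxThreadsPerBlock;
    int maxGridSize[3];
};

// Assumption: the values quoted in the question for a GeForce 9300M GS.
inline DeviceLimits limits_from_question() {
    return DeviceLimits{{512, 512, 64}, 512, {65535, 65535, 1}};
}

// A block shape fits only if every axis is within its own limit
// AND the total number of threads is within maxThreadsPerBlock.
inline bool block_fits(const DeviceLimits& lim, int x, int y, int z) {
    assert(x > 0 && y > 0 && z > 0);
    if (x > lim.maxThreadsDim[0] || y > lim.maxThreadsDim[1] ||
        z > lim.maxThreadsDim[2]) {
        return false;
    }
    long long total = 1LL * x * y * z;
    return total <= lim.maxThreadsPerBlock;
}

// A grid shape only has per-axis limits.
inline bool grid_fits(const DeviceLimits& lim, int x, int y, int z) {
    assert(x > 0 && y > 0 && z > 0);
    return x <= lim.maxGridSize[0] && y <= lim.maxGridSize[1] &&
           z <= lim.maxGridSize[2];
}
```

With these limits, block_fits accepts 512 x 1 x 1 and 16 x 16 x 2 (both 512 threads) but rejects 512 x 512 x 64, which is why the per-axis maxima cannot all be used at once.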

In my programs I have found that the maximum number of blocks I can run is determined by the size of my arrays in global memory. E.g. if I have 10 arrays each of 200MB, that's about 50M cells each (4 bytes per cell), and with 512 threads per block that means I only need about 100,000 blocks, so a grid of about 320 x 320 x 1.
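Working the arithmetic through for one 200MB array gives roughly a 320 x 320 grid. The sketch below assumes 4-byte cells (implied by 200MB giving about 50M cells) and one thread per cell. GridPlan and plan_grid are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// Result of sizing a launch for one array (hypothetical helper type).
struct GridPlan {
    std::int64_t cells;   // number of array elements
    std::int64_t blocks;  // blocks needed at one thread per element
    std::int64_t side;    // side of the smallest square 2D grid that holds them
};

inline GridPlan plan_grid(std::int64_t array_bytes,
                          std::int64_t bytes_per_cell,
                          std::int64_t threads_per_block) {
    assert(array_bytes > 0 && bytes_per_cell > 0 && threads_per_block > 0);
    GridPlan p;
    p.cells = array_bytes / bytes_per_cell;
    // Round up so a partially filled last block is still counted.
    p.blocks = (p.cells + threads_per_block - 1) / threads_per_block;
    // Smallest side with side * side >= blocks.
    p.side = 1;
    while (p.side * p.side < p.blocks) {
        ++p.side;
    }
    return p;
}
```

Taking 200MB as 200 * 1024 * 1024 bytes, plan_grid gives 52,428,800 cells, 102,400 blocks and a side of 320. That block count is above the 65535 limit for a single grid dimension, which is why a 2D grid is needed here.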

In case it helps you here is some simpler code from a test program I was running recently (Not related to above example)

#define ThreadsPerBlock 256

#define COLUMNS 20192

#define ROWS 18028

int BlocksNeeded = (COLUMNS + ThreadsPerBlock - 1) / ThreadsPerBlock;

dim3 dimGrid( BlocksNeeded );

dim3 dimBlock( ThreadsPerBlock );

I don't think there is a gridIdx, but you can calculate your own global thread index as

x = blockIdx.x * blockDim.x + threadIdx.x;
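Because the block count is rounded up, the last block usually has a few threads whose index falls past the end of the data, so a kernel would normally guard with something like if (x < COLUMNS). The sketch below emulates a 1D launch on the host in plain C++ to count those threads. blocks_needed, Coverage and emulate_launch are hypothetical names, and the loops only stand in for what the GPU does in parallel.

```cpp
#include <cassert>

// Ceiling division: number of blocks needed to cover n elements.
inline int blocks_needed(int n, int threads_per_block) {
    assert(n >= 0 && threads_per_block > 0);
    return (n + threads_per_block - 1) / threads_per_block;
}

struct Coverage {
    int active;  // threads whose index maps to a real element
    int idle;    // surplus threads in the last block
};

// Host-side emulation of a 1D launch: each (block, thread) pair plays
// the role of (blockIdx.x, threadIdx.x) in a kernel.
inline Coverage emulate_launch(int n, int threads_per_block) {
    int grid = blocks_needed(n, threads_per_block);
    Coverage c{0, 0};
    for (int block = 0; block < grid; ++block) {
        for (int thread = 0; thread < threads_per_block; ++thread) {
            // Same formula as blockIdx.x * blockDim.x + threadIdx.x.
            int x = block * threads_per_block + thread;
            if (x < n) {
                ++c.active;
            } else {
                ++c.idle;
            }
        }
    }
    return c;
}
```

For COLUMNS = 20192 and 256 threads per block this gives 79 blocks, 20192 active threads and 32 idle ones.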

Here is another example for a 2D Grid (from the CUDA Programming Guide)

dim3 Db = dim3(16, 16); // dimensions of block

dim3 Dg = dim3((width + Db.x - 1) / Db.x, (height + Db.y - 1) / Db.y); // dimensions of grid

// and in the kernel

int x = blockIdx.x * blockDim.x + threadIdx.x;

int y = blockIdx.y * blockDim.y + threadIdx.y;
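The same idea in 2D, again emulated on the host rather than run on the GPU. This is a minimal sketch: emulate_2d_launch is a hypothetical name, the row-major index y * width + x is an assumption about how the data is laid out, and the bounds check mirrors the guard a kernel would need when width or height is not a multiple of the block size.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns, for every cell (x, y) stored at y * width + x, how many
// emulated threads mapped to it. A correct launch touches each cell once.
inline std::vector<int> emulate_2d_launch(int width, int height,
                                          int block_x, int block_y) {
    assert(width > 0 && height > 0 && block_x > 0 && block_y > 0);
    int grid_x = (width + block_x - 1) / block_x;   // like Dg.x
    int grid_y = (height + block_y - 1) / block_y;  // like Dg.y
    std::vector<int> visits(static_cast<std::size_t>(width) * height, 0);
    for (int by = 0; by < grid_y; ++by) {
        for (int bx = 0; bx < grid_x; ++bx) {
            for (int ty = 0; ty < block_y; ++ty) {
                for (int tx = 0; tx < block_x; ++tx) {
                    int x = bx * block_x + tx;  // blockIdx.x * blockDim.x + threadIdx.x
                    int y = by * block_y + ty;  // blockIdx.y * blockDim.y + threadIdx.y
                    if (x < width && y < height) {
                        ++visits[static_cast<std::size_t>(y) * width + x];
                    }
                }
            }
        }
    }
    return visits;
}
```

For a 100 x 37 array with 16 x 16 blocks the grid is 7 x 3, and every one of the 3700 cells is visited exactly once.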

There’s no gridIdx as there’s only one grid of blocks. You can think it’s there but it’s always 0 :D