How to maximize blocks and grid for Jetson Xavier

How do I determine the maximum block and grid sizes I can use on Jetson Xavier? Thanks in advance!

Example
dimBlock = dim3(32, 32);
int yBlocks = eglFrame.height / dimBlock.y + ((eglFrame.height % dimBlock.y) == 0 ? 0 : 1);
int xBlocks = eglFrame.width / dimBlock.x + ((eglFrame.width % dimBlock.x) == 0 ? 0 : 1);
dimGrid = dim3(xBlocks, yBlocks);

Hi,

You can get this information from the deviceQuery sample:

$ ./deviceQuery
./deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "Xavier"
  CUDA Driver Version / Runtime Version          11.4 / 11.4
  CUDA Capability Major/Minor version number:    7.2
  ...
  Maximum number of threads per multiprocessor:  2048
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
...
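
As a rough host-side illustration of how the grid limits in the listing above apply, the sketch below computes how many blocks are needed to cover a frame (using the same ceiling division as the snippet in the question) and compares the result with the listed maximum grid dimensions. The `Dim2` struct, the helper names, and the 1920x1080 frame size mentioned afterwards are assumptions for illustration only; none of them come from the sample code.

```cpp
// Hypothetical stand-in for the x/y fields of CUDA's dim3,
// so this sketch compiles without any CUDA headers.
struct Dim2 {
    unsigned x;
    unsigned y;
};

// Grid limits copied by hand from the deviceQuery listing above.
const unsigned kMaxGridX = 2147483647u;
const unsigned kMaxGridY = 65535u;

// Ceiling division: number of blocks of size `block` needed to cover `extent`.
// Assumes block > 0.
unsigned ceilDiv(unsigned extent, unsigned block) {
    return extent / block + ((extent % block) == 0 ? 0u : 1u);
}

// Grid that covers a width x height frame with the given block size.
Dim2 gridFor(unsigned width, unsigned height, Dim2 block) {
    Dim2 grid;
    grid.x = ceilDiv(width, block.x);
    grid.y = ceilDiv(height, block.y);
    return grid;
}

// True if the grid is non-empty and within the listed x/y limits.
bool gridWithinLimits(Dim2 grid) {
    return grid.x >= 1 && grid.x <= kMaxGridX &&
           grid.y >= 1 && grid.y <= kMaxGridY;
}
```

For an assumed 1920x1080 frame and a 32x32 block, `gridFor` gives a 60 x 34 grid, which is far below the listed grid limits. For frame sizes like this, the grid limit is unlikely to be the constraint.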

Thanks.

The thread block size that works for me is 32,32. What else would need to change if I change these values? I get this error if I use anything bigger (64,64, for example):
Cuda failure nvsample_cudaprocess.cu:945: invalid configuration argument
But I believe this is a side effect. How does this affect the allowed grid size? Here is my code snippet:

dimBlock = dim3(32, 32); // this works fine; dim3(64, 64) fails
int yBlocks = eglFrame.height / dimBlock.y + ((eglFrame.height % dimBlock.y) == 0 ? 0 : 1);
int xBlocks = eglFrame.width / dimBlock.x + ((eglFrame.width % dimBlock.x) == 0 ? 0 : 1);
dimGrid = dim3(xBlocks, yBlocks);

Hi,

Based on the deviceQuery output, the maximum number of threads per block is 1024.
A block with dim3(64, 64) has 64 x 64 = 4096 threads, which is over that limit. That is why the launch fails with "invalid configuration argument".
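
The arithmetic above can be captured as a small host-side check. This is only a minimal sketch and not part of nvsample_cudaprocess.cu: `BlockDim`, `threadsPerBlock`, and `blockIsValid` are hypothetical names, and the limits are hard-coded from the deviceQuery output earlier in this thread rather than queried at run time.

```cpp
// Hypothetical stand-in for CUDA's dim3, so this sketch compiles without CUDA headers.
struct BlockDim {
    unsigned x;
    unsigned y;
    unsigned z;
};

// Limits copied by hand from the deviceQuery output for Xavier in this thread.
const unsigned kMaxThreadsPerBlock = 1024;
const unsigned kMaxBlockX = 1024;
const unsigned kMaxBlockY = 1024;
const unsigned kMaxBlockZ = 64;

// Total threads in one block: the product of all three dimensions.
unsigned long long threadsPerBlock(BlockDim b) {
    return 1ULL * b.x * b.y * b.z;
}

// A block is valid only if every dimension is within its own limit
// AND the product of the dimensions is within the per-block thread limit.
bool blockIsValid(BlockDim b) {
    return b.x >= 1 && b.y >= 1 && b.z >= 1 &&
           b.x <= kMaxBlockX && b.y <= kMaxBlockY && b.z <= kMaxBlockZ &&
           threadsPerBlock(b) <= kMaxThreadsPerBlock;
}
```

Under these assumptions, a 32 x 32 x 1 block passes the check (1024 threads) and a 64 x 64 x 1 block does not (4096 threads), even though 64 is well within the per-dimension limits. In real code the limits would normally be read from `cudaGetDeviceProperties` (fields `maxThreadsPerBlock`, `maxThreadsDim`, and `maxGridSize`) instead of being hard-coded.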

Thanks.

Hi,

Got it, thanks.

Tom
