About -Minfo

After compiling with -Minfo, I get this output: #pragma acc for parallel, vector(256) /* blockIdx.x threadIdx.x */

In this output, what does 256 mean? How can I tell how many blocks the compiler allocates and how many threads it allocates per block?

What does 256 mean? How can I tell how many blocks the compiler allocates and how many threads it allocates per block?

It’s the vector width, which translates into CUDA as the thread block size. “parallel” translates into CUDA as the blocks. So in this case you have a 1-D grid of 1-D blocks, each containing 256 threads. Since “parallel” does not have a width, the number of blocks launched is determined dynamically at run time, depending on the size of the loop.
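As a rough illustration of how the block count could follow from the loop size, here is a minimal sketch in C. It assumes each thread handles exactly one loop iteration, which may not match the schedule the compiler's runtime actually picks; `blocks_needed` is a hypothetical helper, not part of any PGI or CUDA API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of thread blocks needed to cover n loop
   iterations when each block holds vector_width threads.
   Assumes one iteration per thread (ceiling division). */
size_t blocks_needed(size_t n, size_t vector_width)
{
    assert(vector_width > 0);
    return (n + vector_width - 1) / vector_width;
}
```

Under that assumption, a 1000-iteration loop with vector(256) would need 4 blocks, with the last block only partly used.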

/* blockIdx.x threadIdx.x */

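This indicates the exact CUDA dimensions being used: here the loop is mapped onto the x dimension of the block index and the x dimension of the thread index.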
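To make the mapping concrete, the sketch below shows the usual way a loop index is derived from these two CUDA dimensions. It is written in plain C so it can be checked on the host, and it assumes one loop iteration per thread; `global_index` is a hypothetical name and its parameters mirror CUDA's blockIdx.x, threadIdx.x and blockDim.x.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: loop iteration handled by one thread, assuming
   the common one-iteration-per-thread mapping.
   block_dim_x is the vector width (256 in the -Minfo output above). */
size_t global_index(size_t block_idx_x, size_t thread_idx_x, size_t block_dim_x)
{
    assert(thread_idx_x < block_dim_x);
    return block_idx_x * block_dim_x + thread_idx_x;
}
```

For example, with a block size of 256, thread 231 of block 3 would handle iteration 999.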

Hope this helps,
Mat

Thanks a lot.