I have several questions related to pgc++13.2. Suggestions are very welcome!
- I found that pgc++13.2 changes the resource configuration (e.g. number of blocks, number of threads) of OpenACC-annotated code at run time.
Feedback from the compiler shows that it accepted the resource configuration I requested, grid(200,100) block(32,16,1):
26, #pragma acc loop gang(100), vector(16) /* blockIdx.y threadIdx.y */
28, #pragma acc loop gang(200), vector(32) /* blockIdx.x threadIdx.x */
However, when I profiled the program with nvprof, the resource configuration had changed to grid(256,512) block(32,16,1). Does pgc++13.2 notify users of this change?
9e+09s 0ns (256 512 1) (32 16 1) 39 0B 0B - - 0 1 2 _Z8t1_f_acciiiPPdS0_S0_S0_S0_S0__30_ gpu
On what criteria does the compiler decide the appropriate resource configuration?
Which flags produce the following feedback from the compiler? Also, please tell me how the occupancy is calculated at compile time.
CC 2.0 : 26 registers; 8 shared, 92 constant, 0 local memory bytes; 33% occupancy
Thank you very much,