The CUDA Visual Profiler (CVP) reports several parameters. I would like to understand what some of them say about the relative strengths and weaknesses of my code. The attached Excel sheets are profiler output for a Sobel edge detection code that I wrote in three versions:
shared memory only, using int
texture and shared memory, using unsigned char
texture and shared memory, using uchar3
The three Excel sheets are attached. I would like to judge which version performs best, but I do not know which parameters should form the basis for that judgement. Primarily, I would like to know how each of the following parameters affects performance, that is, whether a lower or a higher value is better:
1.) memory transfer size
2.) warp serialize
3.) Occupancy
4.) global mem overall throughput
5.) gld efficiency
6.) gst efficiency
7.) Instruction Throughput
Note: The attached Excel sheets show only the first three of the above parameters. The other four appear in the CVP summary table and were not saved in the Excel sheets. Analysis of parameters such as grid size, block size, threads used and sm cta launched is also welcome, but my main questions concern the seven listed above.
So, I would like to know what a particular value of a particular parameter signifies. Please understand that I am not asking what each parameter means; I know the meaning of all of them. I would like to know how the value of each parameter affects the performance of the code, which parameters I should look at, and how to use them to compare the three versions.
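To make concrete the kind of reasoning I am after, here is a rough sketch of how I understand the occupancy figure to be derived: active warps per multiprocessor divided by the maximum warps per multiprocessor. The per-SM limits below are assumed values for illustration only and are not read from my device. The function name `occupancy` is my own, and the sketch ignores register usage, which can also limit the number of resident blocks.

```python
import math

# Assumed per-SM limits, for illustration only (not read from my device).
WARP_SIZE = 32
MAX_WARPS_PER_SM = 32
MAX_BLOCKS_PER_SM = 8
SHARED_MEM_PER_SM = 16384  # bytes


def occupancy(threads_per_block, shared_mem_per_block):
    """Return active warps per SM divided by the maximum warps per SM.

    Only the warp, block-count and shared-memory limits are considered;
    register pressure is deliberately left out of this sketch.
    """
    warps_per_block = math.ceil(threads_per_block / WARP_SIZE)

    # How many blocks fit on one SM under each limit.
    by_warps = MAX_WARPS_PER_SM // warps_per_block
    if shared_mem_per_block > 0:
        by_shared = SHARED_MEM_PER_SM // shared_mem_per_block
    else:
        by_shared = MAX_BLOCKS_PER_SM

    blocks = min(MAX_BLOCKS_PER_SM, by_warps, by_shared)
    return blocks * warps_per_block / MAX_WARPS_PER_SM


if __name__ == "__main__":
    # Hypothetical launch: 16x16 threads per block, 18x18 tile of 4-byte ints.
    print(occupancy(16 * 16, 18 * 18 * 4))
```

What I cannot tell from a number like this is whether a higher value always means a faster kernel, or only up to some point. That is the kind of answer I am looking for, for each of the seven parameters.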
CVP_Sobel.rar (1.76 KB)