I have an executable that I use to extract the runtime data of all the CUDA kernels involved in my application.
Looking at the data, I see that there are many of your internal kernels (for example pooling_fw_4d_kernel, offset_vector_kernel(float*, float*, float*, long, long, long, bool, int, int), etc.).
My question is: if my own kernel, say “myKernel”, is invoked at some point in the application and its start and end times (in ns) can be found, do those start and end times take into account the internal kernels being used?
Also, in the sqlite report, each of my own kernels appears twice in the StringIds table when I try to match the kernel name against the shortName string id from the CUPTI_ACTIVITY_KIND_KERNEL table. I see something like this:
In the table StringIds:
id || value
789 || myKernel(int* , double const*)
790 || myKernel
From the snippet above, which entry should I use to get the exact runtime of the kernel that I am looking for?
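A minimal sketch of the lookup described above, in Python with the standard sqlite3 module. It assumes that CUPTI_ACTIVITY_KIND_KERNEL has start, end, shortName and demangledName columns, with the two name columns holding ids into StringIds. To stay self-contained it builds a small in-memory stand-in for the report instead of opening a real export; the timestamps are made up, and build_mock_report and kernel_launches are hypothetical helper names.

```python
import sqlite3


def build_mock_report():
    """Create an in-memory stand-in for the exported report (invented rows)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE StringIds (id INTEGER PRIMARY KEY, value TEXT);
        CREATE TABLE CUPTI_ACTIVITY_KIND_KERNEL (
            start INTEGER,          -- assumed: kernel start time in ns
            "end" INTEGER,          -- assumed: kernel end time in ns
            shortName INTEGER,      -- id into StringIds (bare kernel name)
            demangledName INTEGER   -- id into StringIds (name with signature)
        );
        INSERT INTO StringIds VALUES
            (789, 'myKernel(int* , double const*)'),
            (790, 'myKernel'),
            (801, 'pooling_fw_4d_kernel');
        INSERT INTO CUPTI_ACTIVITY_KIND_KERNEL VALUES
            (1000, 1500, 790, 789),
            (2000, 2300, 801, 801),
            (4000, 4700, 790, 789);
        """
    )
    return conn


# Resolve both name ids to text and compute the duration of each launch.
QUERY = """
SELECT sn.value          AS short_name,
       dn.value          AS demangled_name,
       k.start           AS start_ns,
       k."end"           AS end_ns,
       k."end" - k.start AS duration_ns
FROM CUPTI_ACTIVITY_KIND_KERNEL AS k
JOIN StringIds AS sn ON sn.id = k.shortName
JOIN StringIds AS dn ON dn.id = k.demangledName
WHERE sn.value = ?
ORDER BY k.start
"""


def kernel_launches(conn, short_name):
    """Return one row per launch of the kernel with the given short name."""
    return conn.execute(QUERY, (short_name,)).fetchall()


if __name__ == "__main__":
    report = build_mock_report()
    for row in kernel_launches(report, "myKernel"):
        print(row)
```

In this mock, ids 789 and 790 are referenced by the same kernel rows through different columns, so filtering on either string selects the same launches and the same start and end values.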
If I am looking for the interval from the point where myKernel started executing, including all the internal kernel calls that cuDNN or cuBLAS might have made, to the point where the kernel and all activity related to it stopped (the end time of that kernel), which entry should I be using?
Please point me to the correct resource for this!