Profiling GPU at source code level

Hi

Does anyone know of a profiling tool for CUDA that shows GPU execution time at the source code level, i.e. so that you can see the time spent in each function within a complex kernel?

Many thanks

On Linux: NVIDIA Visual Profiler (nvvp), in combination with the -lineinfo compile switch of nvcc.

And on Windows?

Nsight Visual Studio Edition for Windows has built-in profiling features.

It seems the NVIDIA Visual Profiler is in fact a cross-platform tool and is also available on Windows.

Hi sir,
I am not able to see the hotspots or the source code (along with its path) in the kernel profile under unguided analysis in nvvp.
The steps I followed were:

  1. nvcc wrap_number_increased.cu -lineinfo -o wrap_number_increased
  2. nvprof -o wrap_number_increased.nvvp ./wrap_number_increased
  3. nvvp wrap_number_increased.nvvp

The code in wrap_number_increased.cu is shown below:

#include <cuda.h>
#include <stdio.h>

__global__ void test()
{
    int i, j;
    for (i = 0; i < 10; i++)
        for (j = 0; j < 100; j++)
        {
            // printf("Hello gpu\n");
        }
}

int main()
{
    printf("Hello from the CPU\n");
    test<<<2, 32>>>();
    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess) {
        printf("CUDA Error: %s\n", cudaGetErrorString(err));
    }
    printf("next is cudaDeviceSynchronize()\n");
    cudaDeviceSynchronize();
    return 0;
}

In the nvvp GUI, I selected the kernel “test” in the timeline, then went to unguided analysis, selected part of the “test” kernel's timeline, and chose Kernel Profile - PC Sampling.
But it reports:

Unable to create source/assembly view for kernel Profile-PC sampling analysis.

Sir, have you faced this kind of issue? Could you kindly guide me in solving it?
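
One thing that may be worth ruling out (an assumption, not something confirmed in this thread): the loops in the posted kernel have no side effects, so nvcc is free to remove them, and the kernel may then finish too quickly for PC sampling to attribute anything to source lines. Below is a minimal sketch of a kernel that does observable work. The names busy_test and run_busy_test and all parameters are hypothetical and not from the original post, and this may not be the cause of the error message above.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: each thread accumulates a value in nested loops and
// stores it to global memory, so the loops cannot be discarded as dead code.
__global__ void busy_test(float *out, int outer, int inner)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    float acc = 0.0f;
    for (int i = 0; i < outer; i++) {
        for (int j = 0; j < inner; j++) {
            acc += sinf(acc + (float)(i * inner + j + tid));
        }
    }
    out[tid] = acc;  // observable side effect keeps the loops alive
}

// Hypothetical host helper: allocates one float per thread, launches
// busy_test, waits for it, and returns 0 on success or 1 on any CUDA error.
int run_busy_test(int blocks, int threads, int outer, int inner)
{
    float *d_out = nullptr;
    size_t bytes = (size_t)blocks * threads * sizeof(float);
    if (cudaMalloc(&d_out, bytes) != cudaSuccess) {
        return 1;
    }

    busy_test<<<blocks, threads>>>(d_out, outer, inner);

    // Check the launch first, then wait for the kernel to finish.
    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess) {
        err = cudaDeviceSynchronize();
    }
    if (err != cudaSuccess) {
        printf("CUDA Error: %s\n", cudaGetErrorString(err));
    }

    cudaFree(d_out);
    return err == cudaSuccess ? 0 : 1;
}
```

To try it, call run_busy_test(2, 32, 1000, 1000) from main, build with nvcc and -lineinfo as in step 1 above, and profile it the same way. If the source view still cannot be created with a kernel like this, the empty loop body is probably not the problem.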