I can't see the GPU memory throughput details for my kernel

Here is my profiling result:

In the picture above, a throughput value is shown for ‘Memcpy’, but no throughput is shown for the kernel.

What do I have to do to get the throughput of the kernel?

Also, does ‘throughput’ here mean global memory throughput?

And is global memory throughput the same thing as the effective global memory bandwidth described in https://docs.nvidia.com/cuda/cuda-c-best-practices-guide/index.html#effective-bandwidth-calculation ?
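
For reference, this is how I understand the effective bandwidth calculation in that guide: add the bytes read and the bytes written by the kernel, divide by 10^9, then divide by the elapsed time in seconds. Below is a minimal host-side sketch of that arithmetic. The function name `effective_bandwidth_gbs` and its parameters are my own and not part of any CUDA API, and the elapsed time is assumed to be measured separately (for example with CUDA events).

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not a CUDA API): effective bandwidth in GB/s,
// computed as ((bytes_read + bytes_written) / 1e9) / seconds.
double effective_bandwidth_gbs(std::size_t bytes_read,
                               std::size_t bytes_written,
                               double seconds) {
    assert(seconds > 0.0);  // elapsed kernel time must be positive
    const double total_bytes =
        static_cast<double>(bytes_read) + static_cast<double>(bytes_written);
    return (total_bytes / 1e9) / seconds;
}
```

For example, a kernel that copies a 2048 x 2048 matrix of 4-byte floats reads 2048 * 2048 * 4 bytes and writes the same amount, so I would call `effective_bandwidth_gbs(2048 * 2048 * 4, 2048 * 2048 * 4, elapsed_seconds)` with the measured kernel time.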