How do I find out the GPU time for executing a particular block of code?

I am using Python. How do I find out the time taken for neural network inference?

Thank you

Assuming your question has to do with CUDA programming as suggested by the forum title, you would use the functions described in Appendix B.11 of the Programming Guide.

In ordinary Python, I would just do host-based timing.
The only caveat is to make sure that whatever CUDA processing you have launched from Python is complete before you stop the timer; the specific approach depends on exactly what you are doing. For example, Numba CUDA Python would be different from TensorFlow.

For example, in Numba CUDA Python, I would do something like this:

https://stackoverflow.com/questions/45660373/why-is-cuda-jit-python-program-faster-than-its-cuda-c-equivalent/45660697#45660697
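The host-based timing pattern from that answer can be sketched as follows. This is a minimal illustration, not Numba-specific code: `launch_gpu_work` is a hypothetical stand-in for a kernel launch, and the `cuda.synchronize()` call that Numba would require is shown as a comment, since kernel launches are asynchronous and the clock must not be stopped before the GPU finishes.

```python
import time

def launch_gpu_work():
    # Hypothetical stand-in for GPU work; with Numba this would be a
    # @cuda.jit kernel launch, e.g. my_kernel[blocks, threads](d_data)
    pass

start = time.perf_counter()
launch_gpu_work()
# With Numba CUDA, synchronize before stopping the clock, because the
# launch returns immediately while the kernel is still running:
# cuda.synchronize()
elapsed = time.perf_counter() - start
print(f"elapsed: {elapsed:.6f} s")
```

Without the synchronize step you would only measure the launch overhead, not the kernel itself.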


Thank you for replying.
I am using Keras with a TF backend; in that case, how do I find out the time taken by the GPU for model inference?

I think a Google search can give you a pretty good starting point. For example, this should generally work:

https://stackoverflow.com/questions/49068469/how-to-measure-execution-time-for-prediction-per-image-keras
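The approach in that answer boils down to timing `model.predict()` with a wall clock, after a warm-up call. A minimal sketch, where `FakeModel` is a hypothetical stand-in for a compiled Keras model so the pattern is self-contained:

```python
import time

class FakeModel:
    """Hypothetical stand-in for a compiled Keras model."""
    def predict(self, x):
        return [0] * len(x)

model = FakeModel()
batch = [[0.0] * 224]  # stand-in for one preprocessed image

# Warm-up: the first predict() call typically includes one-time setup
# cost (graph construction, CUDA kernel compilation) and should not be
# counted toward steady-state inference time.
model.predict(batch)

n = 10
start = time.perf_counter()
for _ in range(n):
    model.predict(batch)
per_image = (time.perf_counter() - start) / n
print(f"mean inference time: {per_image * 1000:.3f} ms")
```

Averaging over several iterations smooths out timer jitter and OS scheduling noise.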

Would this give me the GPU time? (and not CPU?)

It will give you the time it takes to execute a particular block of code, including any GPU activity that code triggers, which seemed to be what you were asking.

You might wish to explore GPU profilers. They are documented here:

https://docs.nvidia.com/cuda/profiler-users-guide/index.html
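If you want GPU-only time (kernel durations, memory transfers) rather than host wall-clock time, a profiler will break that out for you. As one hedged example of how such a run might look, wrapping a script (here a hypothetical `infer.py`) with the command-line profiler described in that guide:

```
nvprof python infer.py
```

The profiler output lists each CUDA kernel and memcpy with its GPU execution time, separate from any CPU-side overhead.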