I am processing an image on the GPU using CUDA streams. I have divided the image into three smaller segments, and I transfer each segment to the GPU on its own CUDA stream. Each of the three streams then launches the image-processing kernel on its segment and copies the processed data back to the CPU.
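To make the setup concrete, here is a minimal sketch of the scheme described above. It is not my actual code: the kernel `processSegment` (a simple pixel inversion), the function name `runThreeStreamPipeline`, the single-channel 8-bit format, and the 1920x1080 size are all assumptions chosen only for illustration. Error checking is omitted for brevity.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in for the real image-processing kernel: inverts 8-bit pixels.
__global__ void processSegment(unsigned char *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = 255 - data[i];
}

// Splits the image into three segments and processes each on its own stream.
// Returns the number of pixels that do not match the expected result.
int runThreeStreamPipeline(int width, int height)
{
    const int kStreams = 3;
    const int total = width * height;       // assumed: one byte per pixel
    const int segment = total / kStreams;

    unsigned char *hImage = nullptr, *dImage = nullptr;
    cudaMallocHost(&hImage, total);         // pinned host memory, so async copies can overlap
    cudaMalloc(&dImage, total);
    for (int i = 0; i < total; ++i) hImage[i] = (unsigned char)(i % 256);

    cudaStream_t streams[kStreams];
    for (int s = 0; s < kStreams; ++s) cudaStreamCreate(&streams[s]);

    for (int s = 0; s < kStreams; ++s) {
        int offset = s * segment;
        // The last segment also takes any remainder pixels.
        int count = (s == kStreams - 1) ? total - offset : segment;
        if (count == 0) continue;
        // Copy in, process, copy out: all three steps are queued on the same stream.
        cudaMemcpyAsync(dImage + offset, hImage + offset, count,
                        cudaMemcpyHostToDevice, streams[s]);
        processSegment<<<(count + 255) / 256, 256, 0, streams[s]>>>(dImage + offset, count);
        cudaMemcpyAsync(hImage + offset, dImage + offset, count,
                        cudaMemcpyDeviceToHost, streams[s]);
    }
    cudaDeviceSynchronize();                // wait for all three streams to finish

    int mismatches = 0;
    for (int i = 0; i < total; ++i)
        if (hImage[i] != 255 - (i % 256)) ++mismatches;

    for (int s = 0; s < kStreams; ++s) cudaStreamDestroy(streams[s]);
    cudaFree(dImage);
    cudaFreeHost(hImage);
    return mismatches;
}

int main()
{
    int mismatches = runThreeStreamPipeline(1920, 1080);
    printf("mismatched pixels: %d\n", mismatches);
    return mismatches == 0 ? 0 : 1;
}
```

The sketch uses pinned host memory (`cudaMallocHost`) because `cudaMemcpyAsync` from pageable memory generally cannot overlap with kernel execution, which is the whole point of splitting the work across streams.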
I want to find out whether this streaming scheme is really helping me. To do that, I would like to see a timeline of each event (the copies and kernel launches in each stream). I think Nsight can help with that, but I am new to CUDA and not sure which tool I should use. Please advise.
I need something like the following: