Timing for NVSHMEM program

Hi all,

How can I time an NVSHMEM program?
Since there are multiple PEs, putting a timer on each PE returns multiple time measurements.

Should I just take MAX(timePE1, timePE2, ..., timePEn) to get the overall kernel execution time for NVSHMEM?

Thanks

Sure, you can do that: report the max across PEs. Or just report the PE 0 measurement. It depends on what you are trying to measure…
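
For what it's worth, here is a minimal sketch of the "max across PEs" approach, assuming one GPU per PE and CUDA events for the per-PE timing. `work_kernel`, the buffer size, and the launch configuration are hypothetical placeholders, not anything from the original question. Each PE times its own launch, and the per-PE times are then combined with a max reduction over NVSHMEM_TEAM_WORLD.

```cuda
#include <cstdio>
#include <cuda_runtime.h>
#include <nvshmem.h>
#include <nvshmemx.h>

// Hypothetical placeholder for the kernel you actually want to time.
__global__ void work_kernel(float *buf, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) buf[i] = buf[i] * 2.0f + 1.0f;
}

int main() {
    nvshmem_init();
    int mype = nvshmem_my_pe();
    // Assumption: one GPU per PE, selected by the PE's rank within its node.
    cudaSetDevice(nvshmem_team_my_pe(NVSHMEMX_TEAM_NODE));

    const int n = 1 << 20;  // arbitrary size for illustration
    // Reduction source and destination must live on the symmetric heap.
    float *buf     = (float *)nvshmem_malloc(n * sizeof(float));
    float *t_local = (float *)nvshmem_malloc(sizeof(float));
    float *t_max   = (float *)nvshmem_malloc(sizeof(float));
    cudaMemset(buf, 0, n * sizeof(float));

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    // Line up all PEs so every timer starts from roughly the same point.
    cudaDeviceSynchronize();
    nvshmem_barrier_all();

    cudaEventRecord(start, 0);
    work_kernel<<<(n + 255) / 256, 256>>>(buf, n);
    cudaEventRecord(stop, 0);
    cudaEventSynchronize(stop);

    float ms = 0.0f;  // this PE's elapsed time in milliseconds
    cudaEventElapsedTime(&ms, start, stop);

    // Combine the per-PE times: every PE ends up with the maximum in t_max.
    cudaMemcpy(t_local, &ms, sizeof(float), cudaMemcpyHostToDevice);
    nvshmem_float_max_reduce(NVSHMEM_TEAM_WORLD, t_max, t_local, 1);
    cudaDeviceSynchronize();  // extra sync, kept to be safe

    float max_ms = 0.0f;
    cudaMemcpy(&max_ms, t_max, sizeof(float), cudaMemcpyDeviceToHost);
    if (mype == 0) {
        printf("PE 0 time: %.3f ms, max over all PEs: %.3f ms\n", ms, max_ms);
    }

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    nvshmem_free(buf);
    nvshmem_free(t_local);
    nvshmem_free(t_max);
    nvshmem_finalize();
    return 0;
}
```

Printing both numbers on PE 0 shows how far the PE 0 measurement is from the slowest PE. If the kernel itself uses NVSHMEM device-side synchronization, it would probably need to be launched with nvshmemx_collective_launch instead of the plain <<<...>>> launch used here.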