Hi
I want to measure the time of ONLY the inference on the TX2. How can I improve my function to do that? Right now this is what I am measuring:
Or is that not possible because of the way GPUs work?
THIS IS WHERE I HAVE SET THE TIMER
THIS IS THE “do_inference” FUNCTION
Thank you
Hi @Aizzaac, you would put the timing around the context.execute() call. That is what actually runs the inference with TensorRT.
@dusty_nv
Can you give me an example, please?
The function is in Inference.py, and the timer is in Timer.py.
I haven’t used that exact timer before, but I think you would want to add import time to the top of your Inference.py. And then pseudocode for your do_inference() function:
cuda.memcpy_htod_async(...)   # copy inputs to the GPU
start = time.perf_counter()
context.execute(...)          # synchronous call: returns once inference is done
inference_time = time.perf_counter() - start
print("INFERENCE TIME: {:.3f} ms".format(inference_time * 1000))
cuda.memcpy_dtoh_async(...)   # copy outputs back to the host
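The timing pattern itself can be sanity-checked without a GPU. A minimal sketch, where slow_op() is a hypothetical stand-in for the context.execute() call:

```python
import time

def slow_op():
    # Hypothetical stand-in for context.execute();
    # any CPU-bound work demonstrates the timing pattern.
    return sum(i * i for i in range(100_000))

start = time.perf_counter()
result = slow_op()
elapsed_ms = (time.perf_counter() - start) * 1000
print("op took {:.3f} ms".format(elapsed_ms))
```

Note that context.execute() is synchronous, so wall-clock timing around it reflects the inference. If you later switch to context.execute_async(), the call only launches the work on a CUDA stream, so you would need stream.synchronize() before stopping the timer — otherwise you measure the launch, not the inference.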