I’m running a segmentation network whose output is an 8x3x224x224 tensor.
Copying the result to the CPU takes ~500 ms, which seems excessive (this is the time spent in the .cpu() call, as measured by cProfile).
Is there a way to reduce this?
Thanks
Hi,
Have you set the device to its maximum performance mode?
Doing so should improve data transfer performance:
$ sudo nvpmodel -m 0
$ sudo jetson_clocks
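One more thing that may be worth checking: CUDA kernels launch asynchronously, so .cpu() has to wait for all queued GPU work to finish before it can copy anything. The ~500 ms that cProfile attributes to .cpu() may therefore include the forward pass itself, not just the transfer (an 8x3x224x224 tensor is only about 4.8 MB if it is float32). Below is a minimal sketch for timing the two stages separately. `timed_ms` and `profile_inference` are hypothetical helper names, not part of PyTorch, and `model` and `batch` stand in for your own network and input.

```python
import time

try:
    import torch  # only needed by profile_inference below
except ImportError:
    torch = None


def timed_ms(fn, sync=None):
    """Run fn() and return (result, elapsed time in milliseconds).

    If sync is given, it is called before the clock starts and again before
    it stops, so work queued earlier is not charged to fn and work queued
    by fn is fully counted.
    """
    if sync is not None:
        sync()
    start = time.perf_counter()
    result = fn()
    if sync is not None:
        sync()
    return result, (time.perf_counter() - start) * 1000.0


def profile_inference(model, batch):
    """Time the forward pass and the device-to-host copy separately."""
    # Only synchronize when the input actually lives on the GPU.
    sync = torch.cuda.synchronize if batch.is_cuda else None
    with torch.no_grad():
        output, forward_ms = timed_ms(lambda: model(batch), sync)
        host_output, copy_ms = timed_ms(lambda: output.cpu(), sync)
    return host_output, forward_ms, copy_ms
```

If forward_ms accounts for most of the total, the copy is not the bottleneck and the time is being spent in the network itself.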
Thanks.