Simple benchmark

I am trying to demonstrate various compute advantages using arbitrary Python code on the Orin AGX Dev Kit. CPU and CUDA are easy and standard; however, I am having trouble locating a TPU-enabled Python library that will let me run some benchmarks. Any ideas on how this can be achieved?
(When I say TPU, I am referring to the 64 on-board Tensor Cores.)


Tensor Cores support IMMA (integer) and HMMA (half-precision) matrix multiply-accumulate operations.
You can deploy some jobs to them via the TensorRT or cuDNN libraries.

TensorRT: Support Matrix :: NVIDIA Deep Learning TensorRT Documentation
cuDNN: Developer Guide :: NVIDIA Deep Learning cuDNN Documentation
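
For a quick Python benchmark along these lines, one option is to go through a framework that calls cuDNN for you. The sketch below assumes PyTorch with CUDA support is installed on the Jetson; PyTorch is not mentioned in this thread, it is only used here because its CUDA convolutions are dispatched to cuDNN. It times the same convolution in FP32 and FP16, and the FP16 case is the one eligible for HMMA Tensor Core kernels. The helper names (`time_call`, `conv_flops`, `run_gpu_benchmark`) and the tensor sizes are made up for illustration. TF32 is turned off so that the FP32 run is a plain CUDA-core baseline.

```python
import time


def time_call(fn, sync=lambda: None, warmup=3, iters=10):
    """Return the mean wall-clock seconds per call of fn().

    sync is called before starting and before stopping the clock so that
    asynchronous GPU work is included (pass torch.cuda.synchronize).
    """
    for _ in range(warmup):
        fn()
    sync()
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    sync()
    return (time.perf_counter() - start) / iters


def conv_flops(n, c_in, c_out, h_out, w_out, k):
    """Floating-point operations for a dense k x k convolution (2 per multiply-add)."""
    return 2 * n * c_out * h_out * w_out * c_in * k * k


def run_gpu_benchmark():
    """Time an FP32 and an FP16 convolution on the GPU; return TFLOP/s per dtype."""
    try:
        import torch
        import torch.nn.functional as F
    except ImportError:
        print("PyTorch is not installed; skipping the GPU benchmark.")
        return None
    if not torch.cuda.is_available():
        print("No CUDA device visible; skipping the GPU benchmark.")
        return None

    # Assumption: disable TF32 so the FP32 run does not use Tensor Cores.
    torch.backends.cudnn.allow_tf32 = False

    # Arbitrary sizes; channel counts are multiples of 8, which favours
    # Tensor Core kernels for FP16.
    n, c, hw, k = 8, 64, 56, 3
    results = {}
    for name, dtype in (("fp32", torch.float32), ("fp16", torch.float16)):
        x = torch.randn(n, c, hw, hw, device="cuda", dtype=dtype)
        w = torch.randn(c, c, k, k, device="cuda", dtype=dtype)
        secs = time_call(lambda: F.conv2d(x, w, padding=1),
                         sync=torch.cuda.synchronize)
        # padding=1 with a 3x3 kernel keeps the output at hw x hw.
        results[name] = conv_flops(n, c, c, hw, hw, k) / secs / 1e12
        print(f"{name}: {secs * 1e3:.3f} ms per call, {results[name]:.2f} TFLOP/s")
    return results
```

Calling `run_gpu_benchmark()` prints one line per data type. Whether cuDNN actually selects a Tensor Core kernel is decided by its own heuristics, so a large FP16 speed-up is only a hint; a profiler such as Nsight Systems can confirm which kernels ran.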


Nice! I’ll give it a spin.
