I’m starting to experiment with the AGX Xavier, and I’d like to accelerate inference using the NVIDIA Deep Learning Accelerators (NVDLA). I understand that I must use TensorRT, and I know how to use the TF-TRT library to convert TensorFlow models into TensorRT.
However, how can I explicitly control where inference is computed (i.e., CPU, GPU, or NVDLA)? Are TensorRT models always executed on the NVDLAs?