I’m beginning to play around with the AGX Xavier. I’d like to start accelerating inferencing using the NVIDIA Deep Learning Accelerators. I understand that I must use TensorRT, and I understand how to use the TF-TRT library to convert TensorFlow models into TensorRT.
However, how can I explicitly control where the inference is computed (i.e., CPU, GPU or NVDLA)? Are TensorRT models always executed on NVDLAs?
You can choose the deployed hardware when converting the model into TensorRT engine.
The API looks like the following:
builder->setDefaultDeviceType( nvinfer1::DeviceType::kDLA );
If you don’t set it, the default device type is GPU rather than DLA, so a TensorRT engine does not run on the DLA unless you ask for it.
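For reference, here is a rough sketch of what a DLA setup could look like. It assumes a newer TensorRT release, where these setters live on nvinfer1::IBuilderConfig rather than on the builder itself as in the one-liner above. preferDla is a hypothetical helper name, and the stand-in declarations in the #else branch exist only so the snippet compiles on a machine without TensorRT; they are not the real API.

```cpp
#include <cstdint>

#if __has_include(<NvInfer.h>)
#include <NvInfer.h>  // real TensorRT declarations, when installed
#else
// Stand-in declarations so the sketch compiles without TensorRT installed.
// They mirror only the members used below and are NOT the real API.
namespace nvinfer1 {
enum class DeviceType : int32_t { kGPU, kDLA };
enum class BuilderFlag : int32_t { kFP16, kGPU_FALLBACK };
struct IBuilder {
    int32_t nbDlaCores{0};
    int32_t getNbDLACores() const { return nbDlaCores; }
};
struct IBuilderConfig {
    DeviceType deviceType{DeviceType::kGPU};
    int32_t dlaCore{-1};
    bool fp16{false};
    bool gpuFallback{false};
    void setDefaultDeviceType(DeviceType t) { deviceType = t; }
    void setDLACore(int32_t core) { dlaCore = core; }
    void setFlag(BuilderFlag f) {
        if (f == BuilderFlag::kFP16) fp16 = true;
        if (f == BuilderFlag::kGPU_FALLBACK) gpuFallback = true;
    }
};
}  // namespace nvinfer1
#endif

// Hypothetical helper: ask TensorRT to place layers on one DLA core.
// Returns false and leaves the config untouched (GPU default) if that
// core index does not exist on this device.
bool preferDla(const nvinfer1::IBuilder& builder,
               nvinfer1::IBuilderConfig& config,
               int32_t dlaCore) {
    if (dlaCore < 0 || dlaCore >= builder.getNbDLACores()) {
        return false;
    }
    config.setDefaultDeviceType(nvinfer1::DeviceType::kDLA);
    config.setDLACore(dlaCore);
    // The DLA runs FP16 or INT8, not FP32, so a reduced precision is enabled.
    config.setFlag(nvinfer1::BuilderFlag::kFP16);
    // Layers the DLA cannot run are allowed to fall back to the GPU.
    config.setFlag(nvinfer1::BuilderFlag::kGPU_FALLBACK);
    return true;
}
```

You would call preferDla(*builder, *config, 0) before building the engine. The AGX Xavier has two DLA cores, so indices 0 and 1 should be valid there. Without the GPU fallback flag, the build is expected to fail if the network contains a layer the DLA does not support.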
I enrolled in the “Optimization and Deployment of TensorFlow models with TensorRT” course. How do I choose what hardware to use with the TF-TRT converter?
Unfortunately, TF-TRT doesn’t support DLA.
You will need to use standalone TensorRT for deploying a model on the DLA.