Is there a TensorRT sample that can run entirely on DLA w/o fallback to GPU?


I’m looking for an example for object detection that can run entirely on the DLA.
I was following the instructions for using sampleSSD with the VGG-based model.
I’m able to compile and run the example both on an x86 PC with a GPU, and on the Jetson Xavier with JetPack 4.3.

When the example is run on the Xavier with the --useDLACore option, I get a message on the screen saying that some of the layers cannot run on the DLA and that a fallback to the GPU will take place instead.

  • Is there any TRT sample code that is known to run entirely on the DLA?
  • If not, can the GPU run other types of inference in parallel (say my software performs other deep learning tasks besides object detection)?
  • Is there any TRT sample code that demonstrates exercising both DLAs at the same time (say my software has more than one video channel to perform object detection on)?


Moving this to the Xavier forum so the Jetson team can take a look


1. You can remove the final prob layer of GoogleNet and then it can run on the DLA directly.

2. Sure.

3. Sorry that we don’t have a sample for DLA.
But you can modify our samples to use the DLA and assign a different DLA core to each of two applications.
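As a sketch of the two-application approach, the stock trtexec tool that ships with TensorRT can target each DLA core from a separate process. The model file name below is hypothetical; --allowGPUFallback is only needed if some layers are unsupported on the DLA:

```shell
# Launch one trtexec instance per DLA core (hypothetical prototxt name).
trtexec --deploy=googlenet_noprob.prototxt --output=loss3/classifier \
        --useDLACore=0 --allowGPUFallback &
trtexec --deploy=googlenet_noprob.prototxt --output=loss3/classifier \
        --useDLACore=1 --allowGPUFallback &
wait   # both cores now run inference concurrently
```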



Thank you for getting back to me.

I did try to run the GoogleNet example, but according to the documentation, even though the network is designed for classification and detection and was trained using ImageNet, the sample code performs inference on empty data (all zeros) and not actual images.

If I want to merge some of the code from sampleSSD into GoogleNet sample:

  1. What type of images should I use?
  2. Any specific dimensions that I should use?

In more general terms, if I remove the prob layer:

  1. How will it affect the outcome of the network?
  2. Will it still be able to perform its original purpose of classification and detection?
  3. Do I need to re-train the network?



If I remove the prob layer from the model and try to run the GoogleNet sample it crashes with segmentation fault.
It seems that simply removing the layer by itself is not enough.


In general, TensorRT takes an RGB float32 GPU buffer as input.
There is no limitation on the input image type.
But you will need the corresponding image decoder for pre-processing.

The dimensions depend on the network.
Please check the input layer of the model for details.

For example, GoogleNet takes 3x224x224 as input.
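As a minimal sketch of the pre-processing step (plain Python, no real image decoder), this converts a decoded H x W x 3 RGB image into the flat channel-major (CHW) float32 layout described above. The mean values are the commonly used ImageNet channel means, included here as an assumption:

```python
# Convert an H x W x 3 RGB image (nested lists of pixel tuples) into a
# flat CHW float32 buffer, subtracting a per-channel mean.
def to_chw_float(image, mean=(104.0, 117.0, 123.0)):
    h, w = len(image), len(image[0])
    buf = []
    for c in range(3):                  # channel-major (CHW) order
        for y in range(h):
            for x in range(w):
                buf.append(float(image[y][x][c]) - mean[c])
    return buf

# 2x2 dummy "image": every pixel is (R, G, B) = (10, 20, 30).
img = [[(10, 20, 30)] * 2 for _ in range(2)]
chw = to_chw_float(img)
# For GoogleNet the real input would be 3 x 224 x 224 -> 150528 values.
```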

The last operation is a softmax layer, which maps the network outputs into probabilities.
Softmax doesn't change their ordering, so the classification result will remain identical.
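A quick way to see this: softmax is strictly monotonic, so the index of the largest raw output is the same before and after the prob layer. A small self-contained check with made-up logits:

```python
import math

def softmax(logits):
    m = max(logits)                       # subtract max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    s = sum(exps)
    return [e / s for e in exps]

logits = [2.0, 5.0, 1.0, 3.5]             # hypothetical raw classifier outputs
probs = softmax(logits)

top_raw = max(range(len(logits)), key=lambda i: logits[i])
top_prob = max(range(len(probs)), key=lambda i: probs[i])
assert top_raw == top_prob                # same winning class either way
```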

If your use case is different from the ImageNet database, on which GoogleNet was trained, it's recommended to retrain the model.

You will need to update the output layer name to loss3/classifier, since the prob layer is removed.