DNN Samples are not working on host

Software Version
DRIVE OS Linux 5.2.0 and DriveWorks 3.5

Target Operating System
Linux

Hardware Platform
NVIDIA DRIVE™ AGX Pegasus DevKit (E3550)

SDK Manager Version
1.4.0.7363

Host Machine Version
native Ubuntu 18.04 with RTX 5000

Hello, I upgraded my software from DriveWorks 3.0 to 3.5. In version 3.0, the DNN samples worked, and so did my custom implementations with .bin files. Now I have a problem:
none of the provided DNN samples work. Both my models and the official examples throw an exception:
“Driveworks exception thrown: DW_INTERNAL_ERROR: DNN: Unable to load model.”

What could be the problem here? How can we solve this?

Best regards.

Hi @kaltinok,

As you can see in the Perception module documentation, it states "These modules are available in NVIDIA DRIVE Software releases only." You will need to wait for the upcoming DRIVE Software 11.0. Thanks!

Thank you! Another thing:
The DNN module documentation linked here says "This module is available in both NVIDIA DriveWorks and NVIDIA DRIVE Software releases."
So can I expect my custom .bin files to work with just DRIVE OS and DriveWorks, or should I go back to DRIVE Software 10? I ask because the dwDNN_initializeTensorRTFromFile method also throws the same exception with our custom models.
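
For reference, this is the shape of the failing call (a minimal sketch; the signature is taken from the DriveWorks 3.5 dnn/DNN.h header, and the null plugin configuration, DW_PROCESSOR_TYPE_GPU, and the helper name loadModel are assumptions to verify against your installation):

    #include <dw/dnn/DNN.h>
    #include <cstdio>

    // Hypothetical helper: `ctx` is a dwContextHandle_t initialized elsewhere;
    // `binPath` is the model produced by the tensorRT_optimization tool.
    dwStatus loadModel(dwDNNHandle_t* dnn, const char* binPath, dwContextHandle_t ctx)
    {
        // pluginConfiguration may be nullptr when the model has no custom layers;
        // DW_PROCESSOR_TYPE_GPU selects the GPU inference path.
        dwStatus status = dwDNN_initializeTensorRTFromFile(dnn, binPath, nullptr,
                                                           DW_PROCESSOR_TYPE_GPU, ctx);
        if (status != DW_SUCCESS)
            std::fprintf(stderr, "dwDNN_initializeTensorRTFromFile failed: %s\n",
                         dwGetStatusName(status));
        return status;
    }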

Sorry for my misunderstanding. I thought you were talking about the "Perception Samples".

Could you share your command and the output messages from running any of the "Deep Neural Network (DNN) Framework Samples"?

Here you can see the outputs. The first image is from the official samples, and the second is from our custom implementation.

[First screenshot: output of the official sample]

[Second screenshot: output of our custom implementation]

Both give errors.

Also, I tested my models as TRT engines with trtexec and they pass correctly.
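
For context, passing the engine to trtexec exercises roughly the same path as this standalone TensorRT 6.x check (a sketch; the plan file name model.engine and the minimal logger are placeholders):

    #include <NvInfer.h>
    #include <fstream>
    #include <iostream>
    #include <vector>

    // Minimal logger: TensorRT requires an ILogger implementation.
    class Logger : public nvinfer1::ILogger
    {
        void log(Severity severity, const char* msg) override
        {
            if (severity <= Severity::kWARNING)
                std::cerr << msg << std::endl;
        }
    } gLogger;

    int main()
    {
        // Read the serialized plan file into memory.
        std::ifstream file("model.engine", std::ios::binary | std::ios::ate);
        if (!file) { std::cerr << "cannot open plan file\n"; return 1; }
        std::vector<char> blob(static_cast<size_t>(file.tellg()));
        file.seekg(0);
        file.read(blob.data(), blob.size());

        // Deserialize with the standard runtime on the current GPU; this is
        // where a device mismatch between build and run would surface.
        nvinfer1::IRuntime* runtime = nvinfer1::createInferRuntime(gLogger);
        nvinfer1::ICudaEngine* engine =
            runtime->deserializeCudaEngine(blob.data(), blob.size(), nullptr);
        std::cout << (engine ? "engine deserialized OK" : "deserialization failed")
                  << std::endl;
        // TensorRT 6 objects are released with destroy(), not delete.
        bool ok = engine != nullptr;
        if (engine) engine->destroy();
        runtime->destroy();
        return ok ? 0 : 1;
    }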

Please take a look at "Q: If I build the engine on one GPU and run the engine on another GPU, will this work?" in "Chapter 14. Troubleshooting" of the "NVIDIA DRIVE OS 5.2.0.0 TensorRT 6.3.1 Developer Guide" (at ~/nvidia/nvidia_sdk/DRIVE_OS_5.2.0_SDK_Linux_OS_DDPX/documentations/drive_os_documentation/NVIDIA_DRIVE_OS_5.2_For_TensorRT_6.3.1_Developer_Guide.pdf on the host system).

The problem is that the standard runtime of TensorRT 6.3.1 wrongly treats as an error a warning that should apply only to the proxy/safety runtime. We have already fixed this in TensorRT 6.4.

On TensorRT 6.3.1, you need to generate and deserialize your plan file on the same GPU. Thanks!
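
A quick way to confirm which device the plan is being built and deserialized on is to print the current GPU's name and compute capability with the CUDA runtime API (a sketch; device 0 is assumed):

    #include <cuda_runtime_api.h>
    #include <cstdio>

    int main()
    {
        // Report the device this process would build/deserialize engines on.
        int device = 0;
        cudaGetDevice(&device);
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, device);
        std::printf("GPU %d: %s (SM %d.%d)\n", device, prop.name, prop.major, prop.minor);
        return 0;
    }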

Actually, I use the same GPU and still get those errors. I switched back to DRIVE Software 10, and in that version the errors are gone.
Do you plan to upgrade the TensorRT version to 6.4 in DRIVE OS 5.2? DRIVE Software 10 ships TensorRT 5.1, which is not suitable for me.

Before talking about an upgrade, let's clarify the issue you are seeing first. Do you mean you see the error even when using a plan file generated with the RTX 5000 and TensorRT 6.3.1? Thanks.

Yes, exactly as you asked.

Please share the steps for generating your model file on the host system with the RTX 5000 and TensorRT 6.3.1. Thanks.

Sure;

  • I froze the TF 1.14 model and got a .pb file.
  • Converted the .pb file to ONNX format with tf2onnx.
  • Created the .bin file with the TensorRT optimization tool that comes with DriveWorks. In addition, I also saved the model as a TRT engine.
  • Ran the TRT engine with trtexec to test it; the model passes correctly.
  • Finally, I passed the .bin file to the initializeTensorRTFromFile method and got the errors shown above.
    When I switch back to the previous release, there is no error in that method. I suspect the initializeTensorRTFromFile method cannot read the .bin path or something similar (see the sanity-check sketch after this list).
    Thank you.
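
A minimal sanity check for that hypothesis, using only the standard library (the helper name binFileReadable is hypothetical):

    #include <fstream>
    #include <iostream>

    // Returns true if the model file exists, opens, and is non-empty -- rules
    // out a bad path before blaming the deserialization itself.
    bool binFileReadable(const char* path)
    {
        std::ifstream f(path, std::ios::binary | std::ios::ate);
        if (!f) { std::cerr << path << ": cannot open\n"; return false; }
        if (f.tellg() <= 0) { std::cerr << path << ": empty file\n"; return false; }
        return true;
    }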

Dear @kaltinok ,
Could you please share your ONNX model file so that we can reproduce the issue on our end?

Here is the bug.
https://nvbugs/3175027 [VCC-SPA2-Zenuity] TensorRT 6.3.1 - ERROR: Using an engine plan file across different models of devices is not recommended

Here you can find a sample: mnist.pb, and mnist.onnx converted from the .pb file.
mnist.onnx (40.5 KB) mnist.pb (48.3 KB)

Dear @kaltinok ,
I could load the model successfully. To verify model loading, I added a print statement and exit(0) after the dwDNN_initializeTensorRTFromFile() call in sample_dnn_tensor.
These are the steps I followed:

  • Generate the DW-compatible model using the TensorRT_Optimization tool. It generates optimized.bin in the current directory:
    /usr/local/driveworks-3.5/tools/dnn/tensorRT_optimization --modelType=onnx --onnxFile=/path/to/onnxmodel

  • Load the sample with the new model:
    ./sample_dnn_tensor --tensorRT_model=/path/to/optimized.bin
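
The verification described above amounts to something like the following fragment inside the sample (a sketch; m_dnn, modelPath, and m_sdk are assumed names in the style of the sample and may differ in your source tree):

    #include <cstdio>
    #include <cstdlib>

    // Immediately after the load call in sample_dnn_tensor: print and exit so
    // that a successful model load is unambiguous.
    dwStatus status = dwDNN_initializeTensorRTFromFile(&m_dnn, modelPath.c_str(),
                                                       nullptr, DW_PROCESSOR_TYPE_GPU,
                                                       m_sdk);
    if (status == DW_SUCCESS)
    {
        std::printf("Model loaded successfully\n");
        std::exit(0);
    }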