DNN Inference PX2

Hi everyone,

for a project I want to build an application for the PX 2 that uses my own neural network.
The network is fed with images from a camera or video.
My problem is that the output of my network is not what I expect.
This is how I run the NN:

dwDNN_infer(dnn_Output, &dnn_Input, dnn);
cudaMemcpy(dnn_Output_H[0].get(), dnn_Output[0], sizeof(float32_t) * output_size[0], cudaMemcpyDeviceToHost);

The values of dnn_Output_H[0] are not at all what I expect as an output from the network.
Am I missing something with the dwDNN_infer call? Is dnn_Output_H[0] the right variable to look at, or is it possible that the error is already in my NN?
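One quick sanity check that doesn't involve DriveWorks at all: after the copy, scan the host buffer for NaNs, infinities, or an all-zero result, which usually point to a missing synchronization or an uninitialized input rather than a genuinely wrong network. A minimal sketch (`check_output` is a hypothetical helper name; `float32_t` is just `float`):

```cpp
#include <cmath>
#include <cstddef>
#include <cstdio>

// Hypothetical helper: basic sanity checks on an inference output buffer.
// Returns false if the buffer contains NaN/Inf or is entirely zero.
bool check_output(const float* data, size_t n)
{
    bool all_zero = true;
    for (size_t i = 0; i < n; ++i) {
        if (std::isnan(data[i]) || std::isinf(data[i])) {
            std::printf("bad value at index %zu\n", i);
            return false;
        }
        if (data[i] != 0.0f) all_zero = false;
    }
    if (all_zero) {
        std::printf("output is all zeros\n");
        return false;
    }
    return true;
}
```

You would call it as something like check_output(dnn_Output_H[0].get(), output_size[0]) right after the cudaMemcpy.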

Happy for any suggestions!

Dear bdevm

Did you review the “DNN Workflow” section (Development Guide → DNNs → DNN Workflow) in the DriveWorks documentation?

Dear StenvNV,

yes I did look into it.
Am I correct that the results (the output of my network) are stored in dnnOutputHost?

// Enqueue asynchronous copy of the inference results to host memory
cudaMemcpyAsync(dnnOutputHost[output1Index].data(), dnnOutputs[output1Index], sizeof(float32_t) * numElements1, cudaMemcpyDeviceToHost);
cudaMemcpyAsync(dnnOutputHost[output2Index].data(), dnnOutputs[output2Index], sizeof(float32_t) * numElements2, cudaMemcpyDeviceToHost);

If that is the case, something is wrong with my network :/

Hi bdevm

We had this exact same problem a while back: the results we got from our TensorFlow models did not match the results from either DriveWorks DNN or running TensorRT directly via the C++ interface. In our case, with a model that took one input (an image from a camera feed, a Sekonix AR231-RCCB), we ended up writing our own CUDA pipeline to “prepare” the image (resize/crop), but the main thing was the conversion of the image from HWC to CHW format. Once we used our own CUDA kernel for the CHW conversion and/or OpenCV for resize and crop, matching the training pipeline, and prepared the input ourselves (with DriveWorks we bypassed dwDataConditioner_prepareData), we saw identical results in both DriveWorks and TensorRT on the DPX2.

We also noticed that if the image is resized for training, it is important to use the same filter to resize it during inference, although this still produces results that are slightly different from TF. We also noticed that OpenCV resize with the exact same filters gives slightly different results on the CPU and GPU, so we had to stay consistent. Anyway, this resolved our issues, and now we get exactly the same results as TensorFlow in both DriveWorks and straight TensorRT.
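For reference, the HWC-to-CHW conversion described above is just an index permutation: interleaved pixels (RGBRGB…) become per-channel planes (all R, then all G, then all B). A minimal CPU sketch (a plain C++ reference, not the CUDA kernel mentioned above; `hwc_to_chw` is a hypothetical name), which is also handy for validating a GPU implementation against:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical reference: convert an interleaved HWC image into planar CHW,
// which is the layout most frameworks (Caffe, TensorRT) expect as input.
std::vector<float> hwc_to_chw(const std::vector<float>& hwc,
                              size_t height, size_t width, size_t channels)
{
    std::vector<float> chw(hwc.size());
    for (size_t h = 0; h < height; ++h)
        for (size_t w = 0; w < width; ++w)
            for (size_t c = 0; c < channels; ++c)
                chw[c * height * width + h * width + w] =
                    hwc[h * width * channels + w * channels + c];
    return chw;
}
```

Running the CUDA kernel and this reference on the same frame and comparing the buffers is a cheap way to catch indexing bugs in the GPU path.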

I face the same problem ([url]https://devtalk.nvidia.com/default/topic/1045548/yolo-tensorrt-model-on-sample_object_detector-with-weird-bounding-boxes/#5305269[/url]) when using YOLO v1 as my DNN network. I think it has nothing to do with an error in the network; it’s about the DriveWorks inference. I found the output of dwDNN_infer and dwDNN_inferSIO weird. Does anyone know what the output contains?

By the way, dear servanti mentioned the CUDA pipeline. I want to make sure: dwDataConditioner_prepareData shouldn’t be used in the code, right? And the original image format is HWC, not CHW? So I have to write my own CUDA pipeline to do the conversion?

Hi imugly1029

As far as our tests showed, dwDataConditioner_prepareData did not do the HWC->CHW conversion, but I do not know the internal code, so maybe someone from NVIDIA can comment on this. Since we wanted to run both through the DriveWorks API and through plain C++/TensorRT without DriveWorks, we wrote our own CUDA kernels to resize/crop etc. and convert to CHW. We also developed another path using OpenCV for resizing and cropping, but we still do the CHW and float conversion ourselves.

We have also compiled TensorFlow for aarch64 and can run inference with Python. In that case NumPy’s transpose function does the CHW conversion, but we are more interested in C++/TensorRT for performance reasons. TensorFlow was just an “intermediate” step.

Hi servanti!
Thanks a lot for the reply!
So it means that each frame is originally in HWC format and we need to convert it to CHW for inference, right?

By the way, the TensorRT model produced by tensorRT_optimization is correct, right? I mean, the parameters inside the model should be the same as in the original Caffe/TensorFlow model?
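One way to check the conversion yourself, rather than trusting it, is to run the same already-preprocessed input through the original framework and through the converted TensorRT model, dump both outputs to float buffers, and compare them element-wise within a small tolerance (tiny floating-point differences are normal after optimization). A hypothetical comparison helper, just a sketch:

```cpp
#include <cmath>
#include <cstddef>
#include <cstdio>

// Hypothetical helper: compare two output buffers element-wise and report
// the largest absolute difference. Returns true if within tolerance.
bool outputs_match(const float* a, const float* b, size_t n, float tol)
{
    float max_diff = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        float diff = std::fabs(a[i] - b[i]);
        if (diff > max_diff) max_diff = diff;
    }
    std::printf("max abs diff: %g\n", max_diff);
    return max_diff <= tol;
}
```

If the outputs disagree even on an identical preprocessed input, the problem is in the model conversion; if they agree, the bug is almost certainly in the preprocessing (resize filter, crop, HWC/CHW, normalization).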