Hi,
My English isn’t very good, so please feel free to ask if anything is unclear.
Thank you, as always, for your assistance.
We found a problem that occurs when running a Deconvolution Layer on [Jetson Xavier NX + DLA].
(The Deconvolution Layer corresponds to ConvTranspose in ONNX.)
I ran a test of the Deconvolution Layer on DLA with the following configuration:
Input Shape : [1 x 32 x 16 x 16] @ NCHW.
Output Shape : [1 x 32 x 32 x 32] @ NCHW.
The weights and input data are all set to 1 for testing.
We found that the output results differ between the GPU and DLA.
The following image shows some of the output values when run on the GPU.
All the values are 32, indicating that the Deconvolution Layer executed correctly.
[Image1. Deconvolution Layer from GPU]
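For reference, the expected value of 32 can be reproduced with a minimal NumPy sketch of the transposed convolution. The kernel size of 2 and stride of 2 are my assumptions, inferred from the 16x16 → 32x32 upsampling; with all-ones weights and inputs, each output element then receives exactly one kernel tap per input channel, summing to 32:

```python
import numpy as np

# Assumed configuration: 32 in/out channels, 2x2 kernel, stride 2, no padding
C_in, C_out, H, W = 32, 32, 16, 16
K, S = 2, 2

x = np.ones((C_in, H, W), dtype=np.float32)
w = np.ones((C_in, C_out, K, K), dtype=np.float32)  # ONNX ConvTranspose weight layout

# Naive transposed convolution: scatter each input pixel through the kernel
out = np.zeros((C_out, H * S, W * S), dtype=np.float32)
for ci in range(C_in):
    for kh in range(K):
        for kw in range(K):
            out[:, kh::S, kw::S] += x[ci] * w[ci, :, kh, kw][:, None, None]

print(out.shape)       # (32, 32, 32)
print(np.unique(out))  # [32.]
```

This matches the GPU result above, so any 0 in the output cannot be correct.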
In contrast, the following image shows a portion of the output values when run on DLA under the same conditions.
You can see that the output values differ from those of the GPU.
Also, some of the output values are 0, which I believe is not correct behavior.
[Image2. Deconvolution Layer from DLA]
There were no warning or error messages when running TensorRT.
TensorRT itself completes without any problem; only the output results differ between GPU and DLA as described above.
Is this a bug?
I have uploaded the ONNX model used in the test:
deconv_32x16x16.onnx (16.5 KB)
As a side note, if I set the filter size of the Deconvolution to 16, the problem disappears
(the output values are the same for GPU and DLA):
Input Shape : [1 x 32 x 16 x 16] @ NCHW.
Output Shape : [1 x 16 x 32 x 32] @ NCHW.
Therefore, I think the problem is related to the filter size used for the Deconvolution.
Sorry, I didn’t mean to go on so long.
Thank you in advance.
Regards,