Output incorrect with odd number of channels

We have a UNet running on a Jetson Nano with TensorRT, converted to trtmodel.cache using ConvertCaffeToTrtModel.
The final class-probability output is completely wrong; it looks like pure noise.
Are there any limitations on the width/height/channel dimensions?
We are using 3 output channels on the final 2D convolution.

Hi,

A 3-channel output should be fine.
The limitation is that the output must be at least 4-dimensional:
[url]Developer Guide :: NVIDIA Deep Learning TensorRT Documentation

Could you share your model/sample with us so we can reproduce this issue on our side?
Thanks.

May I have your company email address?

Our final output dimensions are 3x120x160 (CxHxW).
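For reference, a flat host buffer of that size can be reshaped back into CxHxW and converted to HWC with NumPy; a minimal sketch using the 3x120x160 shape from above (the buffer contents are simulated):

```python
import numpy as np

C, H, W = 3, 120, 160          # final output dimensions (CxHxW) from the post
OUTPUT_SIZE = C * H * W        # number of floats copied back from the device

# Simulate the flat host buffer filled by the device-to-host copy
flat = np.arange(OUTPUT_SIZE, dtype=np.float32)

chw = flat.reshape(C, H, W)    # TensorRT outputs are CHW
hwc = chw.transpose(1, 2, 0)   # convert to HWC for OpenCV-style post-processing

print(OUTPUT_SIZE)             # 57600
print(hwc.shape)               # (120, 160, 3)
```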

Hi,

Would you mind sending me a private message directly?
Thanks.

Hi,

Could you try adding a format conversion before/after inference?
For example, with the Python API:

...
# Convert the HWC frame to CHW (contiguous) before uploading
cuda.memcpy_htod_async(d_image, np.ascontiguousarray(batch.transpose(2, 0, 1)), stream)
context.enqueue(1, bindings, stream.handle, None)
cuda.memcpy_dtoh_async(logits, d_logits, stream)
stream.synchronize()

# Convert the CHW logits back to HWC before post-processing
prob = softmax_prob(logits.transpose(1, 2, 0))
frame = cv2.cvtColor((prob * 255).astype(np.uint8), cv2.COLOR_GRAY2BGR)
...
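softmax_prob is not defined in the snippet above; a minimal NumPy sketch of what such a helper could look like (the name and the per-pixel channel-softmax behavior are assumptions):

```python
import numpy as np

def softmax_prob(logits):
    """Per-pixel softmax over the channel axis of an HWC logits array.

    Hypothetical helper; assumes logits has shape (H, W, C).
    """
    # Subtract the per-pixel max for numerical stability
    shifted = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=-1, keepdims=True)

# Example: a 2x2 image with 3 channels; each pixel's probabilities sum to 1
logits = np.random.randn(2, 2, 3).astype(np.float32)
prob = softmax_prob(logits)
print(np.allclose(prob.sum(axis=-1), 1.0))  # True
```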

Thanks.

Since the input has only one channel, the CHW and HWC formats are identical for it.
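That claim is easy to check with NumPy: with a single channel, CHW and HWC contain the same bytes in the same order, so the transpose is a no-op in memory. With three channels (as in the output here) the layouts genuinely differ, which is why a format converter matters there:

```python
import numpy as np

h, w = 4, 5

# One channel: CHW (1, h, w) and HWC (h, w, 1) are byte-identical in memory
img1 = np.arange(h * w, dtype=np.float32).reshape(1, h, w)
same1 = img1.tobytes() == np.ascontiguousarray(img1.transpose(1, 2, 0)).tobytes()
print(same1)  # True

# Three channels: the layouts differ, so a real transpose is required
img3 = np.arange(3 * h * w, dtype=np.float32).reshape(3, h, w)
same3 = img3.tobytes() == np.ascontiguousarray(img3.transpose(1, 2, 0)).tobytes()
print(same3)  # False
```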

We already use:
// DMA the input to the GPU, execute the batch asynchronously, and DMA it back:
CHECK(cudaMemcpyAsync(h->cuda_buffers[h->inputIndex], h->input_float_data, BATCHSIZE * INPUT_H * INPUT_W * sizeof(float), cudaMemcpyHostToDevice, h->stream));
h->context->enqueue(BATCHSIZE, h->cuda_buffers, h->stream, nullptr);
CHECK(cudaMemcpyAsync(h->output_float_data, h->cuda_buffers[h->outputIndex], BATCHSIZE * OUTPUT_SIZE * sizeof(float), cudaMemcpyDeviceToHost, h->stream));
cudaStreamSynchronize(h->stream);

Please run the Caffe Python script I sent previously and compare its result with yours.
Thanks.

Hi,

We would like to reproduce this issue internally.
Would you mind sharing your implementation with us so we can report it to our internal team?

Thanks.

I cannot share the code with you, since it is in C and embedded in our application.
Have you even tried the model with your Python script?
Or please send me a Python script that can run a model; I can modify it to run this model and show you the result.

Hi,

Any update on this? BTW, I cannot find any Python support on the Jetson Nano either.
We are currently using the TensorRT C++ API.

Hi,

Sorry for the late update.

We hit an incorrect-output issue with UNet before and fixed it by adding a format converter.
But it looks like you have a different issue.

We will write code to reproduce this issue, but it may take time due to our limited bandwidth.
There is a Python example for a Caffe model. Would you mind writing one for your model?
/usr/src/tensorrt/samples/python/fc_plugin_caffe_mnist/

Thanks.

Can you confirm that I can test it with TensorRT on the Jetson Nano?
We cannot test code written in Python on the Jetson Nano, since it is not supported there.
The only interface supported on the Jetson Nano is the C++ API.

Hi,

The TensorRT Python API is available on the Jetson Nano.
What error did you encounter?

Thanks.

The Jetson Nano's official release image does not include pycuda, which is required for the TensorRT Python API to run.
Have you tried running it on a Jetson Nano?

Hi,

Yes. pyCUDA can be installed with pip3:

pip3 install pycuda --user

Thanks.

Have you actually tried it on Jetson Nano hardware with the NVIDIA image?
Please try it yourself before replying.
First, there is no pip3 in the Jetson Nano image, only pip and pip2.
Secondly, apt-cache search pip3 returns nothing.

So unless you are using an internal unreleased image, there is no pycuda on the Jetson Nano.

BTW, we are already using the latest NVIDIA release image (jetson-nano-sd-r32.1-2019-03-18.zip).

Hi,

You can install pip3 with the following command:

sudo apt-get install python3-pip

Then install pyCUDA with this:

pip3 install numpy pycuda --user

Thanks.

I remember trying that before.
Again, did you even try it on your side on Jetson Nano hardware before posting a reply?
I am using the official image:

Error:
pip3 install numpy pycuda --user
copying pycuda/_cluda.py → build/lib.linux-aarch64-3.6/pycuda
copying pycuda/scan.py → build/lib.linux-aarch64-3.6/pycuda
copying pycuda/tools.py → build/lib.linux-aarch64-3.6/pycuda
copying pycuda/_mymako.py → build/lib.linux-aarch64-3.6/pycuda
copying pycuda/__init__.py → build/lib.linux-aarch64-3.6/pycuda
copying pycuda/cumath.py → build/lib.linux-aarch64-3.6/pycuda
copying pycuda/compiler.py → build/lib.linux-aarch64-3.6/pycuda
creating build/lib.linux-aarch64-3.6/pycuda/gl
copying pycuda/gl/autoinit.py → build/lib.linux-aarch64-3.6/pycuda/gl
copying pycuda/gl/__init__.py → build/lib.linux-aarch64-3.6/pycuda/gl
creating build/lib.linux-aarch64-3.6/pycuda/sparse
copying pycuda/sparse/coordinate.py → build/lib.linux-aarch64-3.6/pycuda/sparse
copying pycuda/sparse/pkt_build.py → build/lib.linux-aarch64-3.6/pycuda/sparse
copying pycuda/sparse/cg.py → build/lib.linux-aarch64-3.6/pycuda/sparse
copying pycuda/sparse/inner.py → build/lib.linux-aarch64-3.6/pycuda/sparse
copying pycuda/sparse/packeted.py → build/lib.linux-aarch64-3.6/pycuda/sparse
copying pycuda/sparse/__init__.py → build/lib.linux-aarch64-3.6/pycuda/sparse
copying pycuda/sparse/operator.py → build/lib.linux-aarch64-3.6/pycuda/sparse
creating build/lib.linux-aarch64-3.6/pycuda/compyte
copying pycuda/compyte/array.py → build/lib.linux-aarch64-3.6/pycuda/compyte
copying pycuda/compyte/dtypes.py → build/lib.linux-aarch64-3.6/pycuda/compyte
copying pycuda/compyte/__init__.py → build/lib.linux-aarch64-3.6/pycuda/compyte
running egg_info
writing pycuda.egg-info/PKG-INFO
writing dependency_links to pycuda.egg-info/dependency_links.txt
writing requirements to pycuda.egg-info/requires.txt
writing top-level names to pycuda.egg-info/top_level.txt
reading manifest file ‘pycuda.egg-info/SOURCES.txt’
reading manifest template ‘MANIFEST.in’
warning: no files found matching 'doc/source/_static/*.css'
warning: no files found matching 'doc/source/_templates/*.html'
warning: no files found matching '*.cpp' under directory 'bpl-subset/bpl_subset/boost'
warning: no files found matching '*.html' under directory 'bpl-subset/bpl_subset/boost'
warning: no files found matching '*.inl' under directory 'bpl-subset/bpl_subset/boost'
warning: no files found matching '*.txt' under directory 'bpl-subset/bpl_subset/boost'
warning: no files found matching '*.h' under directory 'bpl-subset/bpl_subset/libs'
warning: no files found matching '*.ipp' under directory 'bpl-subset/bpl_subset/libs'
warning: no files found matching '*.pl' under directory 'bpl-subset/bpl_subset/libs'
writing manifest file ‘pycuda.egg-info/SOURCES.txt’
creating build/lib.linux-aarch64-3.6/pycuda/cuda
copying pycuda/cuda/pycuda-complex-impl.hpp → build/lib.linux-aarch64-3.6/pycuda/cuda
copying pycuda/cuda/pycuda-complex.hpp → build/lib.linux-aarch64-3.6/pycuda/cuda
copying pycuda/cuda/pycuda-helpers.hpp → build/lib.linux-aarch64-3.6/pycuda/cuda
copying pycuda/sparse/pkt_build_cython.pyx → build/lib.linux-aarch64-3.6/pycuda/sparse
running build_ext
building ‘_driver’ extension
creating build/temp.linux-aarch64-3.6
creating build/temp.linux-aarch64-3.6/src
creating build/temp.linux-aarch64-3.6/src/cpp
creating build/temp.linux-aarch64-3.6/src/wrapper
creating build/temp.linux-aarch64-3.6/bpl-subset
creating build/temp.linux-aarch64-3.6/bpl-subset/bpl_subset
creating build/temp.linux-aarch64-3.6/bpl-subset/bpl_subset/libs
creating build/temp.linux-aarch64-3.6/bpl-subset/bpl_subset/libs/python
creating build/temp.linux-aarch64-3.6/bpl-subset/bpl_subset/libs/python/src
creating build/temp.linux-aarch64-3.6/bpl-subset/bpl_subset/libs/python/src/object
creating build/temp.linux-aarch64-3.6/bpl-subset/bpl_subset/libs/python/src/converter
creating build/temp.linux-aarch64-3.6/bpl-subset/bpl_subset/libs/system
creating build/temp.linux-aarch64-3.6/bpl-subset/bpl_subset/libs/system/src
creating build/temp.linux-aarch64-3.6/bpl-subset/bpl_subset/libs/smart_ptr
creating build/temp.linux-aarch64-3.6/bpl-subset/bpl_subset/libs/smart_ptr/src
creating build/temp.linux-aarch64-3.6/bpl-subset/bpl_subset/libs/thread
creating build/temp.linux-aarch64-3.6/bpl-subset/bpl_subset/libs/thread/src
creating build/temp.linux-aarch64-3.6/bpl-subset/bpl_subset/libs/thread/src/pthread
aarch64-linux-gnu-gcc -pthread -fwrapv -Wall -O3 -DNDEBUG -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DBOOST_ALL_NO_LIB=1 -DBOOST_THREAD_BUILD_DLL=1 -DBOOST_MULTI_INDEX_DISABLE_SERIALIZATION=1 -DBOOST_PYTHON_SOURCE=1 -Dboost=pycudaboost -DBOOST_THREAD_DONT_USE_CHRONO=1 -DPYGPU_PACKAGE=pycuda -DPYGPU_PYCUDA=1 -DHAVE_CURAND=1 -Isrc/cpp -Ibpl-subset/bpl_subset -I/tmp/pip-build-tuyn729n/pycuda/.eggs/numpy-1.16.4-py3.6-linux-aarch64.egg/numpy/core/include -I/usr/include/python3.6m -c src/cpp/cuda.cpp -o build/temp.linux-aarch64-3.6/src/cpp/cuda.o
In file included from src/cpp/cuda.cpp:1:0:
src/cpp/cuda.hpp:14:10: fatal error: cuda.h: No such file or directory
#include <cuda.h>
^~~~~~~~
compilation terminated.
error: command ‘aarch64-linux-gnu-gcc’ failed with exit status 1


Failed building wheel for pycuda
Running setup.py clean for pycuda
Running setup.py bdist_wheel for mako … done
Stored in directory: /home/keith/.cache/pip/wheels/82/59/50/e9c6d83cf76c5f09e2f3eb976e38d4d5578ed37585e960a150
Running setup.py bdist_wheel for MarkupSafe … done
Stored in directory: /home/keith/.cache/pip/wheels/f2/aa/04/0edf07a1b8a5f5f1aed7580fffb69ce8972edc16a505916a77
Successfully built numpy mako MarkupSafe
Failed to build pycuda
Installing collected packages: numpy, appdirs, decorator, MarkupSafe, mako, setuptools, six, atomicwrites, more-itertools, zipp, importlib-metadata, pluggy, wcwidth, py, attrs, pytest, pytools, pycuda

Hi,

We did try it before replying. It works well, and we have also released a sample for it:

Have you modified the LIBRARY_PATH variable?
It looks like the compiler cannot find the CUDA path correctly.
Could you try exporting it?

export LIBRARY_PATH=$LIBRARY_PATH:/usr/local/cuda/lib64
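If the build still stops at "fatal error: cuda.h: No such file or directory", the compiler may also need the CUDA header and binary directories on its search paths; a hedged addition, assuming the default JetPack install location under /usr/local/cuda:

```shell
# Assumed JetPack layout; adjust if CUDA is installed elsewhere
export PATH=$PATH:/usr/local/cuda/bin
export CPATH=$CPATH:/usr/local/cuda/include
```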

If it still does not work, would you mind reflashing your system and running the following:

$ sudo apt-get install python3-pip
$ pip3 install pycuda --user

Thanks.