Output incorrect with odd number of channels

We have a UNet running on a Jetson Nano with TensorRT, converted to trtmodel.cache using ConvertCaffeToTrtModel.
The final class-probability output is completely wrong; it looks like pure noise.
Are there any limitations on the width/height/channel dimensions?
We are using 3 output channels on the final 2D convolution.

Hi,

A 3-channel output should be fine.
The limitation is that the output must be at least 4-dimensional:
[url]Developer Guide :: NVIDIA Deep Learning TensorRT Documentation

Could you share your model/sample with us so we can reproduce this issue on our side?
Thanks.

May I have your company email address?

Our final output dimensions are 3x120x160 (CxHxW).
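For reference, a flat host buffer of that size can be reshaped back into CxHxW and converted to HWC with NumPy; a minimal sketch using the 3x120x160 shape from above (the buffer contents are simulated):

```python
import numpy as np

C, H, W = 3, 120, 160          # final output dimensions (CxHxW) from the post
OUTPUT_SIZE = C * H * W        # number of floats copied back from the device

# Simulate the flat host buffer filled by the device-to-host copy
flat = np.arange(OUTPUT_SIZE, dtype=np.float32)

chw = flat.reshape(C, H, W)    # TensorRT outputs are CHW
hwc = chw.transpose(1, 2, 0)   # convert to HWC for OpenCV-style post-processing

print(OUTPUT_SIZE)             # 57600
print(hwc.shape)               # (120, 160, 3)
```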

Hi,

Would you mind sending me a private message directly?
Thanks.

Hi,

Could you try adding a format conversion before/after inference?
For example, with the Python API:

...
# Convert the HWC frame to CHW (contiguous) before uploading
cuda.memcpy_htod_async(d_image, np.ascontiguousarray(batch.transpose(2, 0, 1)), stream)
context.enqueue(1, bindings, stream.handle, None)
cuda.memcpy_dtoh_async(logits, d_logits, stream)
stream.synchronize()

# Convert the CHW logits back to HWC before post-processing
prob = softmax_prob(logits.transpose(1, 2, 0))
frame = cv2.cvtColor((prob * 255).astype(np.uint8), cv2.COLOR_GRAY2BGR)
...
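softmax_prob is not defined in the snippet above; a minimal NumPy sketch of what such a helper could look like (the name and the per-pixel channel-softmax behavior are assumptions):

```python
import numpy as np

def softmax_prob(logits):
    """Per-pixel softmax over the channel axis of an HWC logits array.

    Hypothetical helper; assumes logits has shape (H, W, C).
    """
    # Subtract the per-pixel max for numerical stability
    shifted = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=-1, keepdims=True)

# Example: a 2x2 image with 3 channels; each pixel's probabilities sum to 1
logits = np.random.randn(2, 2, 3).astype(np.float32)
prob = softmax_prob(logits)
print(np.allclose(prob.sum(axis=-1), 1.0))  # True
```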

Thanks.

Since the input has only one channel, the CHW and HWC formats are identical for it.
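That claim is easy to check with NumPy: with a single channel, CHW and HWC contain the same bytes in the same order, so the transpose is a no-op in memory. With three channels (as in the output here) the layouts genuinely differ, which is why a format converter matters there:

```python
import numpy as np

h, w = 4, 5

# One channel: CHW (1, h, w) and HWC (h, w, 1) are byte-identical in memory
img1 = np.arange(h * w, dtype=np.float32).reshape(1, h, w)
same1 = img1.tobytes() == np.ascontiguousarray(img1.transpose(1, 2, 0)).tobytes()
print(same1)  # True

# Three channels: the layouts differ, so a real transpose is required
img3 = np.arange(3 * h * w, dtype=np.float32).reshape(3, h, w)
same3 = img3.tobytes() == np.ascontiguousarray(img3.transpose(1, 2, 0)).tobytes()
print(same3)  # False
```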

We already use:
// DMA the input to the GPU, execute the batch asynchronously, and DMA it back:
CHECK(cudaMemcpyAsync(h->cuda_buffers[h->inputIndex], h->input_float_data, BATCHSIZE * INPUT_H * INPUT_W * sizeof(float), cudaMemcpyHostToDevice, h->stream));
h->context->enqueue(BATCHSIZE, h->cuda_buffers, h->stream, nullptr);
CHECK(cudaMemcpyAsync(h->output_float_data, h->cuda_buffers[h->outputIndex], BATCHSIZE * OUTPUT_SIZE * sizeof(float), cudaMemcpyDeviceToHost, h->stream));
cudaStreamSynchronize(h->stream);

Please run the Caffe Python script I sent previously and compare its result with yours.
Thanks.

Hi,

We would like to reproduce this issue internally.
Would you mind sharing your implementation with us so we can report it to our internal team?

Thanks.

I cannot share the code with you, since it is in C and embedded in our application.
Have you even tried the model with your Python script?
Or please send me a Python script that can run a model; I can modify it to run this model and show you the result.

Hi,

Any update on this? BTW, I cannot find any Python support on the Jetson Nano either.
We are currently using the TensorRT C++ API.

Hi,

Sorry for the late update.

We hit an incorrect-output issue with UNet before and fixed it by adding a format converter.
But it looks like you have a different issue.

We will write code to reproduce this issue, but it may take time due to our limited bandwidth.
There is a Python example for a Caffe model. Would you mind writing one for your model?
/usr/src/tensorrt/samples/python/fc_plugin_caffe_mnist/

Thanks.

Can you confirm that I can test it with TensorRT on the Jetson Nano?
We cannot test code written in Python on the Jetson Nano, since it is not supported there.
The only interface supported on the Jetson Nano is the C++ API.

Hi,

The TensorRT Python API is available on the Jetson Nano.
What error did you encounter?

Thanks.

The Jetson Nano's official release image does not include pycuda, which is required for the TensorRT Python API to run.
Have you tried running it on a Jetson Nano?

Hi,

Yes. pyCUDA can be installed with pip3:

pip3 install pycuda --user

Thanks.

Have you actually tried it on Jetson Nano hardware with the NVIDIA image?
Please try it yourself before replying.
First, there is no pip3 in the Jetson Nano image, only pip and pip2.
Secondly, apt-cache search pip3 returns nothing.

So unless you are using an internal unreleased image, there is no pycuda on the Jetson Nano.

BTW, we are already using the latest NVIDIA release image (jetson-nano-sd-r32.1-2019-03-18.zip).

Hi,

You can install pip3 with the following command:

sudo apt-get install python3-pip

Then install pyCUDA with this:

pip3 install numpy pycuda --user

Thanks.

I remember trying that before.
Again, did you even try it on your side on Jetson Nano hardware before posting a reply?
I am using the official image:

Error:
pip3 install numpy pycuda --user
copying pycuda/_cluda.py → build/lib.linux-aarch64-3.6/pycuda
copying pycuda/scan.py → build/lib.linux-aarch64-3.6/pycuda
copying pycuda/tools.py → build/lib.linux-aarch64-3.6/pycuda
copying pycuda/_mymako.py → build/lib.linux-aarch64-3.6/pycuda
copying pycuda/__init__.py → build/lib.linux-aarch64-3.6/pycuda
copying pycuda/cumath.py → build/lib.linux-aarch64-3.6/pycuda
copying pycuda/compiler.py → build/lib.linux-aarch64-3.6/pycuda
creating build/lib.linux-aarch64-3.6/pycuda/gl
copying pycuda/gl/autoinit.py → build/lib.linux-aarch64-3.6/pycuda/gl
copying pycuda/gl/__init__.py → build/lib.linux-aarch64-3.6/pycuda/gl
creating build/lib.linux-aarch64-3.6/pycuda/sparse
copying pycuda/sparse/coordinate.py → build/lib.linux-aarch64-3.6/pycuda/sparse
copying pycuda/sparse/pkt_build.py → build/lib.linux-aarch64-3.6/pycuda/sparse
copying pycuda/sparse/cg.py → build/lib.linux-aarch64-3.6/pycuda/sparse
copying pycuda/sparse/inner.py → build/lib.linux-aarch64-3.6/pycuda/sparse
copying pycuda/sparse/packeted.py → build/lib.linux-aarch64-3.6/pycuda/sparse
copying pycuda/sparse/__init__.py → build/lib.linux-aarch64-3.6/pycuda/sparse
copying pycuda/sparse/operator.py → build/lib.linux-aarch64-3.6/pycuda/sparse
creating build/lib.linux-aarch64-3.6/pycuda/compyte
copying pycuda/compyte/array.py → build/lib.linux-aarch64-3.6/pycuda/compyte
copying pycuda/compyte/dtypes.py → build/lib.linux-aarch64-3.6/pycuda/compyte
copying pycuda/compyte/__init__.py → build/lib.linux-aarch64-3.6/pycuda/compyte
running egg_info
writing pycuda.egg-info/PKG-INFO
writing dependency_links to pycuda.egg-info/dependency_links.txt
writing requirements to pycuda.egg-info/requires.txt
writing top-level names to pycuda.egg-info/top_level.txt
reading manifest file ‘pycuda.egg-info/SOURCES.txt’
reading manifest template ‘MANIFEST.in’
warning: no files found matching 'doc/source/_static/*.css'
warning: no files found matching 'doc/source/_templates/*.html'
warning: no files found matching '*.cpp' under directory 'bpl-subset/bpl_subset/boost'
warning: no files found matching '*.html' under directory 'bpl-subset/bpl_subset/boost'
warning: no files found matching '*.inl' under directory 'bpl-subset/bpl_subset/boost'
warning: no files found matching '*.txt' under directory 'bpl-subset/bpl_subset/boost'
warning: no files found matching '*.h' under directory 'bpl-subset/bpl_subset/libs'
warning: no files found matching '*.ipp' under directory 'bpl-subset/bpl_subset/libs'
warning: no files found matching '*.pl' under directory 'bpl-subset/bpl_subset/libs'
writing manifest file ‘pycuda.egg-info/SOURCES.txt’
creating build/lib.linux-aarch64-3.6/pycuda/cuda
copying pycuda/cuda/pycuda-complex-impl.hpp → build/lib.linux-aarch64-3.6/pycuda/cuda
copying pycuda/cuda/pycuda-complex.hpp → build/lib.linux-aarch64-3.6/pycuda/cuda
copying pycuda/cuda/pycuda-helpers.hpp → build/lib.linux-aarch64-3.6/pycuda/cuda
copying pycuda/sparse/pkt_build_cython.pyx → build/lib.linux-aarch64-3.6/pycuda/sparse
running build_ext
building ‘_driver’ extension
creating build/temp.linux-aarch64-3.6
creating build/temp.linux-aarch64-3.6/src
creating build/temp.linux-aarch64-3.6/src/cpp
creating build/temp.linux-aarch64-3.6/src/wrapper
creating build/temp.linux-aarch64-3.6/bpl-subset
creating build/temp.linux-aarch64-3.6/bpl-subset/bpl_subset
creating build/temp.linux-aarch64-3.6/bpl-subset/bpl_subset/libs
creating build/temp.linux-aarch64-3.6/bpl-subset/bpl_subset/libs/python
creating build/temp.linux-aarch64-3.6/bpl-subset/bpl_subset/libs/python/src
creating build/temp.linux-aarch64-3.6/bpl-subset/bpl_subset/libs/python/src/object
creating build/temp.linux-aarch64-3.6/bpl-subset/bpl_subset/libs/python/src/converter
creating build/temp.linux-aarch64-3.6/bpl-subset/bpl_subset/libs/system
creating build/temp.linux-aarch64-3.6/bpl-subset/bpl_subset/libs/system/src
creating build/temp.linux-aarch64-3.6/bpl-subset/bpl_subset/libs/smart_ptr
creating build/temp.linux-aarch64-3.6/bpl-subset/bpl_subset/libs/smart_ptr/src
creating build/temp.linux-aarch64-3.6/bpl-subset/bpl_subset/libs/thread
creating build/temp.linux-aarch64-3.6/bpl-subset/bpl_subset/libs/thread/src
creating build/temp.linux-aarch64-3.6/bpl-subset/bpl_subset/libs/thread/src/pthread
aarch64-linux-gnu-gcc -pthread -fwrapv -Wall -O3 -DNDEBUG -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DBOOST_ALL_NO_LIB=1 -DBOOST_THREAD_BUILD_DLL=1 -DBOOST_MULTI_INDEX_DISABLE_SERIALIZATION=1 -DBOOST_PYTHON_SOURCE=1 -Dboost=pycudaboost -DBOOST_THREAD_DONT_USE_CHRONO=1 -DPYGPU_PACKAGE=pycuda -DPYGPU_PYCUDA=1 -DHAVE_CURAND=1 -Isrc/cpp -Ibpl-subset/bpl_subset -I/tmp/pip-build-tuyn729n/pycuda/.eggs/numpy-1.16.4-py3.6-linux-aarch64.egg/numpy/core/include -I/usr/include/python3.6m -c src/cpp/cuda.cpp -o build/temp.linux-aarch64-3.6/src/cpp/cuda.o
In file included from src/cpp/cuda.cpp:1:0:
src/cpp/cuda.hpp:14:10: fatal error: cuda.h: No such file or directory
#include <cuda.h>
^~~~~~~~
compilation terminated.
error: command ‘aarch64-linux-gnu-gcc’ failed with exit status 1


Failed building wheel for pycuda
Running setup.py clean for pycuda
Running setup.py bdist_wheel for mako … done
Stored in directory: /home/keith/.cache/pip/wheels/82/59/50/e9c6d83cf76c5f09e2f3eb976e38d4d5578ed37585e960a150
Running setup.py bdist_wheel for MarkupSafe … done
Stored in directory: /home/keith/.cache/pip/wheels/f2/aa/04/0edf07a1b8a5f5f1aed7580fffb69ce8972edc16a505916a77
Successfully built numpy mako MarkupSafe
Failed to build pycuda
Installing collected packages: numpy, appdirs, decorator, MarkupSafe, mako, setuptools, six, atomicwrites, more-itertools, zipp, importlib-metadata, pluggy, wcwidth, py, attrs, pytest, pytools, pycuda

Hi,

We did try it before replying. It works well, and we have also released a sample for it:

Have you modified the LIBRARY_PATH variable?
It looks like the compiler cannot find the CUDA path correctly.
Could you try exporting it?

export LIBRARY_PATH=$LIBRARY_PATH:/usr/local/cuda/lib64
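If the build still stops at "fatal error: cuda.h: No such file or directory", the compiler may also need the CUDA header and binary directories on its search paths; a hedged addition, assuming the default JetPack install location under /usr/local/cuda:

```shell
# Assumed JetPack layout; adjust if CUDA is installed elsewhere
export PATH=$PATH:/usr/local/cuda/bin
export CPATH=$CPATH:/usr/local/cuda/include
```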

If it still does not work, would you mind reflashing your system and running the following:

$ sudo apt-get install python3-pip
$ pip3 install pycuda --user

Thanks.