Hi,
I’m working on a TX2 and building an application using a ZED camera and the YOLO port for TensorRT (https://github.com/TLESORT/YOLO-TensorRT-GIE-), but I am getting lots of CUDA errors at runtime.
I got the following errors:
ZED (Init) >> Video mode: HD720@30
[TensorRtEngine] loading yolo_small.prototxt yolo_small.caffemodel
in sl::ERROR_CODE sl::Mat::setFrom(const sl::Mat&, sl::COPY_TYPE) : cuda error [4]: unspecified launch failure.
in sl::ERROR_CODE sl::Mat::updateGPUfromCPU() : cuda error [4]: unspecified launch failure.
cudnnFullyConnectedLayer.cpp (282) - Cuda Error in rowMajorMultiply: 13
[TensorRtEngine] Cuda failure: 4
Sometimes I also encounter the error “cudnnConvolutionLayer.cpp (213) - Cuda Error in execute: 7”
The problem seems to occur when the YOLO computation runs in one thread while the frame grabbing runs in a different thread.
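One way to check this hypothesis would be to serialize all GPU work behind a single lock and see whether the errors disappear. The sketch below is only an illustration in standard C++: `gpu_mutex` and `max_concurrent_gpu_sections` are hypothetical names, and the comment marks where the real ZED and TensorRT calls would go. It counts how many threads are ever inside the guarded section at the same time.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical: a single lock that guards every piece of GPU work in the app.
std::mutex gpu_mutex;

// Starts a "grabber" and a "detector" thread that each enter the guarded
// section `iters` times. Returns the largest number of threads that were
// ever inside the section simultaneously.
int max_concurrent_gpu_sections(int iters) {
    assert(iters > 0);
    int inside = 0;    // threads currently inside the section (protected by gpu_mutex)
    int max_seen = 0;  // highest value of `inside` observed (protected by gpu_mutex)

    auto gpu_section = [&] {
        std::lock_guard<std::mutex> lock(gpu_mutex);
        ++inside;
        if (inside > max_seen) max_seen = inside;
        // Real code would do its GPU work here, e.g. the ZED grab/retrieve
        // calls in one thread and the TensorRT execute call in the other.
        --inside;
    };

    std::thread grabber([&] { for (int i = 0; i < iters; ++i) gpu_section(); });
    std::thread detector([&] { for (int i = 0; i < iters; ++i) gpu_section(); });
    grabber.join();
    detector.join();
    return max_seen;
}
```

If the CUDA errors stop once both threads take the same lock, that would suggest the two threads are interfering with each other on the GPU rather than a problem in the network itself.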
Hi, I hope to test JetPack 3.2 in a few days…
In the meantime, I’m having trouble configuring Eclipse on a TX2 to build (and to get autocompletion working) with the PCL library, OpenCV 3 and the ZED SDK… Is there anyone who develops on the TX2? If so, which IDE do you use?
Hi,
I have installed JetPack 3.2. I’m having problems with the ZED SDK: the latest version should support CUDA 9, but the installer says that the SDK is compatible with CUDA 8 only.
I installed it anyway, but I can’t build any of the SDK samples because of the CUDA 8 requirement.
I have now installed ZED SDK 2.3.1 (beta), as Stereolabs suggests: it works with JetPack 3.2.
However, I’m still getting CUDA errors at runtime. With this configuration I get the following error:
cudnnActivationLayer.cpp (93) - Cuda Error in execute: 8
Hi,
The output of the command you suggested is the following:
nvidia@tegra-ubuntu:~/tensorrt/bin$ ./giexec --deploy=/home/nvidia/Desktop/tensorrt-test/yolo_small_modified.prototxt --output=result
deploy: /home/nvidia/Desktop/tensorrt-test/yolo_small_modified.prototxt
output: result
Input "data": 3x448x448
Output "result": 1470x1x1
name=data, bindingIndex=0, buffers.size()=2
name=result, bindingIndex=1, buffers.size()=2
Average over 10 runs is 97.824 ms.
Average over 10 runs is 98.756 ms.
Average over 10 runs is 98.4284 ms.
Average over 10 runs is 97.9406 ms.
Average over 10 runs is 98.6445 ms.
Average over 10 runs is 98.3895 ms.
Average over 10 runs is 97.8623 ms.
Average over 10 runs is 98.3947 ms.
Average over 10 runs is 98.4803 ms.
Average over 10 runs is 97.7743 ms.
Is it OK?
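To put the numbers above in perspective, they can be reduced to a mean latency and an approximate throughput. The sketch below hard-codes the ten averages from the giexec log; `kRunsMs`, `mean_latency_ms` and `throughput_fps` are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// The ten "Average over 10 runs" values (in ms) copied from the giexec log.
const std::vector<double> kRunsMs = {
    97.824, 98.756, 98.4284, 97.9406, 98.6445,
    98.3895, 97.8623, 98.3947, 98.4803, 97.7743};

// Arithmetic mean of the reported latencies, in milliseconds.
double mean_latency_ms(const std::vector<double>& runs) {
    assert(!runs.empty());
    return std::accumulate(runs.begin(), runs.end(), 0.0) / runs.size();
}

// Inferences per second that a given per-inference latency allows.
double throughput_fps(double latency_ms) {
    assert(latency_ms > 0.0);
    return 1000.0 / latency_ms;
}
```

Calling `throughput_fps(mean_latency_ms(kRunsMs))` gives roughly 10 inferences per second (about 98 ms each), which is below the 30 fps of the HD720 video mode.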
I sent an email to Stereolabs support explaining what is happening, and this is their answer:
The issue probably comes from having two different CUDA context in a single application.
Your solution is to initialize the ZED with the CUDA context from YOLO or initiate YOLO with the CUDA context created by the ZED.
You can set the ZED SDK context using InitParameters.sdk_cuda_ctx, or let it create one and get it with getCUDAContext().
I’m trying to do that, but the context used by the ZED SDK is a CUcontext (which is a struct CUctx_st*). In the TensorRT code, the only ‘context’ I use is the IExecutionContext.
Is it possible to convert an IExecutionContext to a CUcontext?
Is there a way to try what Stereolabs support suggests?
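As far as I understand, an IExecutionContext is a TensorRT object and not a CUDA driver context, so there is nothing to convert; what matters is which CUcontext is current in the thread that runs TensorRT. The CUDA driver keeps the current context per CPU thread, so a new thread does not inherit a context that was made current elsewhere. The toy model below uses no CUDA at all: `FakeCtx`, `set_current` and `context_seen_by_worker` are hypothetical stand-ins (with `set_current` playing the role of `cuCtxSetCurrent`), meant only to show why the inference thread has to adopt the shared context explicitly.

```cpp
#include <cassert>
#include <thread>

// Hypothetical stand-in for CUcontext, which is an opaque pointer type.
struct FakeCtx { int id; };
using Ctx = FakeCtx*;

// Models the driver's per-thread "current context" slot.
thread_local Ctx current_ctx = nullptr;

// Stand-in for cuCtxSetCurrent(ctx): affects the calling thread only.
void set_current(Ctx ctx) { current_ctx = ctx; }

// Runs a worker thread and returns the context it ends up using.
// `shared` models the context created by the camera SDK; `fallback` models
// the context the worker would get if it never sets one explicitly.
Ctx context_seen_by_worker(Ctx shared, Ctx fallback, bool adopt_shared) {
    assert(shared != nullptr && fallback != nullptr);
    Ctx seen = nullptr;
    std::thread worker([&] {
        if (adopt_shared) set_current(shared);  // explicit opt-in per thread
        seen = current_ctx ? current_ctx : fallback;
    });
    worker.join();
    return seen;
}
```

In the real application the equivalent step would presumably be to call `cuCtxSetCurrent` with the value returned by `getCUDAContext()` at the start of the inference thread, before the TensorRT engine is created. That is an untested assumption, not a confirmed fix.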
Hi AastaLLL,
I think I did not understand…
I can share sample code that uses the ZED SDK and YOLO with TensorRT (but I need to remove some parts first).
Is that OK, or do you need something else?
Hi,
I don’t know how to reproduce this error without a ZED camera.
Maybe you could create two threads, one running some other CUDA workload and the other running a TensorRT inference.
Thanks