We have run into issues running multi-camera streaming for long durations with Argus on our custom hardware. Nvidia recommended that we try disabling the multi-process functionality (i.e. removing the argus-daemon from the equation).
Instead of running the sample applications (we get SCF capture errors when we run with --module="Multi Session"), we built and ran our own application with DISABLE_MULTIPROCESS=ON. We can stream video, but we get constant CUDA errors from our DeepStream NN implementation. We noticed that with DISABLE_MULTIPROCESS=ON we link against libargus.so, while with DISABLE_MULTIPROCESS=OFF the program links against libargus_socketclient.so (per the Nvidia-provided FindArgus.cmake). libargus_socketclient.so does not depend on CUDA at all; libargus.so, however, does.
Do we know what the libargus.so version is doing with CUDA? Could it be stepping on the toes of the CUDA implementation we are using within our neural network?
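As a sanity check, something like the following can confirm at runtime which Argus/CUDA libraries actually end up mapped into the process for each build (this is just a generic glibc diagnostic of our own; dumpArgusCudaLibraries() is a made-up helper name, not an Argus API, and running ldd on the binary gives roughly the same information statically):

```cpp
#include <link.h>
#include <cstdio>
#include <cstring>

// Callback invoked once per loaded shared object; print anything that looks
// like an Argus or CUDA library so we can see what the process really loaded.
static int dump_object(struct dl_phdr_info *info, size_t /*size*/, void * /*data*/)
{
    const char *name = info->dlpi_name;
    if (name && *name &&
        (std::strstr(name, "argus") || std::strstr(name, "cuda")))
    {
        std::printf("loaded: %s\n", name);
    }
    return 0; // keep iterating over the remaining shared objects
}

// Call this from the application (e.g. right after CameraProvider creation)
// to list the Argus/CUDA libraries mapped into the process at that point.
void dumpArgusCudaLibraries()
{
    dl_iterate_phdr(dump_object, nullptr);
}
```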
Here are the CUDA errors we receive every time we call doInference():
error | kERROR: CUDA cask failure at execution for trt_maxwell_scudnn_128x32_relu_medium_nn_v1.
000771 | 20:51:42.897 | 22317 | error | kERROR: caskConvolutionLayer.cpp (235) - Cuda Error in execute: 33
000772 | 20:51:42.897 | 22317 | error | kERROR: caskConvolutionLayer.cpp (235) - Cuda Error in execute: 33
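For reference, error 33 can be decoded with cudaGetErrorString(); if we are reading the cudaError enum for this CUDA release correctly it is cudaErrorInvalidResourceHandle, which is why we suspect a context/handle problem rather than a kernel bug. Below is a rough sketch of the instrumentation one could wrap around the call to narrow this down (doInference() is our own TensorRT wrapper; checkedInference() and reportCudaState() are just sketch names, not NVIDIA APIs). It logs the current CUDA context and device before and after inference and forces a synchronize so the failure is attributed to the right call:

```cpp
#include <cuda.h>
#include <cuda_runtime_api.h>
#include <cstdio>

bool doInference(); // our existing TensorRT execution wrapper, declared elsewhere

// Report the driver-API context, runtime-API device, and any sticky error.
static void reportCudaState(const char *tag)
{
    CUcontext ctx = nullptr;
    cuCtxGetCurrent(&ctx);                 // driver-API view of the current context

    int device = -1;
    cudaGetDevice(&device);                // runtime-API view of the current device

    cudaError_t err = cudaGetLastError();  // clears and reports any pending error
    std::printf("[%s] ctx=%p device=%d lastError=%d (%s)\n",
                tag, (void *)ctx, device, (int)err, cudaGetErrorString(err));
}

bool checkedInference()
{
    reportCudaState("before doInference");

    bool ok = doInference();

    // Force pending kernels to finish so the failure is attributed here,
    // not to some later, unrelated CUDA call.
    cudaError_t sync = cudaDeviceSynchronize();
    std::printf("sync returned %d (%s)\n", (int)sync, cudaGetErrorString(sync));

    reportCudaState("after doInference");
    return ok && sync == cudaSuccess;
}
```

If the context or device reported before inference differs from what the inference thread originally created, that would point to the in-process libargus.so switching the CUDA context underneath TensorRT.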
FYI,
The multi-process implementation is now used by default.
This configuration moves the client and server code into separate processes, which communicate through a socket-based library.
Could you please narrow down the issue per comment #2?
May I also know how many camera sensors you're working with for your multi-camera streaming? Thanks.
ShaneCC,
Yes, no problems there. In fact, if we build our code without enabling DISABLE_MULTIPROCESS=ON, it runs just fine with no changes to the actual code whatsoever. The CUDA errors only happen when we build with DISABLE_MULTIPROCESS=ON.
We are on 28.2.1. We know the multi-process implementation runs by default; that is why we are building with DISABLE_MULTIPROCESS=ON, to try running in single-process mode without the socket connection.
We are running with 6 cameras. I have a few other outstanding issues on this forum regarding long-running stability with 6 cameras (Max Frames Acquired errors and AutoControlSync errors that end up crashing the CameraProvider connection to the daemon). Nvidia has recommended we try running in single-process mode while waiting for a "fix" in 32.2. However, we are still wary that the issues will persist in 32.2.