Hi,
We are working with an Orin NX on JetPack 5.1.1 with a custom carrier board, streaming from two 3840x2160@30fps sensors. We have two setups: one uses GMSL2 and the other connects the sensors directly over MIPI. On both setups, roughly 1 in 15 times when streaming with nvarguscamerasrc after power-up, the following error occurs:
Setting pipeline to PAUSED ...
Opening in BLOCKING MODE
Pipeline is live and does not need PREROLL ...
Setting pipeline to PLAYING ...
New clock: GstSystemClock
Redistribute latency...
NvMMLiteOpen : Block : BlockType = 4
===== NVMEDIA: NVENC =====
NvMMLiteBlockCreate : Block : BlockType = 4
Error generated. gstnvarguscamerasrc.cpp, threadExecute:694 NvBufSurfaceFromFd Failed.
Error generated. gstnvarguscamerasrc.cpp, threadFunction:247 (propagating)
GST_ARGUS: Creating output stream
CONSUMER: Waiting until producer is connected...
GST_ARGUS: Available Sensor modes :
GST_ARGUS: 3840 x 2160 FR = 29.999999 fps Duration = 33333334 ; Analog Gain range min 1.000000, max 1.375000; Exposure Range min 59000, max 32003000;
GST_ARGUS: Running with following settings:
Camera index = 1
Camera mode = 0
Output Stream W = 3840 H = 2160
seconds to Run = 0
Frame Rate = 29.999999
GST_ARGUS: Setup Complete, Starting captures for 0 seconds
GST_ARGUS: Starting repeat capture requests.
CONSUMER: Producer has connected; continuing.
nvbuf_utils: dmabuf_fd -1 mapped entry NOT found
Got EOS from element "pipeline0".
Execution ended after 0:00:14.032537856
Setting pipeline to NULL ...
We have seen this issue with different GStreamer pipelines and with an app that uses libargus directly. We also don’t suspect the driver or the sensor, because v4l2 doesn’t show the same failure to stream after boot.
Any suggestions on what might be going on?
Does the problem only show up when two sensors are opened simultaneously?
Thanks
Hi ShaneCCC,
No, we have seen this issue even when streaming from just one sensor.
Did you try boosting the clocks?
Also, replace the control functions such as exposure/frame rate with dummy functions to confirm whether those control functions cause the problem.
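For readers unsure what "boosting the clocks" involves, here is a minimal sketch. It assumes the BPMP debugfs layout that NVIDIA's camera debugging notes describe (clocks named vi, isp, nvcsi and emc under /sys/kernel/debug/bpmp/debug/clk); those names and paths are assumptions that may differ per release, and the function name boost_camera_clocks is hypothetical. It must run as root on the target.

```shell
# boost_camera_clocks [CLK_ROOT]
# Hypothetical helper: for each camera-related clock, lock the rate and set it
# to the advertised maximum. CLK_ROOT defaults to the BPMP debugfs clock tree
# (assumed path; verify it exists on your release before relying on it).
boost_camera_clocks() {
    local root="${1:-/sys/kernel/debug/bpmp/debug/clk}"
    local clk
    for clk in vi isp nvcsi emc; do
        # Skip clocks that are not exposed on this platform.
        [ -d "$root/$clk" ] || { echo "skip $clk: not found under $root" >&2; continue; }
        # Prevent the rate from being changed again behind our back.
        echo 1 > "$root/$clk/mrq_rate_locked"
        # Copy the maximum rate into the current rate.
        cat "$root/$clk/max_rate" > "$root/$clk/rate"
    done
}
```

On the board this would be called as boost_camera_clocks with no argument from a root shell. If a failure disappears with the clocks pinned, that points toward a clock or bandwidth limit rather than the Argus stack itself.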
Hi @ShaneCCC,
I have been working with @carolina.trejos on this issue. We are seeing the following error constantly:
Error generated. gstnvarguscamerasrc.cpp, threadExecute:694 NvBufSurfaceFromFd Failed.
I started looking into gstnvarguscamerasrc.cpp to review what causes this issue and noticed there are two conditions that trigger the error (a NULL NvBufSurface pointer, or the function returning -1).
To track the issue down further, I added more debug messages and now see that the error is triggered because the NvBufSurfaceFromFd function returns -1.
What could cause this?
Modify the control function like exposure/frame rate
I applied this change for future testing
Did you boost the clocks to try?
Boosting clocks still showed the previously sent error.
Please also test with argus_camera to help narrow down the problem.
Thanks
@ShaneCCC I will test that as well.
We took a deeper look into nvarguscamerasrc and noticed that the error is being triggered by the waitForEvents method:
event_status = iEventProvider_ptr->waitForEvents(src->queue.get(), WAIT_FOR_EVENT_TIMEOUT);
g_mutex_unlock(&src->queue_lock);
if (iEventQueue_ptr->getSize() == 0)
{
  g_mutex_lock (&src->argus_buffers_queue_lock);
  src->stop_requested = TRUE;
  g_mutex_unlock (&src->argus_buffers_queue_lock);
  break;
}
This was returning an empty queue (size 0). I added a loop here to retry the wait, and the retry succeeds, but the stream still fails afterwards. I also tried restarting nvargus-daemon after every software reboot in case the daemon is at fault. The error I see after this is:
Setting pipeline to PAUSED ...
Opening in BLOCKING MODE
Pipeline is live and does not need PREROLL ...
Setting pipeline to PLAYING ...
New clock: GstSystemClock
Redistribute latency...
NvMMLiteOpen : Block : BlockType = 4
===== NVMEDIA: NVENC =====
NvMMLiteBlockCreate : Block : BlockType = 4
Error generated. gstnvarguscamerasrc.cpp, threadExecute:731 NvBufSurfaceFromFd Failed. Returned 0
Error generated. gstnvarguscamerasrc.cpp, threadFunction:247 (propagating)
ERROR: from element /GstPipeline:pipeline0/GstNvArgusCameraSrc:nvarguscamerasrc0: CANCELLED
Additional debug info:
Argus Error Status
GST_ARGUS: Creating output stream
CONSUMER: Waiting until producer is connected...
GST_ARGUS: Available Sensor modes :
GST_ARGUS: 3840 x 2160 FR = 29.999999 fps Duration = 33333334 ; Analog Gain range min 1.000000, max 1.375000; Exposure Range min 59000, max 32003000;
GST_ARGUS: Running with following settings:
Camera index = 0
Camera mode = 0
Output Stream W = 3840 H = 2160
seconds to Run = 0
Frame Rate = 29.999999
GST_ARGUS: Setup Complete, Starting captures for 0 seconds
GST_ARGUS: Starting repeat capture requests.
CONSUMER: Producer has connected; continuing.
nvbuf_utils: dmabuf_fd -1 mapped entry NOT found
EOS on shutdown enabled -- waiting for EOS after Error
Which is triggered here:
if (iEvent->getEventType() == EVENT_TYPE_ERROR)
{
  if (src->stop_requested == TRUE || src->timeout_complete == TRUE)
    break;
  src->argus_in_error = TRUE;
  const IEventError* iEventError = interface_cast<const IEventError>(event);
  Argus::Status argusStatus = iEventError->getStatus();
  error = g_error_new_literal (domain, argusStatus, getStatusString(argusStatus));
  GstMessage *message = gst_message_new_error (GST_OBJECT(src), error, "Argus Error Status");
  gst_element_post_message (GST_ELEMENT_CAST(src), message);
  g_mutex_lock (&src->argus_buffers_queue_lock);
  src->stop_requested = TRUE;
  g_mutex_unlock (&src->argus_buffers_queue_lock);
  break;
}
This error has no further description other than Argus Error Status. Since this is not a helpful description, we can’t tell how to resolve it. What can cause this error to be triggered? Are there any other recommended steps based on this new information?
This is the retry cycle I added to avoid waitForEvents returning an empty queue:
// event_retry is assumed to be declared and set to 0 before this loop.
while (event_retry < 5) {
  event_status = iEventProvider_ptr->waitForEvents(src->queue.get(), WAIT_FOR_EVENT_TIMEOUT);
  if (event_status == STATUS_OK) {
    GST_ARGUS_PRINT("Event status ok\n");
  } else {
    GST_ARGUS_PRINT("Event status NOT ok\n");
  }
  if (iEventQueue_ptr->getSize() == 0) {
    // Empty queue: count the attempt and wait again.
    event_retry++;
    GST_ARGUS_PRINT("Waiting for events\n");
  } else {
    // At least one event arrived; stop retrying.
    break;
  }
}
It’s a bit crude, but the retry is working. The Argus status is seen to be CANCELLED. Is there a way to enable capture once Argus is in this CANCELLED state?
Try terminating the app and starting a new one if you get the CANCELLED state.
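The terminate-and-restart suggestion could be scripted roughly as below. This is only a sketch: the function names are hypothetical, the failure signatures are the two strings that appear in the logs in this thread, and it assumes a short probe run (bounded by the coreutils timeout command) is acceptable before the real stream starts.

```shell
# Failure signatures copied from the logs in this thread.
ARGUS_FAIL_PATTERN='NvBufSurfaceFromFd Failed|CANCELLED'

# log_has_argus_failure FILE
# Succeeds (exit 0) if FILE contains one of the failure signatures.
log_has_argus_failure() {
    grep -Eq "$ARGUS_FAIL_PATTERN" "$1"
}

# run_with_restart MAX_TRIES SECONDS CMD [ARGS...]
# Hypothetical helper: runs CMD for at most SECONDS per attempt, capturing its
# output. If the output shows a failure signature, the process is gone by then
# and a fresh one is started, up to MAX_TRIES attempts.
run_with_restart() {
    local max_tries="$1" seconds="$2"
    shift 2
    local log try=1
    log=$(mktemp)
    while [ "$try" -le "$max_tries" ]; do
        # timeout exits non-zero when it has to stop CMD; that is expected here.
        timeout "$seconds" "$@" > "$log" 2>&1 || true
        if ! log_has_argus_failure "$log"; then
            rm -f "$log"
            echo "attempt $try: no Argus failure signature seen"
            return 0
        fi
        echo "attempt $try: Argus failure signature seen, starting a new process" >&2
        try=$((try + 1))
    done
    rm -f "$log"
    return 1
}
```

A call might look like: run_with_restart 3 20 gst-launch-1.0 nvarguscamerasrc sensor-id=0 ! fakesink. Note that this only checks the probe run; it does not keep the stream alive afterwards, and it does not restart nvargus-daemon between attempts.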
To reproduce this issue all we do is run a GStreamer pipeline right after booting up the board. We have seen this happen even with the simplest possible pipeline.
gst-launch-1.0 nvarguscamerasrc sensor-id=0 ! fpsdisplaysink text-overlay=0 video-sink=fakesink sync=0 -v
Do you run gst-launch-1.0 manually or from a script?
Both. We have seen it fail when running it manually, and we also created a service that runs a script with the pipeline after the argus daemon service.
Could you share the service and script here?
Also, could you try starting nvargus-daemon with the infinite timeout enabled?
sudo service nvargus-daemon stop
sudo enableCamInfiniteTimeout=1 nvargus-daemon
Thanks
@carolina.trejos
We can’t reproduce the problem with an imx219 on Orin NX.
Below are the steps we used to try to reproduce it.
- Reboot the device
- As soon as the device has booted, run the command below:
$ gst-launch-1.0 nvarguscamerasrc sensor-id=0 ! fpsdisplaysink text-overlay=0 video-sink=fakesink sync=0 -v
→ Repeated steps 1-2 30 times, no issue.
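Since the reported failure rate is only about 1 in 15, a reboot test of this kind is easier to compare across setups if each boot saves its pipeline output to a file and the failures are counted afterwards. A minimal sketch, assuming one *.log file per boot in a directory of your choosing (the function name and the file layout are hypothetical; the signatures are the ones from the logs earlier in the thread):

```shell
# count_failed_logs DIR
# Hypothetical helper: prints "<failed> <total>" for the *.log files in DIR,
# where a log counts as failed if it contains one of the failure signatures
# seen in this thread.
count_failed_logs() {
    local dir="$1" failed=0 total=0 f
    for f in "$dir"/*.log; do
        # With no matching files the glob stays literal; skip it.
        [ -e "$f" ] || continue
        total=$((total + 1))
        if grep -Eq 'NvBufSurfaceFromFd Failed|CANCELLED' "$f"; then
            failed=$((failed + 1))
        fi
    done
    echo "$failed $total"
}
```

With a 1-in-15 rate, 30 clean boots are not conclusive on their own, so collecting more runs on both the failing and the reference setup gives a fairer comparison.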
What resolution and framerate are you using with the imx219?
1080p@30fps
Thanks
nvidia@tegra-ubuntu:~$ gst-launch-1.0 nvarguscamerasrc sensor-id=0 ! fpsdisplaysink text-overlay=0 video-sink=fakesink sync=0 -v
Setting pipeline to PAUSED ...
Pipeline is live and does not need PREROLL ...
/GstPipeline:pipeline0/GstFPSDisplaySink:fpsdisplaysink0/GstFakeSink:fakesink0: sync = false
Setting pipeline to PLAYING ...
New clock: GstSystemClock
/GstPipeline:pipeline0/GstNvArgusCameraSrc:nvarguscamerasrc0.GstPad:src: caps = video/x-raw(memory:NVMM), width=(int)1920, height=(int)1080, format=(string)NV12, framerate=(fraction)30/1
/GstPipeline:pipeline0/GstFPSDisplaySink:fpsdisplaysink0.GstGhostPad:sink.GstProxyPad:proxypad0: caps = video/x-raw(memory:NVMM), width=(int)1920, height=(int)1080, format=(string)NV12, framerate=(fraction)30/1
/GstPipeline:pipeline0/GstFPSDisplaySink:fpsdisplaysink0/GstFakeSink:fakesink0.GstPad:sink: caps = video/x-raw(memory:NVMM), width=(int)1920, height=(int)1080, format=(string)NV12, framerate=(fraction)30/1
/GstPipeline:pipeline0/GstFPSDisplaySink:fpsdisplaysink0.GstGhostPad:sink: caps = video/x-raw(memory:NVMM), width=(int)1920, height=(int)1080, format=(string)NV12, framerate=(fraction)30/1
GST_ARGUS: Creating output stream
CONSUMER: Waiting until producer is connected...
GST_ARGUS: Available Sensor modes :
GST_ARGUS: 3280 x 2464 FR = 21.000000 fps Duration = 47619048 ; Analog Gain range min 1.000000, max 10.625000; Exposure Range min 13000, max 683709000;
GST_ARGUS: 3280 x 1848 FR = 28.000001 fps Duration = 35714284 ; Analog Gain range min 1.000000, max 10.625000; Exposure Range min 13000, max 683709000;
GST_ARGUS: 1920 x 1080 FR = 29.999999 fps Duration = 33333334 ; Analog Gain range min 1.000000, max 10.625000; Exposure Range min 13000, max 683709000;
GST_ARGUS: 1640 x 1232 FR = 29.999999 fps Duration = 33333334 ; Analog Gain range min 1.000000, max 10.625000; Exposure Range min 13000, max 683709000;
GST_ARGUS: 1280 x 720 FR = 59.999999 fps Duration = 16666667 ; Analog Gain range min 1.000000, max 10.625000; Exposure Range min 13000, max 683709000;
GST_ARGUS: Running with following settings:
Camera index = 0
Camera mode = 2
Output Stream W = 1920 H = 1080
seconds to Run = 0
Frame Rate = 29.999999
GST_ARGUS: Setup Complete, Starting captures for 0 seconds
Is there any way you could test with a higher resolution? We are using 3840x2160; could that have an effect?
Also, we have tested with enableCamInfiniteTimeout enabled, and the issue still happens.
@carolina.trejos
We are still unable to reproduce the problem with the 3280x2464@21 sensor mode.
Do you have any other camera module you could test with to help narrow this down?
Unfortunately we don’t have another module, but we are testing JP 5.1.2 to see if the issue persists.