[32.3.1] gstreamer pipeline error and cdma_handle_timeout with OV5693

Hi, I’m trying the following script with OV5693.
gst-launch-1.0 -vvvv nvarguscamerasrc \
 ! 'video/x-raw(memory:NVMM),width=2592,height=1944' \
 ! queue max-size-buffers=1 leaky=2 name=q1 \
 ! tee name=t \
 t. ! queue name=q3 \
 ! nvvidconv name=testLoc flip-method=2 left=1692 right=2172 top=900 bottom=1280 \
 ! 'video/x-raw(memory:NVMM),width=480,height=380' \
 ! nvoverlaysink name=displaySink async=false sync=false overlay-x=0 overlay-y=361 overlay-w=480 overlay-h=380 \
 t. ! queue name=q4 \
 ! valve drop=true \
 ! nvvidconv \
 ! clockoverlay time-format=\"%D %H:%M:%S\" halignment=right \
 ! nvjpegenc ! multifilesink name=testPicture async=false location=/tmp/pic1.jpg \
 t. ! queue name=q5 \
 ! nvvidconv name=b ! 'video/x-raw(memory:NVMM),width=(int)1920,height=(int)1080' \
 ! nvivafilter cuda-process=true customer-lib-name="libnvsample_cudaprocess.so" \
 ! 'video/x-raw(memory:NVMM),width=(int)1920,height=(int)1080, format=NV12' \
 ! nvv4l2h264enc maxperf-enable=1 bitrate=8000000 \
 ! 'video/x-h264, stream-format=(string)byte-stream' \
 ! h264parse \
 ! valve drop=true \
 ! splitmuxsink name=testVideoSink async-handling=true location=/tmp/tmpFile

Here’s how I created the issue:

  1. sudo systemctl stop nvargus-daemon
  2. sudo rm -f /tmp/argus_socket
  3. run nvargus-daemon as a foreground process to monitor outputs on the same terminal
  4. run the gstreamer script above
  5. stop the script with ctrl-c
  6. when the script hangs on exit because of a pipeline clean-up error, stop nvargus-daemon with ctrl-c.

Steps 3-6 may need to be repeated, since the gstreamer pipeline clean-up in step 5 sometimes succeeds.
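
Since the hang does not show up on every attempt, steps 4 and 5 could be looped with something like the sketch below. This is only an illustration: `classify_exit` and `repro_loop` are made-up names, the run and grace times are arbitrary, and it assumes the gst-launch-1.0 command is saved in a script file that starts it with `exec`, so the signal from `timeout` reaches gst-launch-1.0 itself. Note that it force-kills a hung pipeline after the grace period, which the manual steps above do not do.

```shell
#!/bin/sh
# Hypothetical helpers; only timeout(1) and sh are real tools here.

# Map the exit status of `timeout -s INT -k GRACE RUN cmd` to a label.
classify_exit() {
    case "$1" in
        124) echo "stopped" ;;   # command exited after SIGINT (the ctrl-c of step 5)
        137) echo "hung" ;;      # SIGINT was not enough, SIGKILL had to be sent
        *)   echo "other:$1" ;;  # command ended on its own, e.g. failed to start
    esac
}

# repro_loop COUNT SCRIPT [RUN_SECS] [GRACE_SECS]
# Runs SCRIPT up to COUNT times, sending SIGINT after RUN_SECS each time,
# and stops at the first run that does not shut down within GRACE_SECS.
repro_loop() {
    count=$1; script=$2; run=${3:-20}; grace=${4:-30}
    i=1
    while [ "$i" -le "$count" ]; do
        timeout -s INT -k "$grace" "$run" sh "$script"
        result=$(classify_exit "$?")
        echo "run $i: $result"
        if [ "$result" = "hung" ]; then
            return 1
        fi
        i=$((i + 1))
    done
    return 0
}
```

With nvargus-daemon already running as in steps 1-3, `repro_loop 5 ./pipeline.sh` would try up to five times and return non-zero on the first hang, at which point dmesg can be checked.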

When there is a gstreamer pipeline error, and nvargus-daemon was stopped with ctrl-c, dmesg shows the following:
`[ +25.639565] host1x 13e10000.host1x: cdma_handle_timeout: timeout: 43 (15700000.vi_2) clientid 3057, HW thresh 2269, done 2270
[ +0.011436] host1x 13e10000.host1x: cdma_handle_timeout: timeout: 42 (15700000.vi_1) clientid 3057, HW thresh 1293, done 1293
[ +0.011424] host1x 13e10000.host1x: cdma_handle_timeout: timeout: 41 (15700000.vi_0) clientid 3057, HW thresh 1295, done 1295
[ +0.011378] ---- mlocks ----

[ +0.000033] ---- syncpts ----
[ +0.000017] id 5 (disp_d) min 6185 max 6186 refs 1 (previous client : )
[ +0.000006] id 6 (disp_e) min 1574 max 1574 refs 1 (previous client : )
[ +0.000004] id 7 (disp_f) min 2 max 2 refs 1 (previous client : )
[ +0.000005] id 8 (vblank1) min 102589 max -4 refs 1 (previous client : )
[ +0.000016] id 19 (15340000.vic_nvargus-daemon_0) min 2 max 2 refs 1 (previous client : 15340000.vic_nvargus-daemon_0)
[ +0.000004] id 20 (gp10b_507) min 332206 max 332206 refs 1 (previous client : )
[ +0.000003] id 21 (gp10b_506) min 22 max 22 refs 1 (previous client : )
[ +0.000005] id 22 (gp10b_505) min 32342 max 32344 refs 1 (previous client : gp10b_505)
[ +0.000004] id 23 (15340000.vic_gst-launch-1.0_0) min 6915 max 6915 refs 1 (previous client : 15340000.vic_gst-launch-1.0_0)
[ +0.000008] id 28 (15340000.vic_gst-launch-1.0_0) min 2198 max 2198 refs 1 (previous client : 15340000.vic_gst-launch-1.0_0)
[ +0.000004] id 29 (15600000.isp_nvargus-daemon_0) min 2630 max 2630 refs 3 (previous client : 15600000.isp_nvargus-daemon_0)
[ +0.000004] id 30 (15600000.isp_nvargus-daemon_1) min 2117 max 2117 refs 3 (previous client : 15600000.isp_nvargus-daemon_1)
[ +0.000005] id 31 (15600000.isp_nvargus-daemon_2) min 3389 max 3389 refs 3 (previous client : 15600000.isp_nvargus-daemon_2)
[ +0.000004] id 32 (15600000.isp_nvargus-daemon_3) min 2259 max 2259 refs 3 (previous client : 15600000.isp_nvargus-daemon_3)
[ +0.000004] id 33 (15600000.isp_nvargus-daemon_4) min 9861 max 9861 refs 4 (previous client : 15600000.isp_nvargus-daemon_4)
[ +0.000004] id 34 (15600000.isp_nvargus-daemon_5) min 2259 max 2259 refs 3 (previous client : 15600000.isp_nvargus-daemon_5)
[ +0.000004] id 35 (gp10b_490) min 7092 max 7092 refs 1 (previous client : gp10b_490)
[ +0.000004] id 36 (gp10b_491) min 1038 max 1038 refs 1 (previous client : gp10b_491)
[ +0.000003] id 37 (gp10b_489) min 15 max 15 refs 1 (previous client : gp10b_489)
[ +0.000004] id 38 (gp10b_492) min 149 max 149 refs 1 (previous client : gp10b_492)
[ +0.000004] id 39 (gp10b_493) min 122 max 122 refs 1 (previous client : gp10b_493)
[ +0.000004] id 40 (15700000.vi_0) min 223 max 223 refs 2 (previous client : 15700000.vi_0)
[ +0.000004] id 41 (15700000.vi_0) min 1295 max 1295 refs 3 (previous client : 15700000.vi_0)
[ +0.000004] id 42 (15700000.vi_1) min 1293 max 1293 refs 3 (previous client : 15700000.vi_1)
[ +0.000003] id 43 (15700000.vi_2) min 2270 max 2270 refs 3 (previous client : 15700000.vi_2)
[ +0.000014] id 53 (gp10b_495) min 54 max 54 refs 1 (previous client : 15340000.vic_argus_thread_0)
[ +0.000004] id 54 (gp10b_496) min 1410 max 1410 refs 1 (previous client : gp10b_490)
[ +0.000004] id 55 (gp10b_498) min 2561 max 2561 refs 1 (previous client : gp10b_491)
[ +0.000004] id 56 (gp10b_494) min 6 max 6 refs 1 (previous client : gp10b_492)
[ +0.000004] id 57 (gp10b_497) min 6 max 6 refs 1 (previous client : gp10b_493)
[ +0.000004] id 58 (15340000.vic_V4L2_EncThread_0) min 18 max 18 refs 1 (previous client : 15340000.vic_argus_thread_0)
[ +0.000004] id 59 (154c0000.nvenc_V4L2_EncThread_0) min 1458 max 1458 refs 1 (previous client : 154c0000.nvenc_V4L2_EncThread_0)
[ +0.000004] id 60 (15340000.vic_V4L2_EncThread_0) min 2003 max 2003 refs 1 (previous client : 15340000.vic_V4L2_EncThread_0)
[ +0.000005] id 61 (15340000.vic_gst-launch-1.0_0) min 2028 max 2028 refs 1 (previous client : 15340000.vic_argus_thread_0)

[ +0.000562] ---- channels ----
[ +0.000016]
channel 2 - 15820000.se

[ +0.000002] NvHost basic channel registers:
[ +0.000004] CMDFIFO_STAT_0: 00002040
[ +0.000003] CMDFIFO_RDATA_0: a0da6240
[ +0.000005] CMDP_OFFSET_0: 00000000
[ +0.000002] CMDP_CLASS_0: 00000000
[ +0.000003] CHANNELSTAT_0: 00000000
[ +0.000002] The CDMA sync queue is empty.

[ +0.000006]
channel 3 - 15830000.se

[ +0.000002] NvHost basic channel registers:
[ +0.000002] CMDFIFO_STAT_0: 00002040
[ +0.000002] CMDFIFO_RDATA_0: 0a0d100a
[ +0.000004] CMDP_OFFSET_0: 00000000
[ +0.000002] CMDP_CLASS_0: 00000000
[ +0.000003] CHANNELSTAT_0: 00000000
[ +0.000001] The CDMA sync queue is empty.

[ +0.000004]
channel 4 - 15840000.se

[ +0.000002] NvHost basic channel registers:
[ +0.000002] CMDFIFO_STAT_0: 00002040
[ +0.000002] CMDFIFO_RDATA_0: 00440806
[ +0.000004] CMDP_OFFSET_0: 00000000
[ +0.000004] CMDP_CLASS_0: 00000000
[ +0.000002] CHANNELSTAT_0: 00000000
[ +0.000002] The CDMA sync queue is empty.

[ +0.000004]
channel 7 - 15600000.isp

[ +0.000001] NvHost basic channel registers:
[ +0.000003] CMDFIFO_STAT_0: 00002040
[ +0.000002] CMDFIFO_RDATA_0: e1000032
[ +0.000004] CMDP_OFFSET_0: 00000100
[ +0.000002] CMDP_CLASS_0: 00000000
[ +0.000003] CHANNELSTAT_0: 00000000
[ +0.000005]
ffffffc190cd1400: JOB, syncpt_id=29, syncpt_val=2630, first_get=00000f80, timeout=10000, num_slots=19
[ +0.000008] GATHER at ffc00000+3c50, 12 words
[ +0.000003] 20000001
[ +0.000002] 0000641d
[ +0.000001] 20000001
[ +0.000002] 0000681e
[ +0.000004] 20000001
[ +0.000004] 00006c1f
[ +0.000002] 20000001
[ +0.000001] 00007c20
[ +0.000002] 20000001
[ +0.000002] 00007422
[ +0.000001] 200c0001
[ +0.000002] 00000005

[ +0.000004] GATHER at ffc00000+3c80, 2 words
[ +0.000002] 20000001
[ +0.000001] 00000021

[ +0.000007]
channel 1 - 15700000.vi

[ +0.000002] NvHost basic channel registers:
[ +0.000002] CMDFIFO_STAT_0: 00002040
[ +0.000002] CMDFIFO_RDATA_0: 00000028
[ +0.000006] CMDP_OFFSET_0: 00000000
[ +0.000002] CMDP_CLASS_0: 00000001
[ +0.000002] CHANNELSTAT_0: 00000000
[ +0.000002] The CDMA sync queue is empty.

[ +0.000004]
channel 5 - 15700000.vi

[ +0.000002] NvHost basic channel registers:
[ +0.000002] CMDFIFO_STAT_0: 00002040
[ +0.000002] CMDFIFO_RDATA_0: 00000029
[ +0.000004] CMDP_OFFSET_0: 00000000
[ +0.000002] CMDP_CLASS_0: 00000001
[ +0.000003] CHANNELSTAT_0: 00000000
[ +0.000005]
ffffffc190cd3400: JOB, syncpt_id=43, syncpt_val=2270, first_get=00000ae0, timeout=10000, num_slots=7
[ +0.000006] GATHER at ffc50000+07ac, 6 words
[ +0.000004] 304e0005
[ +0.000005] 00002683
[ +0.000003] 00000021
[ +0.000002] 304e0005
[ +0.000002] 0000050c
[ +0.000001] 0000002a

[ +0.000004] GATHER at ffc50000+07c4, 4 words
[ +0.000002] 90000002
[ +0.000001] a0004021
[ +0.000002] 00000000
[ +0.000002] 00000000

[ +0.000003] GATHER at ffc50000+07d4, 26 words
[ +0.000001] 90000001
[ +0.000002] a0000401
[ +0.000001] 00000000
[ +0.000003] 90000001
[ +0.000002] a000404d
[ +0.000003] 00000001
[ +0.000003] 90000001
[ +0.000004] a0004020
[ +0.000002] 00000000
[ +0.000003] 90000003
[ +0.000003] a0004042
[ +0.000003] 5a1b9000
[ +0.000005] 00000001
[ +0.000003] 00000001
[ +0.000002] 90000001
[ +0.000003] a0004048
[ +0.000002] 00000000
[ +0.000003] 90000001
[ +0.000002] a000400a
[ +0.000003] 00000000
[ +0.000005] 90000001
[ +0.000001] a0004001
[ +0.000002] 00000001
[ +0.000003] 90000001
[ +0.000001] a0004007
[ +0.000002] 00000003

[ +0.000003] GATHER at ffc50000+083c, 2 words
[ +0.000001] 20000001
[ +0.000002] 00000029

[ +0.000005]
---- host general irq ----

[ +0.000003] sync_intc0mask = 0x00000001
[ +0.000002] sync_intmask = 0x50000003
[ +0.000002]
---- host syncpt irq mask ----

[ +0.000001]
---- host syncpt irq status ----

[ +0.000004] syncpt_thresh_cpu0_int_status(0) = 0x00000000
[ +0.000003] syncpt_thresh_cpu0_int_status(1) = 0x00000000
[ +0.000003] syncpt_thresh_cpu0_int_status(2) = 0x00000000
[ +0.000003] syncpt_thresh_cpu0_int_status(3) = 0x00000000
[ +0.000003] syncpt_thresh_cpu0_int_status(4) = 0x00000000
[ +0.000003] syncpt_thresh_cpu0_int_status(5) = 0x00000000
[ +0.000002] syncpt_thresh_cpu0_int_status(6) = 0x00000000
[ +0.000003] syncpt_thresh_cpu0_int_status(7) = 0x00000000
[ +0.000002] syncpt_thresh_cpu0_int_status(8) = 0x00000000
[ +0.000003] syncpt_thresh_cpu0_int_status(9) = 0x00000000
[ +0.000002] syncpt_thresh_cpu0_int_status(10) = 0x00000000
[ +0.000003] syncpt_thresh_cpu0_int_status(11) = 0x00000000
[ +0.000002] syncpt_thresh_cpu0_int_status(12) = 0x00000000
[ +0.000003] syncpt_thresh_cpu0_int_status(13) = 0x00000000
[ +0.000002] syncpt_thresh_cpu0_int_status(14) = 0x00000000
[ +0.000003] syncpt_thresh_cpu0_int_status(15) = 0x00000000
[ +0.000003] syncpt_thresh_cpu0_int_status(16) = 0x00000000
[ +0.000003] syncpt_thresh_cpu0_int_status(17) = 0x00000000
[ +0.000015] tegra-vi4 15700000.vi: SYNCPT_ID 43
[ +0.000003] tegra-vi4 15700000.vi: SYNCPT_VAL 2270
[ +0.000002] tegra-vi4 15700000.vi: FIRST_GET 0xae0
[ +0.000002] tegra-vi4 15700000.vi: TIMEOUT 10000
[ +0.000003] tegra-vi4 15700000.vi: NUM_SLOTS 7
[ +0.000002] tegra-vi4 15700000.vi: NUM_HANDLES 2
[ +0.011552] cdma_stop: DMAPUT != DMAGET for channel 5, stopping CDMA`

From what I’ve read on the nvidia forum so far, this seems to be a sync-point timeout error. As this is happening with the OV5693 camera (the default camera on the TX2 dev platform), I’m assuming that I would see the same issue with the imx327 camera I have from Leopard Imaging. What can I do to fix the issue?
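
To pull just the timed-out sync points out of a long dump like the one above, something along these lines may help. It is a minimal sketch: `cdma_timeouts` is a made-up helper name, and the pattern assumes the `cdma_handle_timeout` line layout shown in this post.

```shell
# cdma_timeouts: read kernel log text on stdin and print one summary line per
# cdma_handle_timeout entry (hypothetical helper, not part of any NVIDIA tool).
# Assumes the line layout shown in the dmesg output above.
cdma_timeouts() {
    sed -n 's/.*cdma_handle_timeout: timeout: \([0-9][0-9]*\) (\([^)]*\)).*HW thresh \([0-9][0-9]*\), done \([0-9][0-9]*\).*/syncpt=\1 client=\2 thresh=\3 done=\4/p'
}

# Usage on the target: dmesg | cdma_timeouts
```

On the log above this would print three lines, for sync points 43, 42 and 41, all belonging to 15700000.vi.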

Could you check whether you can still reproduce it after removing elements from the pipeline, to narrow down which element causes the problem?

It’s the 3rd queue that caused the issue:
t. ! queue name=q5 \
 ! nvvidconv name=b ! 'video/x-raw(memory:NVMM),width=(int)1920,height=(int)1080' \
 ! nvivafilter cuda-process=true customer-lib-name="libnvsample_cudaprocess.so" \
 ! 'video/x-raw(memory:NVMM),width=(int)1920,height=(int)1080, format=NV12' \
 ! nvv4l2h264enc maxperf-enable=1 bitrate=8000000 \
 ! 'video/x-h264, stream-format=(string)byte-stream' \
 ! h264parse \
 ! valve drop=true \
 ! splitmuxsink name=testVideoSink async-handling=true location=/tmp/tmpFile

Looks similar to this topic:

I fail to reproduce this (running as user nvidia). My AGX Xavier is running R32.4.4:

gst-launch-1.0 -v nvarguscamerasrc ! 'video/x-raw(memory:NVMM),width=2592,height=1944' ! queue max-size-buffers=1 leaky=2 name=q1 ! tee name=t \
 t. ! queue name=q3 ! nvvidconv name=testLoc flip-method=2 left=1692 right=2172 top=900 bottom=1280 ! 'video/x-raw(memory:NVMM),width=480,height=380' ! nvoverlaysink name=displaySink async=false sync=false overlay-x=0 overlay-y=360 overlay-w=480 overlay-h=380 \
 t. ! queue name=q4 ! valve drop=true ! nvvidconv ! clockoverlay time-format=\"%D %H:%M:%S\" halignment=right ! nvjpegenc ! multifilesink name=testPicture async=true location=/home/nvidia/tmp/pic1.jpg \
 t. ! queue name=q5 ! nvvidconv name=b ! 'video/x-raw(memory:NVMM),width=(int)1920,height=(int)1080' ! nvivafilter cuda-process=true customer-lib-name="libnvsample_cudaprocess.so"  ! 'video/x-raw(memory:NVMM),width=(int)1920,height=(int)1080, format=NV12' ! nvv4l2h264enc maxperf-enable=1 bitrate=8000000 ! 'video/x-h264, stream-format=(string)byte-stream' ! h264parse ! valve drop=true ! splitmuxsink name=testVideoSink async-handling=true location=/home/nvidia/tmp/tmpFile.h264

You may try to upgrade. It seems you’re using the OV5693 sensor, which should be supported in newer releases.

You have to remove “/tmp/argus_socket”, as shown in step 2 of the steps to re-create the issue above.

It appears to be the same problem I have been fighting for the last year. We have the problem with a third party camera, but today we got the demo board and the nvidia camera to show the same problem running my script/gstreamer app with three tee’s. The video does not start 100% of the time.

r32.3.1 tx2-4g still waiting for Nvidia to find a fix.

Terry

I did. I also tried stopping the pipeline at various points and restarting as soon as possible, but I got no such error. How many attempts does it take you to trigger this?

@Honey_Patouceul I had to repeat at least 3 or 4 times to cause the pipeline issue. I just noticed that you’re using a Xavier with 32.4.4. I haven’t used 32.4.4 or a Xavier, so I’m not sure whether you could recreate the issue at all on the Xavier platform with 32.4.4.

@ShaneCCC So what debugging do you want us to add to our systems to get you data that will help you fix the OS?

I have been waiting a year, and you now have two people willing to provide you with information.

I need the pipeline with three tee’s to make my app work correctly,

I also can’t set the state to NULL to terminate the pipeline. So my solution is to kill nvargus-daemon, reload the vendor’s camera driver and then kill my gstreamer app. Maybe if we could set the state to NULL, the app would come up correctly the next time.

What more info do you NEED? This is a timing issue in my case: if I restart my gstreamer app slowly, it comes up more often, and I can kill it by restarting it as quickly as possible.

My gstreamer app also seems to terminate without any messages if I am running a make -j10 and the processors (in my case 4 out of 6) are being stressed.

Terry

Hi @ji_joe ,

Are you the camera vendor that terrysu50z mentioned before? If so, would it be possible for you to provide the camera driver for jp4.4 or jp4.5.1 to terrysu50z? Generally, if a bug is found, we want the user to try the latest release first. Our debugging also starts from the latest release.

@terrysu50z
Because your pipeline involves many elements, it is difficult to debug and figure out which module is the cause.
It would be great if you could simplify the pipeline and reproduce the issue on the latest release, so that we can work with the developers to find the root cause.

Hi ji_joe and terrysu50z,

Are there any results that can be shared? Any update?

I don’t have a camera driver for the latest release, so I am dead in the water.

Please suggest debugging that I can do and send you results.

My app can’t set the state to NULL, it just hangs. I also implemented get_state, and that hangs as well.

I can put in debugging and do what ever the developers think would provide them with information.

They can supply new libs and I can try them.

Thanks
Terry

It is my complicated system that is causing the problem, simple systems work, but I need all of the functionality in my product.

Please send/provide debugging information.
Terry

I just want to confirm the exact command you are using to hit this problem. Is it what @ji_joe is talking about in the first comment, or do you have something else? What is the error in dmesg?

@ji_joe ,

If the 3rd queue is the problem, does the error still occur when you run only the 3rd queue’s branch of the pipeline?
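
For reference, running only the 3rd branch might look like the sketch below. It simply removes the tee and the q3/q4 branches from the script in the first post and keeps every element and property as posted. The wrapper name `run_branch3` is made up, and this has not been run on a TX2, so whether it negotiates and reproduces the hang is an open question.

```shell
# run_branch3: hypothetical wrapper that feeds the camera straight into the
# third branch (q5), leaving out the tee and the q3/q4 branches.
# Element names and properties are copied unchanged from the original script.
run_branch3() {
    gst-launch-1.0 -v nvarguscamerasrc \
        ! 'video/x-raw(memory:NVMM),width=2592,height=1944' \
        ! queue max-size-buffers=1 leaky=2 name=q1 \
        ! queue name=q5 \
        ! nvvidconv name=b \
        ! 'video/x-raw(memory:NVMM),width=(int)1920,height=(int)1080' \
        ! nvivafilter cuda-process=true customer-lib-name="libnvsample_cudaprocess.so" \
        ! 'video/x-raw(memory:NVMM),width=(int)1920,height=(int)1080, format=NV12' \
        ! nvv4l2h264enc maxperf-enable=1 bitrate=8000000 \
        ! 'video/x-h264, stream-format=(string)byte-stream' \
        ! h264parse \
        ! valve drop=true \
        ! splitmuxsink name=testVideoSink async-handling=true location=/tmp/tmpFile
}
```

Calling `run_branch3`, stopping it with ctrl-c and repeating a few times would show whether this branch alone is enough to trigger the clean-up hang.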

I do not know. I never tried the 3rd queue alone.

Then can you try? Is this @terrysu50z’s pipeline or it is found by you?

It’s @terrysu50z’s pipeline. I actually do not know whether the 3rd queue will work on its own at all.

Having said that, I listed everything here for you to recreate the issue. You can easily run the command yourself to see what’s happening. I personally think that it makes a lot more sense for your team, @WayneWWW, to try it on your own and see what’s really going on. Then your team can accurately report the symptoms to your development team.

Oh. I see. Thanks.
We can try it, but we need a few days for the setup. Actually, it does make sense for you to tell us what the pipeline looks like. I mean, this should be @terrysu50z’s responsibility.

Terry, is it possible to narrow your pipeline down to a simplified version and see if the issue still occurs? That is the debugging step we can ask you to try for now.