[REOPEN] Nvstreamdemux does not copy obj_meta parent structure to src pad

Good luck getting anything fixed here. The pattern is:

  • user raises issue
  • NVIDIA asks for something random
  • issue is closed due to inactivity.

FYI, the workaround is to probe the pad before the metadata is corrupted, store the parent information in some sort of hashmap, and retrieve it from there afterwards.
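
For anyone wanting to try that, here is a rough sketch of the idea in plain Python. Everything in it is an assumption for illustration: the dicts stand in for the real object metadata, the function names `remember_parents` and `restore_parents` are made up, and it assumes every object carries a stable id (which in practice probably needs a tracker in the pipeline). It is not DeepStream API code.

```python
# Hypothetical sketch: plain dicts stand in for the real frame/object metadata.
# Cache of parent links, keyed by (source_id, frame_num, child object id).
parent_cache = {}


def remember_parents(source_id, frame_num, objects):
    """Run in a probe upstream of the demuxer, while parent links are intact."""
    for obj in objects:
        if obj["parent_id"] is not None:
            parent_cache[(source_id, frame_num, obj["id"])] = obj["parent_id"]


def restore_parents(source_id, frame_num, objects):
    """Run in a probe downstream of the demuxer to put the lost links back."""
    for obj in objects:
        key = (source_id, frame_num, obj["id"])
        if key in parent_cache:
            # pop() evicts the entry so the cache does not grow without bound
            obj["parent_id"] = parent_cache.pop(key)


# Upstream: object 2 (e.g. a plate) has object 1 (e.g. a car) as its parent.
upstream = [{"id": 1, "parent_id": None}, {"id": 2, "parent_id": 1}]
remember_parents(0, 42, upstream)

# Downstream: the same objects arrive with the parent link wiped.
downstream = [{"id": 1, "parent_id": None}, {"id": 2, "parent_id": None}]
restore_parents(0, 42, downstream)
print(downstream[1]["parent_id"])  # 1
```

In a real pipeline the two functions would be called from pad probes on either side of the demuxer, and the key would need whatever identifies a frame uniquely in your setup.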

😂

Might script something to repost every day to keep the issue alive and mess with their open issue KPIs so someone in management actually notices…

I’ve posted 2 different options for the proper fix in my second post though

Yes. I think with the source code it would be fixed in minutes. The problem is this forum is a black hole where issues go to die.

@NVIDIA Developer Advocates: PLEASE manage issues like the rest of the world in GitHub - or even better, open source everything (via GitHub) and we would have fixed this for you years ago.

@eh-steve Thanks for sharing! I can reproduce this “sgie detection object’s parent is null” issue using the code from Feb 2, and we are investigating. If it is confirmed to be a bug, we will fix it in a later DeepStream version.

Great,

Did you manage to test the 3 different combinations (no streamdemux, 2-pass hashmap lookup for nvds_copy_obj_meta_list, and append-only nvds_add_meta_to_parent)?

As a follow up for anyone looking for a fix for Tegra:

Unfortunately the arm64 assembly doesn’t include a PLT relocation to nvds_copy_obj_meta_list inside nvds_copy_frame_meta - instead it hard codes a relative jump within the .so itself, preventing the ELF symbol interposition trick:

0000000000002d90 <nvds_copy_frame_meta@@Base>:

...
    2e28:	94000044 	bl	2f38 <nvds_release_meta_lock@@Base>
    2e2c:	f9403680 	ldr	x0, [x20, #104]
    2e30:	aa1303e1 	mov	x1, x19
    2e34:	97ffffaf 	bl	2cf0 <nvds_copy_obj_meta_list@@Base>  // This is a hardcoded jump within the same .so file
    2e38:	f9403a80 	ldr	x0, [x20, #112]
    2e3c:	aa1303e1 	mov	x1, x19
    2e40:	97fffef0 	bl	2a00 <nvds_copy_display_meta_list@@Base>
    2e44:	aa1303e1 	mov	x1, x19
    2e48:	a9415bf5 	ldp	x21, x22, [sp, #16]
    2e4c:	f94013fe 	ldr	x30, [sp, #32]
    2e50:	f9403e80 	ldr	x0, [x20, #120]
    2e54:	a8c353f3 	ldp	x19, x20, [sp], #48
    2e58:	17ffff42 	b	2b60 <nvds_copy_frame_user_meta_list@@Base>

The alternative is to patch the prepending implementation inside nvds_add_meta_to_parent, which is an unexported function in the arm64 version of libnvds_meta.so, replacing the conditional branch with a NOP so that it always appends:

    39c0:	a9bd53f3 	stp	x19, x20, [sp, #-48]!
    39c4:	aa0103f3 	mov	x19, x1
    39c8:	aa0003f4 	mov	x20, x0
    39cc:	a9015bf5 	stp	x21, x22, [sp, #16]
    39d0:	2a0203f5 	mov	w21, w2
    39d4:	f9400036 	ldr	x22, [x1]
    39d8:	f90013fe 	str	x30, [sp, #32]
    39dc:	aa1603e0 	mov	x0, x22
    39e0:	97fffd54 	bl	2f30 <nvds_acquire_meta_lock@@Base>
    39e4:	aa1303e1 	mov	x1, x19
    39e8:	aa1403e0 	mov	x0, x20
    39ec:	34000155 	cbz	w21, 3a14 <nvds_get_user_meta_type@@Base+0x4f4>  // Replace this conditional branch with a NOP to always append, since we have a RET later
    39f0:	97fffad8 	bl	2550 <g_list_append@plt>
    39f4:	aa0003f3 	mov	x19, x0
    39f8:	aa1603e0 	mov	x0, x22
    39fc:	97fffd4f 	bl	2f38 <nvds_release_meta_lock@@Base>
    3a00:	aa1303e0 	mov	x0, x19
    3a04:	a9415bf5 	ldp	x21, x22, [sp, #16]
    3a08:	f94013fe 	ldr	x30, [sp, #32]
    3a0c:	a8c353f3 	ldp	x19, x20, [sp], #48
    3a10:	d65f03c0 	ret
    3a14:	97fffae7 	bl	25b0 <g_list_prepend@plt>
    3a18:	aa0003f3 	mov	x19, x0
    3a1c:	aa1603e0 	mov	x0, x22
    3a20:	97fffd46 	bl	2f38 <nvds_release_meta_lock@@Base>
    3a24:	aa1303e0 	mov	x0, x19
    3a28:	a9415bf5 	ldp	x21, x22, [sp, #16]
    3a2c:	f94013fe 	ldr	x30, [sp, #32]
    3a30:	a8c353f3 	ldp	x19, x20, [sp], #48
    3a34:	d65f03c0 	ret

In the DS 6.3 version of this binary, this means finding the offset of the cbz w21, 3a14 bytes (55 01 00 34 @ 0x39ec) and replacing them with the NOP instruction bytes 1f 20 03 d5:

echo "0039ec: 1f 20 03 d5"  | xxd -r - /opt/nvidia/deepstream/deepstream-6.3/lib/libnvds_meta.so 

(Obviously, for other versions of DeepStream, the conditional branch’s jump offset may differ, so the bytes might not be 55 01 00 34, and the file offset is unlikely to still be 0x39ec.)

For x86, the equivalent solution is to find the bytes for TEST EBX,EBX and JE 0x27 (85 db 74 25 @ 0x5267) and replace them with a 4-byte NOP:

echo "005267: 0f 1f 40 00"  | xxd -r - /opt/nvidia/deepstream/deepstream-6.3/lib/libnvds_meta.so 
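
Since xxd -r writes blindly at the given offset, a slightly safer variant is to check that the expected bytes are really there before overwriting them. Below is a minimal sketch of that in Python. The helper name `patch_bytes` is my own, the offsets and byte values are only the DS 6.3 ones quoted above, and the demo runs against a throwaway temp file rather than the real library.

```python
import os
import tempfile
from pathlib import Path


def patch_bytes(path, offset, expected, replacement):
    """Overwrite `expected` with `replacement` at `offset`, but only if the
    file really contains `expected` there. Returns False if already patched."""
    if len(expected) != len(replacement):
        raise ValueError("replacement must be the same length as the original bytes")
    data = bytearray(Path(path).read_bytes())
    found = bytes(data[offset:offset + len(expected)])
    if found == replacement:
        return False  # nothing to do
    if found != expected:
        raise ValueError(f"unexpected bytes at {offset:#x}: {found.hex(' ')}")
    data[offset:offset + len(expected)] = replacement
    Path(path).write_bytes(data)
    return True


# Values quoted in this post for DS 6.3 only; other versions will differ.
ARM64_OFFSET, ARM64_CBZ, ARM64_NOP = 0x39EC, bytes.fromhex("55010034"), bytes.fromhex("1f2003d5")
X86_OFFSET, X86_TEST_JE, X86_NOP = 0x5267, bytes.fromhex("85db7425"), bytes.fromhex("0f1f4000")

# Demo on a fake file: 8 padding bytes, then the cbz instruction bytes.
fd, demo_path = tempfile.mkstemp()
os.write(fd, b"\x00" * 8 + ARM64_CBZ + b"\x00" * 4)
os.close(fd)
print(patch_bytes(demo_path, 8, ARM64_CBZ, ARM64_NOP))  # True
print(patch_bytes(demo_path, 8, ARM64_CBZ, ARM64_NOP))  # False (already patched)
os.remove(demo_path)
```

Against the real library you would call something like `patch_bytes(lib_path, ARM64_OFFSET, ARM64_CBZ, ARM64_NOP)` on a backup copy first. If it raises, the binary is not the version these offsets were taken from.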

Deepstream 6.3 libnvds_meta_patched.zip (21.7 KB)

Have been monitoring this thread as I have also been running into this exact problem. Looks like it’s being left to die again. This is a huge problem that needs to be fixed. Can we please have this huge oversight looked into, @fanzh?

@jayden.elliott Hopefully the binary patches outlined in my post below can unblock you while we all get ignored.

@vcmike I’m not even sure it would be any better if open sourced on GitHub… I’ve had an “accepted” PR open for the better part of a year…

But then we could maintain a fork with all these bugfixes and incorporate changes from people who actually try to use it.

Some ideas:

  • Fix this issue.
  • Some GstMessages to know the state of the NvInfer engine build.
  • Change the NvDsMeta structure and NvStreamMux to accept a %s rather than %u source so that we can use a UUID string.
  • And more.

To keep this issue alive for a bit longer:

@fanzh (and also @Fiona.Chen and @yuweiw and @kesong and @yingliu and @junshengy)

I don’t think fixing this bug only in DeepStream 6.5 is an acceptable resolution, since NVIDIA is still selling TX2s, TX2 NXs and Xaviers, which can’t run anything beyond 6.0 or 6.3. You’d need to release a minor version for all previous versions since 6.0 to address this bug, or perhaps notify the >25 other people who opened tickets about the availability of patched binaries.

Sorry for the late reply! We are still investigating and will post here if there is any update.
For now you can work around it. Here are some methods:

  1. If there is a tracker in the pipeline, you might update the child meta’s object_id first, then get the corresponding parent object by object_id after the demux.
  2. You might find the corresponding parent meta by the object coordinates.

Those are both terrible workarounds for quite a straightforward bug - I don’t think I could make it any clearer what the root cause is:

  • newly added g_list_prepend in nvds_add_meta_to_parent - this is called every time frame meta is copied
  • this reverses the order of object metas, so parents are no longer guaranteed to be before children in the meta list
  • The single pass hashmap build+lookup in nvds_copy_obj_meta_list assumes that parents are always before children, and if they’re not, the hashmap lookup will always fail and set all parents to NULL.
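
To make the failure mode concrete, here is a toy reproduction in Python. It is only a model of the behaviour described in the bullets, not the library's code: `ObjMeta` and `copy_single_pass` are hypothetical stand-ins for the object meta struct and for the single-pass copy in nvds_copy_obj_meta_list.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ObjMeta:
    """Hypothetical stand-in for an object meta: a name and a parent pointer."""
    name: str
    parent: Optional["ObjMeta"] = None


def copy_single_pass(src_list):
    """Copy metas in list order, resolving each parent from the metas
    copied so far (one loop that both builds and queries the map)."""
    copied = {}  # id(source meta) -> copied meta
    out = []
    for src in src_list:
        dst = ObjMeta(src.name)
        if src.parent is not None:
            # Fails (returns None) if the parent has not been copied yet
            dst.parent = copied.get(id(src.parent))
        copied[id(src)] = dst
        out.append(dst)
    return out


car = ObjMeta("car")
plate = ObjMeta("plate", parent=car)

in_order = copy_single_pass([car, plate])    # parent before child
reversed_order = copy_single_pass([plate, car])  # order after a prepend-style copy

print(in_order[1].parent.name)   # car
print(reversed_order[0].parent)  # None
```

With the parent first, the lookup succeeds. Once the list order is reversed, the child is visited before its parent exists in the map, so the copied child ends up with a null parent.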

The solution is simple, either:

  • Never g_list_prepend - only append, or
  • Always do a 2-pass hashmap build+lookup, 1 loop to build it, the second to lookup parents
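
The second option can be sketched the same way. Again this is a toy model in Python with hypothetical names (`ObjMeta`, `copy_two_pass`), assuming the copy works roughly as described above. It is not the actual implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ObjMeta:
    """Hypothetical stand-in for an object meta: a name and a parent pointer."""
    name: str
    parent: Optional["ObjMeta"] = None


def copy_two_pass(src_list):
    """Pass 1 copies every meta and builds the map; pass 2 resolves parents.
    List order no longer matters because the map is complete before any lookup."""
    copied = {}  # id(source meta) -> copied meta
    out = []
    for src in src_list:
        dst = ObjMeta(src.name)
        copied[id(src)] = dst
        out.append(dst)
    for src, dst in zip(src_list, out):
        if src.parent is not None:
            dst.parent = copied.get(id(src.parent))
    return out


car = ObjMeta("car")
plate = ObjMeta("plate", parent=car)

# Child listed before its parent, as happens after a prepend-style copy.
result = copy_two_pass([plate, car])
print(result[0].parent.name)  # car
```

The cost is one extra walk over the object list per copy, which should be negligible next to everything else the pipeline does per frame.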

There isn’t anything left to investigate!

If it is fixed internally I will post here; then please wait for the new version. Alternatively, please contact sales or after-sales for a patch for the old version.