We are seeing an error reported by nvvidconv.
When our app runs our pipeline for the first time, nvvidconv sometimes reports that the NvBufferGetParams call failed, even though NvBufferCreateEx succeeded.
After replaying the same pipeline (in the same process), the issue is gone.
So we want to investigate why the NvBufferGetParams call failed.
Our log snippet is as follows.
Could you please help here?
Thanks a lot!
2022-07-09 10:12:44.575787 0x7e78a6e770 nvvidconv ERROR: gst_nv_filter_memory_allocator_alloc: NvBufferGetParams Failed
2022-07-09 10:12:44.575835 0x7e78a6e770 GLIB ERROR: gst_nv_filter_buffer_pool_alloc_buffer: assertion 'mem' failed
2022-07-09 10:12:44.575865 0x7e78a6e770 bufferpool WARN : alloc function failed
2022-07-09 10:12:44.575891 0x7e78a6e770 bufferpool WARN : failed to allocate buffer
2022-07-09 10:12:44.575914 0x7e78a6e770 bufferpool ERROR: start failed
We have not seen issues with NvBufferGetParams(). Please make sure the fd passed to NvBufferGetParams() is valid, and that the other parameters are valid too. If you have confirmed that all parameters are good, we would need your help to share a patch to nvvidconv and the steps, so that we can follow them to replicate the issue and investigate.
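As a rough sketch of the fd check suggested above: the helper below only tells you whether a descriptor is still open in the current process. It does not prove that the descriptor refers to an NvBuffer. The name fd_is_open is hypothetical and not part of nvbuf_utils.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>

/* Hypothetical helper, not part of nvbuf_utils.
 * Returns true when `fd` is an open descriptor in this process.
 * fcntl(F_GETFD) fails with EBADF for a closed or never-opened fd. */
static bool fd_is_open(int fd)
{
    if (fd < 0)
        return false;
    errno = 0;
    return fcntl(fd, F_GETFD) != -1 || errno != EBADF;
}
```

Calling it on the dmabuf fd right before NvBufferGetParams() and logging the result would at least show whether the fd had already been closed by another thread when the failure happened.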
Please share a gst-launch-1.0 command or a sample that we can run to replicate the issue. You can make a simple sample based on this topic:
appsrc link to nvvidconv error with reason not-negotiated(-4) - #6 by DaneLLL
Sorry, I can't provide a sample easily.
In our app, we have many pipelines running in parallel.
We may add new pipelines or stop some of them from time to time.
The nvvidconv plugin is heavily used.
There are also many buffer copy operations.
I suspect we are hitting an NV memory bottleneck.
It can happen in different cases.
Besides the NvBufferGetParams failure, there can be other symptoms as well, such as NvBufferDestroy failing.
We just caught a similar issue under GDB.
The backtrace is as follows:
#0 0x0000007fb7b884f8 in __GI_raise (sig=sig@entry=6) at …/sysdeps/unix/sysv/linux/raise.c:51
#1 0x0000007fb7b898d4 in __GI_abort () at abort.c:79
#2 0x0000007fb7bc268c in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x7fb7c836f8 "%s\n") at …/sysdeps/posix/libc_fatal.c:181
#3 0x0000007fb7bc8a04 in malloc_printerr (str=str@entry=0x7fb7c7f2c8 "free(): invalid pointer") at malloc.c:5342
#4 0x0000007fb7bca2e8 in _int_free (av=0x7fb7ca9a70 <main_arena>, p=0x7e88014ed0, have_lock=0) at malloc.c:4167
#5 0x0000007f900069b4 in NvBufferDestroy () at /usr/lib/aarch64-linux-gnu/tegra/libnvbuf_utils.so.1.0.0
#6 0x0000007f4c1bb8b0 in () at /usr/lib/aarch64-linux-gnu/gstreamer-1.0/libgstnvv4l2camerasrc.so
#7 0x0000007fb7e9364c in _gst_memory_free (mem=0x7e34002e20) at gstmemory.c:97
#8 0x0000007fb7e59ad4 in gst_memory_unref (memory=) at …/gst/gstmemory.h:345
#9 0x0000007fb7e59ad4 in _gst_buffer_free (buffer=0x7e8003c690) at gstbuffer.c:749
#10 0x0000007fb7e5fbac in default_stop (pool=0x7eb40061d0) at gstbufferpool.c:414
#11 0x0000007fb7e5f4d0 in do_stop (pool=pool@entry=0x7eb40061d0) at gstbufferpool.c:432
#12 0x0000007fb7e604dc in gst_buffer_pool_set_active (pool=0x7eb40061d0, active=0) at gstbufferpool.c:540
#13 0x0000007f4c1bb200 in () at /usr/lib/aarch64-linux-gnu/gstreamer-1.0/libgstnvv4l2camerasrc.so
#14 0x0000007fa7d05ab0 in gst_base_src_stop (basesrc=basesrc@entry=0x7e2c00f870) at gstbasesrc.c:3651
#15 0x0000007fa7d0ced0 in gst_base_src_activate_push (pad=0x7e8c035b20, active=0, parent=0x7e2c00f870) at gstbasesrc.c:3794
#16 0x0000007fa7d0ced0 in gst_base_src_activate_mode (pad=0x7e8c035b20, parent=0x7e2c00f870, mode=GST_PAD_MODE_PUSH, active=0) at gstbasesrc.c:3866
And we saw the following errors in the terminal:
NvRmChannelSubmit: NvError_IoctlFailed with error code 22
NvRmPrivFlush: NvRmChannelSubmit failed (err = 196623, SyncPointIdx = 52, SyncPointValue = 0)
fence_set_name ioctl failed with 22
gst_nvvconv_transform: NvBufferTransform Failed
free(): invalid pointer
And these traces in syslog:
Jul 15 01:06:20 linux kernel: [57287.485610] iommu_context_dev 13e10000.host1x:ctx0: pin_array_ids: could not get buf err=-22
Jul 15 01:06:20 linux kernel: [57287.485803] falcon 15340000.vic: nvhost_ioctl_channel_submit: failed with err -22
Jul 15 01:06:20 linux kernel: [57287.485991] falcon 15340000.vic: submit_add_gathers: failed to copy user inputs: class_ids=00000000c8a795a8 num_cmdbufs=2
Jul 15 01:06:20 linux kernel: [57287.486174] falcon 15340000.vic: nvhost_ioctl_channel_submit: failed with err -22
Jul 15 01:06:20 linux kernel: [57287.486756] (NULL device *): nvhost_sync_fence_set_name: failed to get fence
Jul 15 01:06:20 linux kernel: [57287.533065] iommu_context_dev 13e10000.host1x:ctx0: pin_array_ids: could not get buf err=-22
Jul 15 01:06:20 linux kernel: [57287.533262] falcon 15340000.vic: nvhost_ioctl_channel_submit: failed with err -22
Jul 15 01:06:20 linux kernel: [57287.533439] falcon 15340000.vic: submit_add_gathers: failed to copy user inputs: class_ids=000000008c083ed8 num_cmdbufs=2
Jul 15 01:06:20 linux kernel: [57287.533652] falcon 15340000.vic: nvhost_ioctl_channel_submit: failed with err -22
Jul 15 01:06:20 linux kernel: [57287.534549] (NULL device *): nvhost_sync_fence_set_name: failed to get fence
Jul 15 01:06:20 linux kernel: [57287.638336] iommu_context_dev 13e10000.host1x:ctx0: pin_array_ids: could not get buf err=-22
Jul 15 01:06:20 linux kernel: [57287.638532] falcon 15340000.vic: nvhost_ioctl_channel_submit: failed with err -22
Jul 15 01:06:20 linux kernel: [57287.638742] falcon 15340000.vic: submit_add_gathers: failed to copy user inputs: class_ids=000000008c2b0028 num_cmdbufs=2
Jul 15 01:06:20 linux kernel: [57287.638922] falcon 15340000.vic: nvhost_ioctl_channel_submit: failed with err -22
Jul 15 01:06:20 linux kernel: [57287.639103] (NULL device *): nvhost_sync_fence_set_name: failed to get fence
Any ideas on how to debug this further?
And every time we hit the
The issue should not be in NvBufferGetParams(). It looks more like a memory leak somewhere that triggers the failure. Please run top and check whether the memory used by the process increases abnormally.
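Alongside top, the process can log its own numbers periodically. The sketch below assumes a Linux /proc filesystem and reads the resident set size and the count of open file descriptors. Watching the fd count is an assumption on my side: since each NvBuffer is referred to by an fd, leaked buffers would plausibly show up as a steadily growing count. Both helper names are hypothetical.

```c
#include <dirent.h>
#include <stdio.h>

/* Number of descriptors currently open in this process, or -1 on error.
 * Counts the entries of /proc/self/fd. */
static int count_open_fds(void)
{
    DIR *dir = opendir("/proc/self/fd");
    if (!dir)
        return -1;
    int count = 0;
    struct dirent *ent;
    while ((ent = readdir(dir)) != NULL) {
        if (ent->d_name[0] != '.')   /* skip "." and ".." */
            count++;
    }
    closedir(dir);
    return count - 1;                /* exclude the fd held by opendir() */
}

/* Resident set size in kB, taken from the VmRSS line of
 * /proc/self/status, or -1 if it cannot be read. */
static long read_vm_rss_kb(void)
{
    FILE *f = fopen("/proc/self/status", "r");
    if (!f)
        return -1;
    char line[256];
    long kb = -1;
    while (fgets(line, sizeof line, f)) {
        if (sscanf(line, "VmRSS: %ld kB", &kb) == 1)
            break;
    }
    fclose(f);
    return kb;
}
```

Logging both values each time a pipeline is added or stopped should show whether either one keeps climbing after pipelines are torn down.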
I agree that it may be connected to a memory leak.
But how can that cause NvBufferGetParams to return an error code even when a valid fd is passed?
And why do we see VIC error traces in the kernel log?
What does the error code -22 mean?
When a memory leak happens, we generally see these prints. Here’s a similar topic:
Syncpts and threads leak using gstreamer plugin nvv4l2decoder
It looks like in this situation a certain memory buffer is not allocated successfully, and some VIC operations fail as a result.
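On the question about the error code: kernel messages such as "failed with err -22" report a negated Linux errno value, and 22 is EINVAL (invalid argument). A minimal sketch for decoding such values, with a hypothetical helper name:

```c
#include <errno.h>
#include <string.h>

/* Hypothetical helper: turn a kernel-style return value (for example -22)
 * or a plain errno value (22) into its human-readable message. */
static const char *kernel_err_to_string(int err)
{
    return strerror(err < 0 ? -err : err);
}
```

So the VIC submit and pin_array_ids failures above mean the kernel rejected an argument, which fits a buffer handle that is no longer valid at submit time.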
I suspect this could be an NVIDIA bug, because we always see the NvBufferGetParams failure.
But we never see an NvBufferCreateEx failure.
What happens if we request a huge amount of NV memory in a very short time while there is not enough NV memory available?
Could the allocator hand out a dirty NV memory buffer that is actually still in use?
If so, that could explain what we saw.
I am trying to reduce NV memory needs.
There are many gst_buffer_make_writable and gst_buffer_copy calls in our code.
We expect a shallow copy in these cases, with the internal GstMemory being shared.
But I found that GST_MEMORY_FLAG_NO_SHARE is set in the source code of nvv4l2camerasrc and nvvidconv.
That causes a deep copy.
But if I simply remove this flag, we hit a "dma_buf entry not found" issue.
So what is the correct way to avoid the deep copy?
Thanks a lot.
If you suspect something is wrong in NvBufferGetParams(), we would need your help to share a test sample so that we can replicate the issue. It is not easy for our teams to suggest next steps from the description alone. It would be great if we could set up a developer kit and reproduce it.
I am sorry that I cannot share a sample easily now.
But I still have some questions.
I saw your comment on NvBufferSession in the post Fast copy of DMA buffers via NvBufferTransform - #3 by DaneLLL.
In our app, nvvidconv is heavily used in different threads.
So do we still need to create an NvBufferSession for each call to NvBufferTransform in nvvidconv?
There is code for NvBufferSession in the nvvidconv plugin. Please take a look at the source code. If NvBufferTransform() is called in multiple threads simultaneously, we suggest creating a session to get better performance.
Yes. I see it now.
But I notice that in the source code of libgstnvvideo4linux2.so, there is an NvBufferTransform call in gstv4l2bufferpool.c without an NvBufferSession being created.
Should we create an NvBufferSession here to improve performance?
The same applies to nvcompositor.
If you use the plugins in multiple threads, please add NvBufferSession and give it a try.