Nvidia-desktop kernel: [407343.357549] (NULL device *): nvhost_channelctl: invalid cmd 0x80685600

testCap.zip (1.5 KB)

Hi @DaneLLL ,

I have attached my sample, and I observed the memory increase shown below. I tested on an AGX, and the release is “# R32 (release), REVISION: 6.1, GCID: 27863751, BOARD: t186ref, EABI: aarch64, DATE: Mon Jul 26 19:36:31 UTC 2021”.

I sincerely hope you can fix the memory growth that occurs when the GStreamer pipeline is released by cv::VideoCapture::release().

.5C PMIC@50C AUX@32.5C CPU@36C thermal@33.25C
RAM 5420/7765MB (lfb 117x4MB) SWAP 2851/3883MB (cached 6MB) CPU [48%@1190,51%@1190,48%@1190,44%@1190,52%@1190,43%@1190] EMC_FREQ 0% GR3D_FREQ 0% AO@32.5C GPU@32.5C PMIC@50C AUX@33C CPU@35.5C thermal@33.9C
RAM 5431/7765MB (lfb 117x4MB) SWAP 2851/3883MB (cached 6MB) CPU [54%@1420,43%@1420,37%@1420,47%@1420,43%@1420,41%@1420] EMC_FREQ 0% GR3D_FREQ 0% AO@32.5C GPU@33C PMIC@50C AUX@33C CPU@36.5C thermal@33.9C


RAM 5609/7765MB (lfb 117x4MB) SWAP 2851/3883MB (cached 6MB) CPU [32%@1420,40%@1420,35%@1420,37%@1420,35%@1420,36%@1420] EMC_FREQ 0% GR3D_FREQ 22% AO@33C GPU@33.5C PMIC@50C AUX@33C CPU@36.5C thermal@34.55C
RAM 5610/7765MB (lfb 117x4MB) SWAP 2851/3883MB (cached 6MB) CPU [31%@1420,41%@1420,30%@1420,27%@1420,38%@1420,39%@1420] EMC_FREQ 0% GR3D_FREQ 44% AO@33.5C GPU@33.5C PMIC@50C AUX@33.5C CPU@36.5C thermal@34.55C
RAM 5612/7765MB (lfb 117x4MB) SWAP 2851/3883MB (cached 6MB) CPU [44%@1420,53%@1420,39%@1420,43%@1420,41%@1420,41%@1420] EMC_FREQ 0% GR3D_FREQ 45% AO@33.5C GPU@33.5C PMIC@50C AUX@33.5C CPU@36C thermal@34.55C
RAM 5613/7765MB (lfb 117x4MB) SWAP 2851/3883MB (cached 6MB) CPU [46%@1190,46%@1190,42%@1190,42%@1190,44%@1190,33%@1190] EMC_FREQ 0% GR3D_FREQ 0% AO@33C GPU@33.5C PMIC@50C AUX@33C CPU@36C thermal@34.25C
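To quantify the growth between reopen cycles, the used-RAM field of each tegrastats line can be parsed and logged over time. A minimal sketch (the helper name `ramUsedMB` is our own, not part of tegrastats):

```cpp
#include <cassert>
#include <string>

// Extract the used-RAM figure (in MB) from a tegrastats line such as
// "RAM 5420/7765MB (lfb 117x4MB) ...". Returns -1 if the field is missing.
long ramUsedMB(const std::string &line)
{
    const std::string key = "RAM ";
    size_t pos = line.find(key);
    if (pos == std::string::npos)
        return -1;
    pos += key.size();
    size_t slash = line.find('/', pos);
    if (slash == std::string::npos)
        return -1;
    // The digits between "RAM " and '/' are the used megabytes.
    return std::stol(line.substr(pos, slash - pos));
}
```

Logging this value once per tegrastats sample makes it easy to see whether the increase is steady (a leak) or plateaus (caching).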

Hi, do you have any questions about compiling the sample?

Hi,
Thanks for the sample. We would need some time to set up and test. Will update when there is further progress.

Hi @DaneLLL , we now have a project that needs this function to reduce the CPU usage of pulling RTSP network camera streams. It is urgent for us; could you fix the problem sooner?
We would be grateful!

Hi,
We have tried but do not observe significant memory increase. How long does it take to observe the issue in your test? Also, it looks like OpenCV does not send EOS in cvCap.release(). There is some discussion in
writing file from Gstreamer pipleine in a VideoCapture on the Tx2 - OpenCV Q&A Forum

We would suggest running it like this Python sample:
Nvv4l2decoder sometimes fails to negotiate with downstream after several pipeline re-launches - #16 by DaneLLL
Send EOS and wait for it before switching to the NULL state. This seems impossible when running a GStreamer pipeline inside cv2.VideoCapture().

Hi,
There are two known issues in r32.6.1. Please apply the two steps:

  1. Apply the patch and rebuild/replace libgstnvvideo4linux2.so:
    Syncpts and threads leak using gstreamer plugin nvv4l2decoder - #11 by DaneLLL
  2. Rename libv4l2_nvargus.so as suggested in
    Memory Leak (Alloc/free mismatch) in Tegra multimedia API (encoder) - #6 by DaneLLL

Hi @DaneLLL ,
I tried again, and memory still increases by about 80 MB every time I reopen 16 camera streams. Could you check whether all the GStreamer plugins are closed when the pipeline is reopened? In fact, there are memory leaks when reopening the pipeline.

How long did you run the sample? I only ran it for about 20 minutes.

Hi,
We have tried several cases and checked RAM usage with sudo tegrastats. With 4 decoding threads we do not observe memory increase:

int main()
{
    std::vector<pthread_t> pthreadIdArr;
-    for (int k = 0; k < 16; k++) {
+    for (int k = 0; k < 4; k++) {

With 8 decoding threads we can see the increase. It also happens when using a software decoder, for example:

 std::string pipeName = "rtspsrc location=rtsp://10.19.107.227:8554/test ! rtph264depay ! h264parse ! avdec_h264 ! videoconvert ! appsink ";

From the tests, it looks like memory increases once the number of concurrent decoding threads exceeds a certain value. It is not specific to nvv4l2decoder. We are not sure what the threshold is, but in our tests 4 threads are fine and can pass an overnight test.
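Given that 4 threads per process passed the overnight test, one possible workaround (our own suggestion, not an official recommendation) is to shard the 16 streams across several processes, each decoding at most 4. The grouping logic might look like:

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Split a list of stream URIs into batches of at most maxPerProcess,
// so each batch can be handed to its own decoding process.
// maxPerProcess = 4 matches the thread count that passed the overnight test.
std::vector<std::vector<std::string>>
shardStreams(const std::vector<std::string> &uris, size_t maxPerProcess)
{
    std::vector<std::vector<std::string>> batches;
    for (size_t i = 0; i < uris.size(); i += maxPerProcess)
    {
        size_t end = std::min(i + maxPerProcess, uris.size());
        batches.emplace_back(uris.begin() + i, uris.begin() + end);
    }
    return batches;
}
```

Each batch would then be passed to a separate process (e.g. via fork/exec or a process launcher), keeping every process under the threshold where the growth was observed.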

Hi @DaneLLL ,

I tested the RTSP stream with gst-launch and got the following debug output. It printed "/GstPipeline:pipeline0/GstXImageSink:ximagesink0: A lot of buffers are being dropped". Could you check whether some GStreamer plugins or buffers are left unreleased?

0:00:00.072562774 26452 0x558a855410 INFO GST_ELEMENT_FACTORY gstelementfactory.c:361:gst_element_factory_create: creating element “rtspsrc”
0:00:00.077268423 26452 0x558a855410 INFO GST_ELEMENT_FACTORY gstelementfactory.c:361:gst_element_factory_create: creating element “rtspreal”
0:00:00.078568771 26452 0x558a855410 INFO GST_ELEMENT_FACTORY gstelementfactory.c:361:gst_element_factory_create: creating element “rtspwms”
0:00:00.080856827 26452 0x558a855410 INFO GST_ELEMENT_FACTORY gstelementfactory.c:361:gst_element_factory_create: creating element “rtph264depay”
0:00:00.082398678 26452 0x558a855410 INFO GST_ELEMENT_FACTORY gstelementfactory.c:361:gst_element_factory_create: creating element “h264parse”
0:00:00.145178700 26452 0x558a855410 INFO GST_ELEMENT_FACTORY gstelementfactory.c:361:gst_element_factory_create: creating element “nvv4l2decoder”
0:00:00.148040930 26452 0x558a855410 INFO GST_ELEMENT_FACTORY gstelementfactory.c:361:gst_element_factory_create: creating element “nvvidconv”
0:00:00.150499482 26452 0x558a855410 INFO GST_ELEMENT_FACTORY gstelementfactory.c:361:gst_element_factory_create: creating element “videoconvert”
0:00:00.153325585 26452 0x558a855410 INFO GST_ELEMENT_FACTORY gstelementfactory.c:361:gst_element_factory_create: creating element “ximagesink”
0:00:00.154023151 26452 0x558a855410 INFO GST_ELEMENT_FACTORY gstelementfactory.c:361:gst_element_factory_create: creating element “pipeline”
0:00:00.157126821 26452 0x558a855410 INFO GST_ELEMENT_FACTORY gstelementfactory.c:361:gst_element_factory_create: creating element “capsfilter”
Setting pipeline to PAUSED …
Opening in BLOCKING MODE
Pipeline is live and does not need PREROLL …
Progress: (open) Opening Stream
Progress: (connect) Connecting to rtsp://192.168.8.103/test1
Progress: (open) Retrieving server options
Progress: (open) Retrieving media info
Progress: (request) SETUP stream 0
0:00:00.390685682 26452 0x558a8f30a0 INFO GST_ELEMENT_FACTORY gstelementfactory.c:359:gst_element_factory_create: creating element “rtpbin” named “manager”
0:00:00.392078478 26452 0x558a8f30a0 INFO GST_ELEMENT_FACTORY gstelementfactory.c:361:gst_element_factory_create: creating element “rtpsession”
0:00:00.393338922 26452 0x558a8f30a0 INFO GST_ELEMENT_FACTORY gstelementfactory.c:361:gst_element_factory_create: creating element “rtpssrcdemux”
0:00:00.393740713 26452 0x558a8f30a0 INFO GST_ELEMENT_FACTORY gstelementfactory.c:361:gst_element_factory_create: creating element “rtpstorage”
0:00:00.394015720 26452 0x558a8f30a0 INFO GST_ELEMENT_FACTORY gstelementfactory.c:361:gst_element_factory_create: creating element “funnel”
0:00:00.394167591 26452 0x558a8f30a0 INFO GST_ELEMENT_FACTORY gstelementfactory.c:361:gst_element_factory_create: creating element “funnel”
Progress: (open) Opened Stream
Setting pipeline to PLAYING …
New clock: GstSystemClock
Progress: (request) Sending PLAY request
Progress: (request) Sending PLAY request
Progress: (request) Sent PLAY request
0:00:00.401354288 26452 0x558a8f30a0 INFO GST_ELEMENT_FACTORY gstelementfactory.c:361:gst_element_factory_create: creating element “rtpjitterbuffer”
0:00:00.402383405 26452 0x558a8f30a0 INFO GST_ELEMENT_FACTORY gstelementfactory.c:361:gst_element_factory_create: creating element “rtpptdemux”
NvMMLiteOpen : Block : BlockType = 261
NVMEDIA: Reading vendor.tegra.display-size : status: 6
NvMMLiteBlockCreate : Block : BlockType = 261
WARNING: from element /GstPipeline:pipeline0/GstXImageSink:ximagesink0: A lot of buffers are being dropped.
Additional debug info:
gstbasesink.c(2902): gst_base_sink_is_too_late (): /GstPipeline:pipeline0/GstXImageSink:ximagesink0:
There may be a timestamping problem, or this computer is too slow.
WARNING: from element /GstPipeline:pipeline0/GstXImageSink:ximagesink0: A lot of buffers are being dropped.
Additional debug info:
gstbasesink.c(2902): gst_base_sink_is_too_late (): /GstPipeline:pipeline0/GstXImageSink:ximagesink0:
There may be a timestamping problem, or this computer is too slow.

Hi,
Please set sync=0 on ximagesink. For optimal performance in rendering, please use nv3dsink or nveglglessink.

Hi @DaneLLL ,

With gst-launch for RTSP, I was just sharing some debug output; however, setting sync=true was necessary to reduce CPU consumption.

And the CPU usage is really far too high when pulling 16 network camera streams!

Hi,
It is expected to have a certain amount of CPU usage while using OpenCV. Please check the discussion in
[Gstreamer] nvvidconv, BGR as INPUT

Since BGR is not supported by the hardware converter, the CPU has to convert the frame data to that format. For 16 cameras, the CPU usage becomes more significant.
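One way to limit that cost is to let nvvidconv do the hardware-accelerated conversion up to RGBA, and only fall back to the CPU-side videoconvert for the final RGBA→BGR step when OpenCV's BGR layout is truly required (or skip BGR entirely and consume RGBA in the application). A sketch of a pipeline-string builder; the function name and parameters are illustrative:

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Build a capture pipeline that keeps format conversion on the hardware
// converter (nvvidconv) up to RGBA. wantBGR=true appends the CPU-side
// videoconvert step that OpenCV's BGR default requires.
std::string buildPipeline(const std::string &uri, int width, int height,
                          bool wantBGR)
{
    std::ostringstream ss;
    ss << "rtspsrc location=" << uri << " ! "
       << "rtph264depay ! h264parse ! nvv4l2decoder ! "
       << "nvvidconv ! video/x-raw,width=" << width
       << ",height=" << height << ",format=RGBA ! ";
    if (wantBGR)
        ss << "videoconvert ! video/x-raw,format=BGR ! "; // CPU conversion
    ss << "appsink";
    return ss.str();
}
```

Consuming RGBA directly in the application avoids the per-frame videoconvert pass, which is where much of the per-camera CPU cost accumulates.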

Hi @DaneLLL ,

Do you have any idea about the memory leak with 8 decoding threads?

My app printed the following debug output; can you please tell us what it did, or why this was printed?

Mar 14 16:37:23 nvidia-desktop /usr/lib/gdm3/gdm-x-session[7944]: (–) NVIDIA(GPU-0): AOC 2270W (DFP-0): connected
Mar 14 16:37:23 nvidia-desktop /usr/lib/gdm3/gdm-x-session[7944]: (–) NVIDIA(GPU-0): AOC 2270W (DFP-0): External TMDS

Hi,
It seems to be an issue in the GStreamer or OpenCV frameworks. By default it is GStreamer 1.14.5 and OpenCV 4.1.1. Since the L4T release is verified with these versions, we would suggest keeping the defaults and running fewer threads. However, if you must run more threads in a single process, you may consider upgrading to later versions and giving it a try.

For upgrading gstreamer, please refer to
https://docs.nvidia.com/jetson/archives/l4t-archived/l4t-3261/index.html#page/Tegra%20Linux%20Driver%20Package%20Development%20Guide/accelerated_gstreamer.html#wwpID0E06E0HA

Our default plugins are built against 1.14.5, so you will need to rebuild the plugins. The source code is in
https://developer.nvidia.com/embedded/l4t/r32_release_v6.1/sources/t186/public_sources.tbz2

Please remember to clean cache after the upgrade:

$ rm .cache/gstreamer-1.0/registry.aarch64.bin

For upgrading OpenCV, you may try this script from a community contribution:
GitHub - mdegans/nano_build_opencv: Build OpenCV on Nvidia Jetson Nano

Hi @DaneLLL ,

Thanks a lot; I will try the upgrade.

I have a question: do all the packages in [public_sources.tbz2] need to be recompiled?

Hi,
We would say it is better to rebuild all open-source plugins to avoid any possible mismatch.

I have compiled GStreamer 1.20.0, and this time I used gst_parse_launch to pull the RTSP network streams. But there is still a memory leak when the RTSP stream is reopened.

Hi,
We tried running a GStreamer C sample and do not observe memory increase. The sample can run overnight and exit gracefully. Here is the code:

#include <cstdlib>
#include <cstdio>
#include <unistd.h>   // sleep()
#include <pthread.h>
#include <gst/gst.h>
#include <gst/app/gstappsink.h>
#include <glib-unix.h>

#include <sstream>
#include <vector>

using namespace std;

static void appsink_eos(GstAppSink * appsink, gpointer user_data)
{
    printf("app sink receive eos\n");
}

static GstFlowReturn new_buffer(GstAppSink *appsink, gpointer user_data)
{
    GstSample *sample = NULL;
    guint *p_frameCount = (guint *)user_data;

    g_signal_emit_by_name (appsink, "pull-sample", &sample,NULL);

    if (sample)
    {
        GstBuffer *buffer = NULL;
        GstCaps   *caps   = NULL;
        GstMapInfo map    = {0};

        caps = gst_sample_get_caps (sample);
        if (!caps)
        {
            printf("could not get snapshot format\n");
            gst_sample_unref (sample);
            return GST_FLOW_OK;
        }
        buffer = gst_sample_get_buffer (sample);
        gst_buffer_map (buffer, &map, GST_MAP_READ);

        // print 60th frame to ensure the thread is alive
        (*p_frameCount)+=1;
        if ((*p_frameCount) == 60)
        {
          printf("map.size = %lu, 0x%p\n", map.size, appsink);
        }

        gst_buffer_unmap(buffer, &map);

        gst_sample_unref (sample);
    }
    else
    {
        g_print ("could not get sample\n");
    }

    return GST_FLOW_OK;
}

int main_1(void) {
    GMainLoop *main_loop;
    main_loop = g_main_loop_new (NULL, FALSE);
    GstPipeline *gst_pipeline = nullptr;
    string launch_string;
    guint frameCount = 0;
    ostringstream launch_stream;
    GstAppSinkCallbacks callbacks = {appsink_eos, NULL, new_buffer};

    launch_stream
    << "rtspsrc location=rtsp://10.19.107.227:8554/test ! "
    << "rtph264depay ! h264parse ! nvv4l2decoder enable-max-performance=1 ! "
    << "nvvidconv ! video/x-raw,width=1920,height=1080,format=RGBA ! "
    << "videoconvert ! video/x-raw,format=BGR ! "
    << "appsink name=mysink ";

    launch_string = launch_stream.str();

    frameCount = 0;
    GError *error = nullptr;
    gst_pipeline  = (GstPipeline*) gst_parse_launch(launch_string.c_str(), &error);

    if (gst_pipeline == nullptr) {
        g_print( "Failed to parse launch: %s\n", error->message);
        return -1;
    }
    if(error) g_error_free(error);

    GstElement *appsink_ = gst_bin_get_by_name(GST_BIN(gst_pipeline), "mysink");
    gst_app_sink_set_callbacks (GST_APP_SINK(appsink_), &callbacks, (gpointer)&frameCount, NULL);
    gst_object_unref(appsink_); // gst_bin_get_by_name() returns a new reference

    gst_element_set_state((GstElement*)gst_pipeline, GST_STATE_PLAYING); 
    sleep(3);

    //send EOS
    GstMessage *msg;
    gst_element_send_event ((GstElement*)gst_pipeline, gst_event_new_eos ());
    // Wait for EOS message
    GstBus *bus = gst_pipeline_get_bus(GST_PIPELINE(gst_pipeline));
    msg = gst_bus_poll(bus, GST_MESSAGE_EOS, GST_CLOCK_TIME_NONE);
    gst_message_unref(msg);
    gst_object_unref(bus);
    
    gst_element_set_state((GstElement*)gst_pipeline, GST_STATE_NULL);
    gst_object_unref(GST_OBJECT(gst_pipeline));
    g_main_loop_unref(main_loop);

    return 0;
}

void *thread_function(void *arg)
{
  for(guint i=0; i<5566; i++) {
    g_print("loop = %d \n", i);
    main_1();
    sleep(1);
  }    
  return nullptr;
}

int main(int argc, char** argv)
{
  gst_init (&argc, &argv);

  vector<pthread_t> pthreadIdArr;
  for (int k = 0; k < 8; k++) {
    pthread_t pthreadHandle;
    pthread_create(&pthreadHandle, nullptr, thread_function, nullptr);
    pthreadIdArr.push_back(pthreadHandle);
  }

  // wait all done
  for (size_t i = 0; i < pthreadIdArr.size(); i++) {
    pthread_join(pthreadIdArr[i], nullptr);
  }
  return 0;
}

So it is probably something wrong in the OpenCV framework. Not sure if this works, but you may try not using cv2.VideoCapture() and instead map the GstBuffer to a cv::Mat like this:

            gst_buffer_map (buffer, &map, GST_MAP_READ);
            // imgbuf wraps map.data without copying; it is only valid
            // until gst_buffer_unmap() is called
            cv::Mat imgbuf = cv::Mat(height,
                                     width,
                                     CV_8UC3, map.data);

Hi @DaneLLL ,

The main function of my sample does not just open the RTSP stream and read frames; the RTSP stream needs to be closed and then reopened repeatedly because the rtspsrc changes.

gsttest-20230217.tar.gz (2.2 KB)

I pushed one RTSP network stream for the code in the attachment, but I still get a memory leak. The memory quickly increased by 200 MB in 2 minutes.

My Jetson version information is also in the attachment. Could you check again whether you get a memory leak when running this code on your machine? Please make sure there is a reliable RTSP stream for your test.