Nvidia-desktop kernel: [407343.357549] (NULL device *): nvhost_channelctl: invalid cmd 0x80685600

Hi,
We have tried some cases and checked RAM usage with sudo tegrastats. With 4 decoding threads we do not observe any memory increase:

int main()
{
    std::vector<pthread_t> pthreadIdArr;
-    for (int k = 0; k < 16; k++) {
+    for (int k = 0; k < 4; k++) {

With 8 decoding threads we can see the increase. It also happens when using a software decoder, for example:

 std::string pipeName = "rtspsrc location=rtsp://10.19.107.227:8554/test ! rtph264depay ! h264parse ! avdec_h264 ! videoconvert ! appsink ";

From the tests, it looks like memory increases once the number of decoding threads exceeds a certain value. This is not specific to nvv4l2decoder. We are not sure what the threshold is, but in our tests 4 threads are fine and pass an overnight test.
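tegrastats reports system-wide RAM, so it can help to log the resident set size of the test process itself when comparing thread counts. Below is a minimal sketch, assuming a Linux target where /proc/self/status is available. The helper names parseVmRssKb and currentRssKb are hypothetical and not part of any NVIDIA or GStreamer API.

```cpp
#include <fstream>
#include <istream>
#include <sstream>
#include <string>

// Parse the "VmRSS:" line of a /proc/<pid>/status stream and return the
// resident set size in kB, or -1 if the line is missing or malformed.
long parseVmRssKb(std::istream &status)
{
    std::string line;
    while (std::getline(status, line)) {
        if (line.compare(0, 6, "VmRSS:") == 0) {
            std::istringstream fields(line.substr(6));
            long kb = -1;
            fields >> kb;  // skips leading whitespace, reads the number
            return fields.fail() ? -1 : kb;
        }
    }
    return -1;
}

// Resident set size of the calling process in kB (Linux only), or -1.
long currentRssKb()
{
    std::ifstream status("/proc/self/status");
    return status ? parseVmRssKb(status) : -1;
}
```

Printing currentRssKb() once per open/close cycle of the stream should show whether the growth is steady or levels off after a while.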

Hi @DaneLLL ,

I tested the RTSP stream with gst-launch and got the following debug output. It printed "/GstPipeline:pipeline0/GstXImageSink:ximagesink0: A lot of buffers are being dropped". Could you check whether some GStreamer plugins are leaving buffers unreleased?

0:00:00.072562774 26452 0x558a855410 INFO GST_ELEMENT_FACTORY gstelementfactory.c:361:gst_element_factory_create: creating element “rtspsrc”
0:00:00.077268423 26452 0x558a855410 INFO GST_ELEMENT_FACTORY gstelementfactory.c:361:gst_element_factory_create: creating element “rtspreal”
0:00:00.078568771 26452 0x558a855410 INFO GST_ELEMENT_FACTORY gstelementfactory.c:361:gst_element_factory_create: creating element “rtspwms”
0:00:00.080856827 26452 0x558a855410 INFO GST_ELEMENT_FACTORY gstelementfactory.c:361:gst_element_factory_create: creating element “rtph264depay”
0:00:00.082398678 26452 0x558a855410 INFO GST_ELEMENT_FACTORY gstelementfactory.c:361:gst_element_factory_create: creating element “h264parse”
0:00:00.145178700 26452 0x558a855410 INFO GST_ELEMENT_FACTORY gstelementfactory.c:361:gst_element_factory_create: creating element “nvv4l2decoder”
0:00:00.148040930 26452 0x558a855410 INFO GST_ELEMENT_FACTORY gstelementfactory.c:361:gst_element_factory_create: creating element “nvvidconv”
0:00:00.150499482 26452 0x558a855410 INFO GST_ELEMENT_FACTORY gstelementfactory.c:361:gst_element_factory_create: creating element “videoconvert”
0:00:00.153325585 26452 0x558a855410 INFO GST_ELEMENT_FACTORY gstelementfactory.c:361:gst_element_factory_create: creating element “ximagesink”
0:00:00.154023151 26452 0x558a855410 INFO GST_ELEMENT_FACTORY gstelementfactory.c:361:gst_element_factory_create: creating element “pipeline”
0:00:00.157126821 26452 0x558a855410 INFO GST_ELEMENT_FACTORY gstelementfactory.c:361:gst_element_factory_create: creating element “capsfilter”
Setting pipeline to PAUSED …
Opening in BLOCKING MODE
Pipeline is live and does not need PREROLL …
Progress: (open) Opening Stream
Progress: (connect) Connecting to rtsp://192.168.8.103/test1
Progress: (open) Retrieving server options
Progress: (open) Retrieving media info
Progress: (request) SETUP stream 0
0:00:00.390685682 26452 0x558a8f30a0 INFO GST_ELEMENT_FACTORY gstelementfactory.c:359:gst_element_factory_create: creating element “rtpbin” named “manager”
0:00:00.392078478 26452 0x558a8f30a0 INFO GST_ELEMENT_FACTORY gstelementfactory.c:361:gst_element_factory_create: creating element “rtpsession”
0:00:00.393338922 26452 0x558a8f30a0 INFO GST_ELEMENT_FACTORY gstelementfactory.c:361:gst_element_factory_create: creating element “rtpssrcdemux”
0:00:00.393740713 26452 0x558a8f30a0 INFO GST_ELEMENT_FACTORY gstelementfactory.c:361:gst_element_factory_create: creating element “rtpstorage”
0:00:00.394015720 26452 0x558a8f30a0 INFO GST_ELEMENT_FACTORY gstelementfactory.c:361:gst_element_factory_create: creating element “funnel”
0:00:00.394167591 26452 0x558a8f30a0 INFO GST_ELEMENT_FACTORY gstelementfactory.c:361:gst_element_factory_create: creating element “funnel”
Progress: (open) Opened Stream
Setting pipeline to PLAYING …
New clock: GstSystemClock
Progress: (request) Sending PLAY request
Progress: (request) Sending PLAY request
Progress: (request) Sent PLAY request
0:00:00.401354288 26452 0x558a8f30a0 INFO GST_ELEMENT_FACTORY gstelementfactory.c:361:gst_element_factory_create: creating element “rtpjitterbuffer”
0:00:00.402383405 26452 0x558a8f30a0 INFO GST_ELEMENT_FACTORY gstelementfactory.c:361:gst_element_factory_create: creating element “rtpptdemux”
NvMMLiteOpen : Block : BlockType = 261
NVMEDIA: Reading vendor.tegra.display-size : status: 6
NvMMLiteBlockCreate : Block : BlockType = 261
WARNING: from element /GstPipeline:pipeline0/GstXImageSink:ximagesink0: A lot of buffers are being dropped.
Additional debug info:
gstbasesink.c(2902): gst_base_sink_is_too_late (): /GstPipeline:pipeline0/GstXImageSink:ximagesink0:
There may be a timestamping problem, or this computer is too slow.
WARNING: from element /GstPipeline:pipeline0/GstXImageSink:ximagesink0: A lot of buffers are being dropped.
Additional debug info:
gstbasesink.c(2902): gst_base_sink_is_too_late (): /GstPipeline:pipeline0/GstXImageSink:ximagesink0:
There may be a timestamping problem, or this computer is too slow.

Hi,
Please set sync=0 on xvimagesink. For optimal rendering performance, please use nv3dsink or nveglglessink.

Hi @DaneLLL ,

I only ran the RTSP stream with gst-launch to show you the debug output. We need to keep sync=true to reduce CPU consumption.

And the CPU usage is really too high when pulling 16 network camera streams!

Hi,
Some CPU usage is expected while using OpenCV. Please check the discussion in
[Gstreamer] nvvidconv, BGR as INPUT

Since BGR is not supported by the hardware converter, the CPU has to convert the frame data to that format. With 16 cameras, the CPU usage becomes more significant.
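To get a rough feel for the scale, the amount of BGR data the CPU has to write can be estimated as follows. This is only a back-of-the-envelope sketch: the 1920x1080 resolution and 30 fps are assumptions, and the function name bgrBytesPerSecond is hypothetical.

```cpp
#include <cstdint>

// Bytes of BGR output (3 bytes per pixel) the CPU must write per second
// for the given frame size, frame rate, and number of camera streams.
std::uint64_t bgrBytesPerSecond(std::uint64_t width, std::uint64_t height,
                                std::uint64_t fps, std::uint64_t cameras)
{
    const std::uint64_t bytesPerPixel = 3;  // B, G, R
    return width * height * bytesPerPixel * fps * cameras;
}
```

Under those assumptions, bgrBytesPerSecond(1920, 1080, 30, 16) comes to 2,985,984,000 bytes, that is about 3 GB of output per second, before counting the reads of the source frames.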

Hi @DaneLLL ,

Do you have any idea what causes the memory leak with 8 decoding threads?

My app also printed the following debug information. Could you please tell us what it means and why it was printed?

Mar 14 16:37:23 nvidia-desktop /usr/lib/gdm3/gdm-x-session[7944]: (--) NVIDIA(GPU-0): AOC 2270W (DFP-0): connected
Mar 14 16:37:23 nvidia-desktop /usr/lib/gdm3/gdm-x-session[7944]: (--) NVIDIA(GPU-0): AOC 2270W (DFP-0): External TMDS

Hi,
It seems to be an issue in the GStreamer or OpenCV frameworks. The defaults are GStreamer 1.14.5 and OpenCV 4.1.1. Since the L4T release is verified with these versions, we suggest keeping the defaults and running fewer threads. However, if you must run more threads in a single process, you may consider upgrading to later versions and giving it a try.

For upgrading gstreamer, please refer to
https://docs.nvidia.com/jetson/archives/l4t-archived/l4t-3261/index.html#page/Tegra%20Linux%20Driver%20Package%20Development%20Guide/accelerated_gstreamer.html#wwpID0E06E0HA

Our default plugins are built against 1.14.5, so you will need to rebuild the plugins. The source code is in
https://developer.nvidia.com/embedded/l4t/r32_release_v6.1/sources/t186/public_sources.tbz2

Please remember to clean cache after the upgrade:

$ rm .cache/gstreamer-1.0/registry.aarch64.bin

For upgrading OpenCV, you may try this community-contributed script:
GitHub - mdegans/nano_build_opencv: Build OpenCV on Nvidia Jetson Nano

Hi @DaneLLL ,

Thanks a lot, I'll try the upgrade.

I have a question: do all the packages in [public_sources.tbz2] need to be recompiled?

Hi,
We would say it is better to rebuild all open-source plugins to avoid any possible mismatch.

I have compiled GStreamer 1.20.0, and this time I used gst_parse_launch to pull the RTSP network streams. But there is still a memory leak when the RTSP stream is reopened.

Hi,
We tried running a GStreamer C++ sample and do not observe any memory increase. The sample can run overnight and exit gracefully. Here is the code:

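#include <cstdio>      // printf
#include <cstdlib>
#include <pthread.h>   // pthread_create, pthread_join
#include <unistd.h>    // sleep
#include <gst/gst.h>
#include <gst/app/gstappsink.h>
#include <glib-unix.h>

#include <sstream>
#include <string>
#include <vector>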

using namespace std;

static void appsink_eos(GstAppSink * appsink, gpointer user_data)
{
    printf("app sink receive eos\n");
}

static GstFlowReturn new_buffer(GstAppSink *appsink, gpointer user_data)
{
    GstSample *sample = NULL;
    guint *p_frameCount = (guint *)user_data;

    g_signal_emit_by_name (appsink, "pull-sample", &sample,NULL);

    if (sample)
    {
        GstBuffer *buffer = NULL;
        GstCaps   *caps   = NULL;
        GstMapInfo map    = {0};

        caps = gst_sample_get_caps (sample);
        if (!caps)
        {
            printf("could not get snapshot format\n");
        }
        gst_caps_get_structure (caps, 0);
        buffer = gst_sample_get_buffer (sample);
        gst_buffer_map (buffer, &map, GST_MAP_READ);

        // print 60th frame to ensure the thread is alive
        (*p_frameCount)+=1;
        if ((*p_frameCount) == 60)
        {
          printf("map.size = %" G_GSIZE_FORMAT ", %p\n", map.size, (void *)appsink);
        }

        gst_buffer_unmap(buffer, &map);

        gst_sample_unref (sample);
    }
    else
    {
        g_print ("could not get sample\n");
    }

    return GST_FLOW_OK;
}

int main_1(void) {
    GMainLoop *main_loop;
    main_loop = g_main_loop_new (NULL, FALSE);
    GstPipeline *gst_pipeline = nullptr;
    string launch_string;
    guint frameCount = 0;
    ostringstream launch_stream;
    GstAppSinkCallbacks callbacks = {appsink_eos, NULL, new_buffer};

    launch_stream
    << "rtspsrc location=rtsp://10.19.107.227:8554/test ! "
    << "rtph264depay ! h264parse ! nvv4l2decoder enable-max-performance=1 ! "
    << "nvvidconv ! video/x-raw,width=1920,height=1080,format=RGBA ! "
    << "videoconvert ! video/x-raw,format=BGR ! "
    << "appsink name=mysink ";

    launch_string = launch_stream.str();

    frameCount = 0;
    GError *error = nullptr;
    gst_pipeline  = (GstPipeline*) gst_parse_launch(launch_string.c_str(), &error);

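    if (gst_pipeline == nullptr) {
        // error may be NULL, so guard the dereference and release everything before returning
        g_print("Failed to parse launch: %s\n", error ? error->message : "unknown error");
        g_clear_error(&error);
        g_main_loop_unref(main_loop);
        return -1;
    }
    g_clear_error(&error);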

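    GstElement *appsink_ = gst_bin_get_by_name(GST_BIN(gst_pipeline), "mysink");
    gst_app_sink_set_callbacks (GST_APP_SINK(appsink_), &callbacks, (gpointer)&frameCount, NULL);
    // gst_bin_get_by_name() returns a new reference; the pipeline keeps its own
    gst_object_unref(appsink_);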

    gst_element_set_state((GstElement*)gst_pipeline, GST_STATE_PLAYING); 
    sleep(3);

    //send EOS
    GstMessage *msg;
    gst_element_send_event ((GstElement*)gst_pipeline, gst_event_new_eos ());
    // Wait for EOS message
    GstBus *bus = gst_pipeline_get_bus(GST_PIPELINE(gst_pipeline));
    msg = gst_bus_poll(bus, GST_MESSAGE_EOS, GST_CLOCK_TIME_NONE);
    gst_message_unref(msg);
    gst_object_unref(bus);
    
    gst_element_set_state((GstElement*)gst_pipeline, GST_STATE_NULL);
    gst_object_unref(GST_OBJECT(gst_pipeline));
    g_main_loop_unref(main_loop);

    return 0;
}

void *thread_function(void *arg)
{
  for(guint i=0; i<5566; i++) {
    g_print("loop = %u \n", i);
    main_1();
    sleep(1);
  }    
  return nullptr;
}

int main(int argc, char** argv)
{
  gst_init (&argc, &argv);

  vector<pthread_t> pthreadIdArr;
  for (int k = 0; k < 8; k++) {
    pthread_t pthreadHandle;
    pthread_create(&pthreadHandle, nullptr, thread_function, nullptr);
    pthreadIdArr.push_back(pthreadHandle);
  }

  // wait all done
  for (size_t i = 0; i < pthreadIdArr.size(); i++) {
    pthread_join(pthreadIdArr[i], nullptr);
  }
  return 0;
}

So the problem is probably in the OpenCV framework. We are not sure whether this works, but you may try not using cv2.VideoCapture() and instead mapping the GstBuffer to a cv::Mat like this:

            gst_buffer_map (buffer, &map, GST_MAP_READ);
            cv::Mat imgbuf = cv::Mat(height,
                                     width,
                                     CV_8UC3, map.data);

Hi @DaneLLL ,

The main function of my sample does not just open the RTSP stream and read frames. The stream needs to be closed and reopened repeatedly because the RTSP source changes.

gsttest-20230217.tar.gz (2.2 KB)

I pushed one RTSP network stream to the code in the attachment, but I still get a memory leak. Memory usage quickly increased by 200 MB in 2 minutes.

My Jetson version information is also in the attachment. Could you check again whether you see the memory leak when running this code on your machine? Please make sure there is a reliable RTSP stream for your test.