queueInputBatch(): cudaMemcpyAsync for output buffers failed (cudaErrorLaunchFailure)

I am trying to port my ANPR (Automatic Number Plate Recognition) application to Jetson. It consists of 3 stages (3 models):

  1. Plate detection in full frame (Modified Tiny-YoloV3 based darknet model)
  2. Character detection in plates detected in the first step (Modified Tiny-YoloV3 based single yolo layer darknet model)
  3. Classification of characters detected in the second stage.

Each of the models runs fine individually, but when I try to run them together in a single pipeline using deepstream-app I get errors:

0:00:20.568228378 10565   0x5597ca5d90 ERROR                nvinfer gstnvinfer.cpp:511:gst_nvinfer_logger:<secondary_gie_0> NvDsInferContext[UID 2]:queueInputBatch(): cudaMemcpyAsync for output buffers failed (cudaErrorLaunchFailure)
0:00:20.568442857 10565   0x5597ca5d90 WARN                 nvinfer gstnvinfer.cpp:1098:gst_nvinfer_input_queue_loop:<secondary_gie_0> error: Failed to queue input batch for inferencing
ERROR from secondary_gie_0: Failed to queue input batch for inferencing
Debug info: gstnvinfer.cpp(1098): gst_nvinfer_input_queue_loop (): /GstPipeline:pipeline/GstBin:secondary_gie_bin/GstNvInfer:secondary_gie_0
0:00:20.568754471 10565   0x5597ca5d90 ERROR                nvinfer gstnvinfer.cpp:511:gst_nvinfer_logger:<secondary_gie_0> NvDsInferContext[UID 2]:queueInputBatch(): Failed to make stream wait on event(cudaErrorLaunchFailure)
0:00:20.568869315 10565   0x5597ca5d90 WARN                 nvinfer gstnvinfer.cpp:1098:gst_nvinfer_input_queue_loop:<secondary_gie_0> error: Failed to queue input batch for inferencing
0:00:20.568754888 10565   0x5597ca5a80 ERROR                nvinfer gstnvinfer.cpp:976:get_converted_buffer:<secondary_gie_0> cudaMemset2DAsync failed with error cudaErrorLaunchFailure while converting buffer
0:00:20.568978325 10565   0x5597ca5a80 WARN                 nvinfer gstnvinfer.cpp:1536:gst_nvinfer_process_objects:<secondary_gie_0> error: Buffer conversion failed
ERROR from secondary_gie_0: Failed to queue input batch for inferencing
Debug info: gstnvinfer.cpp(1098): gst_nvinfer_input_queue_loop (): /GstPipeline:pipeline/GstBin:secondary_gie_bin/GstNvInfer:secondary_gie_0
ERROR from secondary_gie_0: Buffer conversion failed
Debug info: gstnvinfer.cpp(1536): gst_nvinfer_process_objects (): /GstPipeline:pipeline/GstBin:secondary_gie_bin/GstNvInfer:secondary_gie_0

Sometimes I get this instead:

0:00:13.787260776 15050   0x5594aa4d90 ERROR                nvinfer gstnvinfer.cpp:511:gst_nvinfer_logger:<primary_gie_classifier> NvDsInferContext[UID 1]:log(): cuda/cudaPoolingLayer.cpp (249) - Cuda Error in execute: 4 (unspecified launch failure)
0:00:13.790589265 15050   0x5594aa4a80 ERROR                nvinfer gstnvinfer.cpp:976:get_converted_buffer:<secondary_gie_0> cudaMemset2DAsync failed with error cudaErrorLaunchFailure while converting buffer
0:00:13.790663640 15050   0x5594aa4a80 WARN                 nvinfer gstnvinfer.cpp:1536:gst_nvinfer_process_objects:<secondary_gie_0> error: Buffer conversion failed
ERROR from secondary_gie_0: Buffer conversion failed
Debug info: gstnvinfer.cpp(1536): gst_nvinfer_process_objects (): /GstPipeline:pipeline/GstBin:secondary_gie_bin/GstNvInfer:secondary_gie_0
Quitting
0:00:13.814560984 15050   0x5594aa4d90 ERROR                nvinfer gstnvinfer.cpp:511:gst_nvinfer_logger:<primary_gie_classifier> NvDsInferContext[UID 1]:log(): cuda/cudaPoolingLayer.cpp (249) - Cuda Error in execute: 4 (unspecified launch failure)
0:00:13.814720463 15050   0x5594aa4d90 ERROR                nvinfer gstnvinfer.cpp:511:gst_nvinfer_logger:<secondary_gie_0> NvDsInferContext[UID 2]:queueInputBatch(): Failed to enqueue inference batch
0:00:13.814777911 15050   0x5594aa4d90 WARN                 nvinfer gstnvinfer.cpp:1098:gst_nvinfer_input_queue_loop:<secondary_gie_0> error: Failed to queue input batch for inferencing
ERROR from secondary_gie_0: Failed to queue input batch for inferencing
Debug info: gstnvinfer.cpp(1098): gst_nvinfer_input_queue_loop (): /GstPipeline:pipeline/GstBin:secondary_gie_bin/GstNvInfer:secondary_gie_0

Very rarely I get this too:

unspecified launch failure in file yoloPlugins.cpp at line 107
Line 107 in yoloPlugins.cpp contains:

int YoloLayerV3::enqueue(
    int batchSize, const void* const* inputs, void** outputs, void* workspace,
    cudaStream_t stream)
{
    CHECK(cudaYoloLayerV3(
              inputs[0], outputs[0], batchSize, m_GridSizeX, m_GridSizeY, m_NumClasses, m_NumBoxes,
              m_OutputSize, stream));  //---> line 107
    return 0;
}

The deepstream-app config file looks like:

# Copyright (c) 2019 NVIDIA Corporation.  All rights reserved.
#
# NVIDIA Corporation and its licensors retain all intellectual property
# and proprietary rights in and to this software, related documentation
# and any modifications thereto.  Any use, reproduction, disclosure or
# distribution of this software and related documentation without an express
# license agreement from NVIDIA Corporation is strictly prohibited.

[application]
enable-perf-measurement=1
perf-measurement-interval-sec=5
#gie-kitti-output-dir=streamscl

[tiled-display]
enable=1
rows=1
columns=1
width=1280
height=720
gpu-id=0
#(0): nvbuf-mem-default - Default memory allocated, specific to particular platform
#(1): nvbuf-mem-cuda-pinned - Allocate Pinned/Host cuda memory, applicable for Tesla
#(2): nvbuf-mem-cuda-device - Allocate Device cuda memory, applicable for Tesla
#(3): nvbuf-mem-cuda-unified - Allocate Unified cuda memory, applicable for Tesla
#(4): nvbuf-mem-surface-array - Allocate Surface Array memory, applicable for Jetson
nvbuf-memory-type=0

[source0]
enable=1
#Type - 1=CameraV4L2 2=URI 3=MultiURI 4=RTSP
type=3
uri=file:///home/user/Desktop/numberplate_1.mp4
num-sources=1
#drop-frame-interval=2
gpu-id=0
# (0): memtype_device   - Memory type Device
# (1): memtype_pinned   - Memory type Host Pinned
# (2): memtype_unified  - Memory type Unified
cudadec-memtype=0

[sink0]
enable=1
#Type - 1=FakeSink 2=EglSink 3=File
type=2 #5
sync=1
source-id=0
gpu-id=0
qos=0
nvbuf-memory-type=0
overlay-id=1

[osd]
enable=1
gpu-id=0
border-width=1
text-size=15
text-color=1;1;1;1;
text-bg-color=0.3;0.3;0.3;1
font=Serif
show-clock=0
clock-x-offset=800
clock-y-offset=820
clock-text-size=12
clock-color=1;0;0;0
nvbuf-memory-type=0

[streammux]
gpu-id=0
##Boolean property to inform muxer that sources are live
live-source=0
batch-size=1
##time out in usec, to wait after the first buffer is available
##to push the batch even if the complete batch is not formed
batched-push-timeout=40000
## Set muxer output width and height
width=1920
height=1080
##Enable to maintain aspect ratio wrt source, and allow black borders, works
##along with width, height properties
enable-padding=0
nvbuf-memory-type=0

# config-file property is mandatory for any gie section.
# Other properties are optional and if set will override the properties set in
# the infer config file.
[primary-gie]
enable=1
gpu-id=0
#Required by the app for OSD, not a plugin property
bbox-border-color0=1;0;0;1
bbox-border-color1=0;1;1;1
bbox-border-color2=0;0;1;1
bbox-border-color3=0;1;0;1
interval=0
gie-unique-id=1
nvbuf-memory-type=0
config-file=anpr_plate_det_gie_config.txt

[tracker]
enable=1
tracker-width=480
tracker-height=272
#ll-lib-file=/opt/nvidia/deepstream/deepstream-4.0/lib/libnvds_mot_iou.so
ll-lib-file=/opt/nvidia/deepstream/deepstream-4.0/lib/libnvds_mot_klt.so
#ll-config-file required for IOU only
#ll-config-file=iou_config.txt
gpu-id=0

[secondary-gie0]
enable=1
gpu-id=0
gie-unique-id=2
operate-on-gie-id=1
operate-on-class-ids=0
config-file=anpr_char_det_gie_config.txt


[secondary-gie1]
enable=1
gpu-id=0
gie-unique-id=3
operate-on-gie-id=2
operate-on-class-ids=0
config-file=anpr_char_rec_gie_config.txt

[tests]
file-loop=0

I am stuck as to what the error is.

Platform details:

  • NVIDIA Jetson NANO/TX1
    • Jetpack 4.2.1 [L4T 32.2.0]
    • CUDA GPU architecture 5.3
  • Libraries:
    • CUDA 10.0.326
    • cuDNN 7.5.0.56-1+cuda10.0
    • TensorRT 5.1.6.1-1+cuda10.0
    • Visionworks 1.6.0.500n
    • OpenCV 4.1.1 compiled CUDA: YES
  • Jetson Performance: active

deepstream-app version 4.0.1
DeepStreamSDK 4.0.1

Plate Detector config (anpr_plate_det_gie_config.txt):

[property]
gpu-id=0
net-scale-factor=1    
model-color-format=0
custom-network-config=plate_det.cfg
model-file=plate_det.weights
model-engine-file=model.engine
labelfile-path=plate_det.names
batch-size=1
#0=FP32, 1=INT8, 2=FP16 mode
network-mode=2
process-mode=1
network-type=0
num-detected-classes=1
gie-unique-id=1
maintain-aspect-ratio=1
interval=0
parse-bbox-func-name=NvDsInferParseCustomYoloV3Tiny
custom-lib-path=/opt/nvidia/deepstream/deepstream-4.0/sources/objectDetector_Yolo/nvdsinfer_custom_impl_Yolo_plate_det/libnvdsinfer_custom_impl_Yolo.so

[class-attrs-all]
threshold=0.4

Character Detector config (anpr_char_det_gie_config.txt):

[property]
gpu-id=0
net-scale-factor=1
#0=RGB, 1=BGR, 2=GRAY
model-color-format=0
custom-network-config=char_det.cfg
model-file=char_det.weights
model-engine-file=model_b16_fp32.engine
labelfile-path=char_det.names
batch-size=16
#0=FP32, 1=INT8, 2=FP16 mode
network-mode=0
process-mode=2
network-type=0
num-detected-classes=1
gie-unique-id=2
operate-on-gie-id=1
operate-on-class-ids=0
maintain-aspect-ratio=1
parse-bbox-func-name=NvDsInferParseCustomYoloV3Tiny
custom-lib-path=/opt/nvidia/deepstream/deepstream-4.0/sources/objectDetector_Yolo/nvdsinfer_custom_impl_Yolo_char_det/libnvdsinfer_custom_impl_Yolo.so

[class-attrs-all]
threshold=0.4

Character recognition config (anpr_char_rec_gie_config.txt):

[property]
gpu-id=0
net-scale-factor=1
uff-file=char_rec_model.uff
model-engine-file=char_rec_model.uff_b64_fp32.engine
labelfile-path=char_rec_labels.txt
batch-size=64
# 0=FP32 and 1=INT8 mode
network-mode=0
process-mode=2
network-type=1
#0=RGB, 1=BGR, 2=GRAY
model-color-format=2
gpu-id=0
gie-unique-id=3
operate-on-gie-id=2
operate-on-class-ids=0
is-classifier=1
uff-input-dims=1;32;32;0
uff-input-blob-name=conv2d_input
output-blob-names=dense_1/Softmax
classifier-async-mode=0
classifier-threshold=0.50

If more info is required, please ask.

Check the second detector:
“process-mode” should be 2 (secondary GIE).
“network-type” should be 0 (detection).

https://github.com/NVIDIA-AI-IOT/deepstream_reference_apps/blob/master/back-to-back-detectors/secondary_detector_config.txt

I have already used these parameters. I have added the configs too; please have a look.

Hi Pawany16,
Since the three models run well individually and the CUDA error is cudaErrorLaunchFailure, I seriously suspect the workload is too intensive for the Nano GPU when all these models run in parallel.

Maybe you could reduce the batch_size of the Character Detector and Character recognition models and check whether the issue is mitigated. If it is, you will have to upgrade to a more powerful platform, e.g. NX, to run all these models.

Thanks!

OK, I’ll try reducing batch_size.

I tried reducing the batch_size but it still gives me the same errors. Currently I am unable to procure a more powerful platform.

What batch size did you try?
I do think this is caused by insufficient GPU resources. Sorry! I don’t think there is another solution for this failure.

I tried with batch_size = 1.
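
For reference, a batch size of 1 in the two secondary configs posted above would look roughly like the fragment below. This is only a sketch, not the exact files that were tested: batch-size is the only value changed, and the model-engine-file lines are commented out on the assumption that the previously serialized engines (named for batch 16 and batch 64) should be rebuilt rather than reused.

```ini
# anpr_char_det_gie_config.txt (fragment; other keys unchanged)
[property]
batch-size=1
# Assumption: let nvinfer build a fresh engine instead of loading the b16 one
#model-engine-file=model_b16_fp32.engine

# anpr_char_rec_gie_config.txt (fragment; other keys unchanged)
[property]
batch-size=1
# Assumption: let nvinfer build a fresh engine instead of loading the b64 one
#model-engine-file=char_rec_model.uff_b64_fp32.engine
```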

Hi @pawany16,
Sorry for the delay!
You need to modify the back-to-back detector app rather than deepstream-app, since deepstream-app does not support back-to-back detectors: your pipeline looks like [pgie1->pgie2->sgie], which can’t be constructed in deepstream-app.

back-to-back sample: deepstream_reference_apps/back-to-back-detectors (NVIDIA-AI-IOT on GitHub)

Currently you are using an sgie config for the second detector, which is incorrect.

Thanks!

Hi mchi,
I modified the back-to-back detector app and tried it, but I am still getting the same errors.

Errors:

0:00:23.606307647 13508   0x55a4aac4f0 ERROR                nvinfer gstnvinfer.cpp:511:gst_nvinfer_logger:<char-detector> NvDsInferContext[UID 2]:queueInputBatch(): Failed to record cuda event (cudaErrorLaunchFailure)
0:00:23.606375772 13508   0x55a4aac4f0 WARN                 nvinfer gstnvinfer.cpp:1098:gst_nvinfer_input_queue_loop:<char-detector> error: Failed to queue input batch for inferencing
0:00:23.606581084 13508   0x55a4aac4f0 ERROR                nvinfer gstnvinfer.cpp:511:gst_nvinfer_logger:<char-detector> NvDsInferContext[UID 2]:queueInputBatch(): Failed to make stream wait on event(cudaErrorLaunchFailure)
0:00:23.606639470 13508   0x55a4aac4f0 WARN                 nvinfer gstnvinfer.cpp:1098:gst_nvinfer_input_queue_loop:<char-detector> error: Failed to queue input batch for inferencing
0:00:23.606750147 13508   0x55a4aac4f0 ERROR                nvinfer gstnvinfer.cpp:511:gst_nvinfer_logger:<char-detector> NvDsInferContext[UID 2]:queueInputBatch(): Failed to make stream wait on event(cudaErrorLaunchFailure)
0:00:23.606802230 13508   0x55a4aac4f0 WARN                 nvinfer gstnvinfer.cpp:1098:gst_nvinfer_input_queue_loop:<char-detector> error: Failed to queue input batch for inferencing
ERROR from element char-detector: Failed to queue input batch for inferencing
Error details: gstnvinfer.cpp(1098): gst_nvinfer_input_queue_loop (): /GstPipeline:anprtest1-pipeline/GstNvInfer:char-detector
Returned, stopping playback
0:00:23.607497907 13508   0x55a4aac4f0 ERROR                nvinfer gstnvinfer.cpp:511:gst_nvinfer_logger:<char-detector> NvDsInferContext[UID 2]:queueInputBatch(): Failed to make stream wait on event(cudaErrorLaunchFailure)
0:00:23.607533011 13508   0x55a4aac4f0 WARN                 nvinfer gstnvinfer.cpp:1098:gst_nvinfer_input_queue_loop:<char-detector> error: Failed to queue input batch for inferencing

Here is my code:

/*
 * Copyright (c) 2018-2019, NVIDIA CORPORATION. All rights reserved.
 *
 * Permission is hereby granted, free of charge, to any person obtaining a
 * copy of this software and associated documentation files (the "Software"),
 * to deal in the Software without restriction, including without limitation
 * the rights to use, copy, modify, merge, publish, distribute, sublicense,
 * and/or sell copies of the Software, and to permit persons to whom the
 * Software is furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in
 * all copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
 * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
 * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
 * DEALINGS IN THE SOFTWARE.
 */

#include <gst/gst.h>
#include <glib.h>

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include "gstnvdsmeta.h"
 
#define PLATE_DET_GIE_CONFIG_FILE  "plate_det_gie_config.txt"
#define CHAR_DET_GIE_CONFIG_FILE "char_det_gie_config.txt"
#define CHAR_REC_GIE_CONFIG_FILE "char_rec_gie_config.txt"
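#define MAX_DISPLAY_LEN 64
/* GPU used by the decoder on non-Tegra builds (referenced in decodebin_child_added) */
#define GPU_ID 0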

/* The muxer output resolution must be set if the input streams will be of
 * different resolution. The muxer will scale all the input frames to this
 * resolution. */
#define MUXER_OUTPUT_WIDTH 1920
#define MUXER_OUTPUT_HEIGHT 1080

/* Muxer batch formation timeout in microseconds. Should ideally be set
 * based on the fastest source's framerate, e.g. 40000 (40 millisec) for 25 fps.
 * Note: the value below is 4000000 usec, i.e. 4 seconds. */
#define MUXER_BATCH_TIMEOUT_USEC 4000000

gint frame_number = 0;

guint plate_det_gie_unique_id = 1;
guint char_det_gie_unique_id = 2;
guint char_rec_gie_unique_id = 3;


GMainLoop *loop = NULL;
GstElement *pipeline = NULL, *source_bin = NULL, *streammux = NULL, *sink = NULL, 
  *plate_det_gie = NULL, *nvvidconv = NULL, *nvosd = NULL, *char_det_gie = NULL, *char_rec_gie = NULL;

/* This is the buffer probe function that we have registered on the sink pad
 * of the OSD element. All the infer elements in the pipeline shall attach
 * their metadata to the GstBuffer, here we will iterate & process the metadata
 * for ex: class ids to strings, counting of class_id objects etc. */
static GstPadProbeReturn
osd_sink_pad_buffer_probe (GstPad * pad, GstPadProbeInfo * info,
    gpointer u_data)
{
    GstBuffer *buf = (GstBuffer *) info->data;
    guint plate_count = 0;    

    NvDsBatchMeta *batch_meta = gst_buffer_get_nvds_batch_meta (buf);

    for (NvDsMetaList* l_frame = batch_meta->frame_meta_list; l_frame != NULL; l_frame = l_frame->next) {
        NvDsFrameMeta *frame_meta = (NvDsFrameMeta *) (l_frame->data);
        int offset = 0;
        for (NvDsMetaList* l_obj = frame_meta->obj_meta_list; l_obj != NULL; l_obj = l_obj->next) {
            NvDsObjectMeta* obj_meta = (NvDsObjectMeta *) (l_obj->data);
            /* Count only objects produced by the plate detector */
            if (obj_meta->unique_component_id == (gint) plate_det_gie_unique_id) {
                plate_count++;
            }            
        }
    }       
    
    /* plate_count is a guint, so print it with %u */
    g_print ("Frame Number = %d Number of plates = %u \n",
            frame_number, plate_count);
    frame_number++;
    return GST_PAD_PROBE_OK;
}

static gboolean
bus_call (GstBus * bus, GstMessage * msg, gpointer data)
{
  GMainLoop *loop = (GMainLoop *) data;
  switch (GST_MESSAGE_TYPE (msg)) {
    case GST_MESSAGE_EOS:
      g_print ("End of stream\n");
      g_main_loop_quit (loop);
      break;
    case GST_MESSAGE_ERROR:{
      gchar *debug;
      GError *error;
      gst_message_parse_error (msg, &error, &debug);
      g_printerr ("ERROR from element %s: %s\n",
          GST_OBJECT_NAME (msg->src), error->message);
      if (debug)
        g_printerr ("Error details: %s\n", debug);
      g_free (debug);
      g_error_free (error);
      g_main_loop_quit (loop);
      break;
    }
    default:
      break;
  }
  return TRUE;
}

/* Tracker config parsing */

#define CHECK_ERROR(error) \
    if (error) { \
        g_printerr ("Error while parsing config file: %s\n", error->message); \
        goto done; \
    }

static void
decodebin_child_added (GstChildProxy * child_proxy, GObject * object,
    gchar * name, gpointer user_data)
{
  g_print ("decodebin child added %s\n", name);
  if (g_strrstr (name, "decodebin") == name) {
    g_signal_connect (G_OBJECT (object), "child-added",
        G_CALLBACK (decodebin_child_added), user_data);
  }
  if (g_strrstr (name, "nvv4l2decoder") == name) {
#ifdef PLATFORM_TEGRA
    g_object_set (object, "enable-max-performance", TRUE, NULL);
    g_object_set (object, "bufapi-version", TRUE, NULL);
    g_object_set (object, "drop-frame-interval", 0, NULL);
    g_object_set (object, "num-extra-surfaces", 0, NULL);
#else
    g_object_set (object, "gpu-id", GPU_ID, NULL);
#endif
  }
}

static void
cb_newpad (GstElement * decodebin, GstPad * pad, gpointer data)
{
  GstCaps *caps = gst_pad_query_caps (pad, NULL);
  const GstStructure *str = gst_caps_get_structure (caps, 0);
  const gchar *name = gst_structure_get_name (str);

  g_print ("decodebin new pad %s\n", name);
  if (!strncmp (name, "video", 5)) {
    gchar pad_name[16] = { 0 };
    GstPad *sinkpad = NULL;
    g_snprintf (pad_name, 15, "sink_%u", 0);
    sinkpad = gst_element_get_request_pad (streammux, pad_name);
    if (gst_pad_link (pad, sinkpad) != GST_PAD_LINK_OK) {
      g_print ("Failed to link decodebin to pipeline\n");
    } else {
      g_print ("Decodebin linked to pipeline\n");
    }
    gst_object_unref (sinkpad);
  }
  /* Release the caps reference returned by gst_pad_query_caps */
  gst_caps_unref (caps);
}

static GstElement* create_uridecode_bin(gchar *filename) {
  GstElement *bin;
  g_print("creating uridecodebin for [%s]\n", filename);
  bin = gst_element_factory_make("uridecodebin", "decode_bin");
  g_object_set (G_OBJECT (bin), "uri", filename, NULL);
  g_signal_connect(G_OBJECT(bin), "pad-added", G_CALLBACK(cb_newpad), NULL);
  g_signal_connect(G_OBJECT(bin), "child-added", G_CALLBACK(decodebin_child_added), NULL);
  g_print("created uridecodebin\n");
  return bin;
}

int main (int argc, char *argv[])
{
#ifdef PLATFORM_TEGRA
  GstElement *transform = NULL;
  g_print ("Platform TEGRA \n");
#endif
  GstBus *bus = NULL;
  guint bus_watch_id = 0;
  GstPad *osd_sink_pad = NULL;

  /* Check input arguments */
  if (argc != 2) {
    g_printerr ("Usage: %s <filename>\n", argv[0]);
    return -1;
  }

  /* Standard GStreamer initialization */
  gst_init (&argc, &argv);
  loop = g_main_loop_new (NULL, FALSE);

  /* Create gstreamer elements */

  /* Create Pipeline element that will be a container of other elements */
  pipeline = gst_pipeline_new ("anprtest1-pipeline");

  /* Source element for reading from the file */
  source_bin = create_uridecode_bin(argv[1]);

  /* Create nvstreammux instance to form batches from one or more sources. */
  streammux = gst_element_factory_make ("nvstreammux", "stream-muxer");

  if (!pipeline || !streammux) {
    g_printerr ("One element could not be created. Exiting.\n");
    return -1;
  }

  /* Use nvinfer to run inferencing on decoder's output,
   * behaviour of inferencing is set through config file */
  plate_det_gie = gst_element_factory_make ("nvinfer", "plate-detector");
  g_print("created plate detector\n");

  char_det_gie = gst_element_factory_make ("nvinfer", "char-detector");
  g_print("created char detector\n");

  char_rec_gie = gst_element_factory_make ("nvinfer", "char-recognition");
  g_print("created char recognition\n");

  /* Use convertor to convert from NV12 to RGBA as required by nvosd */
  nvvidconv = gst_element_factory_make ("nvvideoconvert", "nvvideo-converter");

  /* Create OSD to draw on the converted RGBA buffer */
  nvosd = gst_element_factory_make ("nvdsosd", "nv-onscreendisplay");

  /* Finally render the osd output */
#ifdef PLATFORM_TEGRA
  transform = gst_element_factory_make ("nvegltransform", "nvegl-transform");
#endif
  sink = gst_element_factory_make ("nveglglessink", "nvvideo-renderer");

  if (!source_bin || !plate_det_gie || !char_det_gie || !char_rec_gie || !nvvidconv || !nvosd || !sink) {
    g_printerr ("One element could not be created. Exiting.\n");
    return -1;
  }

#ifdef PLATFORM_TEGRA
  if(!transform) {
    g_printerr ("One tegra element could not be created. Exiting.\n");
    return -1;
  }
#endif

  gst_bin_add (GST_BIN (pipeline), source_bin);

  g_object_set (G_OBJECT (streammux), "width", MUXER_OUTPUT_WIDTH, "height",
      MUXER_OUTPUT_HEIGHT, "batch-size", 1,
      "batched-push-timeout", MUXER_BATCH_TIMEOUT_USEC, NULL);

  /* Set all the necessary properties of the nvinfer element,
   * the necessary ones are : */
  g_object_set (G_OBJECT (plate_det_gie), "config-file-path", PLATE_DET_GIE_CONFIG_FILE, NULL);
  g_object_set (G_OBJECT (char_det_gie), "config-file-path", CHAR_DET_GIE_CONFIG_FILE, NULL);
  g_object_set (G_OBJECT (char_rec_gie), "config-file-path", CHAR_REC_GIE_CONFIG_FILE, NULL);

  /* we add a message handler */
  bus = gst_pipeline_get_bus (GST_PIPELINE (pipeline));
  bus_watch_id = gst_bus_add_watch (bus, bus_call, loop);
  gst_object_unref (bus);

  /* Set up the pipeline */
  /* we add all elements into the pipeline */  
  gst_bin_add_many (GST_BIN (pipeline), streammux, plate_det_gie, char_det_gie, char_rec_gie,
      nvvidconv, nvosd, sink, NULL);
#ifdef PLATFORM_TEGRA
  gst_bin_add(GST_BIN (pipeline), transform);
#endif

#ifdef PLATFORM_TEGRA
  if (!gst_element_link_many (streammux, plate_det_gie, char_det_gie, char_rec_gie, 
      nvvidconv, nvosd, transform, sink, NULL)) {
    g_printerr ("Elements could not be linked. Exiting.\n");
    return -1;
  }
#else
  if (!gst_element_link_many (streammux, plate_det_gie, char_det_gie,
      char_rec_gie, nvvidconv, nvosd, sink, NULL)) {
    g_printerr ("Elements could not be linked. Exiting.\n");
    return -1;
  }
  // if (!gst_element_link_many (streammux, plate_det_gie, nvvidconv, nvosd, sink, NULL)) {
  //   g_printerr ("Elements could not be linked. Exiting.\n");
  //   return -1;
  // }  
  //g_print("=====> Else condition run, link method not used here \n");
#endif
  g_print("linked elements\n");

  /* Lets add probe to get informed of the meta data generated, we add probe to
   * the sink pad of the osd element, since by that time, the buffer would have
   * had got all the metadata. */
  osd_sink_pad = gst_element_get_static_pad (nvosd, "sink");
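  if (!osd_sink_pad) {
    g_print ("Unable to get sink pad\n");
  } else {
    gst_pad_add_probe (osd_sink_pad, GST_PAD_PROBE_TYPE_BUFFER,
        osd_sink_pad_buffer_probe, NULL, NULL);
    g_print ("Added probe to osd sink pad\n");
    /* Drop the reference returned by gst_element_get_static_pad */
    gst_object_unref (osd_sink_pad);
  }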

  //g_object_set (G_OBJECT (sink), "sync", FALSE, "qos", FALSE, NULL);

  //gst_element_set_state (pipeline, GST_STATE_PAUSED);

  /* Set the pipeline to "playing" state */
  g_print ("Now playing: %s\n", argv[1]);
  gst_element_set_state (pipeline, GST_STATE_PLAYING);

  /* Iterate */
  g_print ("Running...\n");
  g_main_loop_run (loop);

  /* Out of the main loop, clean up nicely */
  g_print ("Returned, stopping playback\n");
  gst_element_set_state (pipeline, GST_STATE_NULL);
  g_print ("Deleting pipeline\n");
  gst_object_unref (GST_OBJECT (pipeline));
  g_source_remove (bus_watch_id);
  g_main_loop_unref (loop);
  return 0;
}

If you need them, I can share the models with you.
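
A side note on the listing above: the comment on MUXER_BATCH_TIMEOUT_USEC mentions 40 millisec, while the defined value, 4000000 usec, is 4 seconds. Below is a minimal sketch of deriving the timeout from a source frame rate, assuming that one frame interval is the intended wait. The names frame_interval_usec and print_timeout are hypothetical helpers for illustration, not DeepStream or GStreamer APIs.

```c
#include <stdio.h>

/* Hypothetical helper (not part of DeepStream or GStreamer): duration of one
 * frame at the given frame rate, in microseconds. Integer division truncates.
 * Returns 0 for fps == 0 to avoid dividing by zero. */
unsigned int
frame_interval_usec (unsigned int fps)
{
  if (fps == 0)
    return 0;
  return 1000000u / fps;
}

/* Print the candidate batched-push-timeout for a given frame rate. */
void
print_timeout (unsigned int fps)
{
  printf ("%u fps -> batched-push-timeout of %u usec\n",
      fps, frame_interval_usec (fps));
}
```

For a 25 fps source this gives 40000, the same value as batched-push-timeout=40000 in the deepstream-app config earlier in the thread. It is a latency setting and is not presented here as a fix for the cudaErrorLaunchFailure.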

Hi @pawany16
Is it possible to share a complete repo with us so that we can give it a quick try on different Jetson platforms?

Thanks!

Hi mchi,

I can’t share the repo publicly, but I am sending you the link by direct message.

Hi mchi,

It’s been quite a long time and I haven’t heard from you. Were you able to run the repo that I shared with you?

Hi @pawany16,
Really sorry for the long delay!
My colleague did try your application; I thought she had replied here. We originally wanted to try your application on a more powerful platform (TX2 or Xavier), but she said there were some compilation failures.
Also, we found that DeepStreamSDK 4.0.1 is for JetPack 4.2.2, and DeepStreamSDK 4.0 is for JetPack 4.2.1.
Is it possible for you to upgrade to DeepStream 4.0.2 with JetPack 4.3 and try again?

Sorry && Thanks!

Hi pawany16,

Have you tried our suggestion to upgrade to DeepStream 4.0.2 with JetPack 4.3? Are there any results you can share?

Hi kayccc,

Due to the COVID situation I didn’t have access to the device, but I have it now. I’ll try it today and let you know the results, probably tomorrow.

Hi

I tried on DeepStream 4.0.2 with JetPack 4.3 and I am still getting the errors.

Platform details:

  • NVIDIA Jetson Nano (Developer Kit Version)
    • Jetpack 4.3 [L4T 32.3.1]
    • NV Power Mode: MAXN - Type: 0
    • jetson_clocks service: active
  • Libraries:
    • CUDA: 10.0.326
    • cuDNN: 7.6.3.28
    • TensorRT: 6.0.1.10
    • Visionworks: 1.6.0.500n
    • OpenCV: 4.1.1 compiled CUDA: NO
    • VPI: 0.1.0
    • Vulkan: 1.1.70

deepstream-app version 4.0.2
DeepStreamSDK 4.0.2

Errors:

0:00:27.466234333 12200   0x55cd22b800 ERROR                nvinfer gstnvinfer.cpp:511:gst_nvinfer_logger:<primary_gie_classifier> NvDsInferContext[UID 1]:queueInputBatch(): cudaMemcpyAsync for output buffers failed (cudaErrorLaunchFailure)
0:00:27.466292222 12200   0x55cd22b800 WARN                 nvinfer gstnvinfer.cpp:1098:gst_nvinfer_input_queue_loop:<primary_gie_classifier> error: Failed to queue input batch for inferencing
0:00:27.466519770 12200   0x55cd22b800 ERROR                nvinfer gstnvinfer.cpp:511:gst_nvinfer_logger:<primary_gie_classifier> NvDsInferContext[UID 1]:queueInputBatch(): Failed to make stream wait on event(cudaErrorLaunchFailure)
0:00:27.466573078 12200   0x55cd22b800 WARN                 nvinfer gstnvinfer.cpp:1098:gst_nvinfer_input_queue_loop:<primary_gie_classifier> error: Failed to queue input batch for inferencing
0:00:27.466651114 12200   0x55cd22b800 ERROR                nvinfer gstnvinfer.cpp:511:gst_nvinfer_logger:<primary_gie_classifier> NvDsInferContext[UID 1]:queueInputBatch(): Failed to make stream wait on event(cudaErrorLaunchFailure)
0:00:27.466696197 12200   0x55cd22b800 WARN                 nvinfer gstnvinfer.cpp:1098:gst_nvinfer_input_queue_loop:<primary_gie_classifier> error: Failed to queue input batch for inferencing
ERROR from primary_gie_classifier: Failed to queue input batch for inferencing

Hi @pawany16,
Three suggestions:

  1. Can you try this on a more powerful platform, e.g. Xavier or a dGPU + x86 system?
  2. Can you set all three models to batch == 1 and check whether the issue is still reproducible?
  3. Run cuda-memcheck (e.g. cuda-memcheck deepstream-app -c <your config file>) to check the CUDA memory accesses.

Thanks!