Failed to enqueue buffer in fulldims mode because binding idx 0

I am facing this problem when using the FaceNet model with batch size 16. I built the engine with trtexec.

Error Message:

WARNING: Backend context bufferIdx(0) request dims:3x3x160x160 is out of range, [min: 16x3x160x160, max: 16x3x160x160]
ERROR: Failed to enqueue buffer in fulldims mode because binding idx: 0 with batchDims: 3x3x160x160 is not supported 
ERROR: Infer context enqueue buffer failed, nvinfer error:NVDSINFER_INVALID_PARAMS
0:00:04.127885935  7664     0x393ebcf0 WARN                 nvinfer gstnvinfer.cpp:1216:gst_nvinfer_input_queue_loop:<face-recogniser-inference> error: Failed to queue input batch for inferencing
Error: gst-stream-error-quark: Failed to queue input batch for inferencing (1): /dvs/git/dirty/git-master_linux/deepstream/sdk/src/gst-plugins/gst-nvinfer/gstnvinfer.cpp(1216): gst_nvinfer_input_queue_loop (): /GstPipeline:pipeline0/GstNvInfer:face-recogniser-inference

Config File:

[property]
gpu-id=0
process-mode=2

#net-scale-factor=0.00329215686274
net-scale-factor=0.0189601459307
offsets=112.86182266638355;112.86182266638355;112.86182266638355

#onnx-file=/home/jetson-nx/codes/models/facenet/v2_facenet_b16.onnx
model-engine-file=/home/jetson-nx/codes/models/facenet/v2_facenet_b16.plan
#force-implicit-batch-dim=1
batch-size=16
# 0=FP32 and 1=INT8 2=FP16 mode 
network-mode=2

gie-unique-id=3
operate-on-gie-id=2
operate-on-class-ids=0

is-classifier=1
classifier-async-mode=0

#infer-dims=3;160;160
#input-object-min-width=30
#input-object-min-height=30
model-color-format=1

output-tensor-meta=1
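
(For reference, nvinfer preprocesses each input pixel as y = net-scale-factor * (x - offset). With the values above this is roughly y = (x - 112.86) / 52.74, since 1 / 0.0189601459307 ≈ 52.74; the scale and offsets are presumably the whitening statistics used when training this FaceNet model.)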

Hardware Platform: NX
DeepStream Version: 5.0.1
JetPack Version: 4.5
TensorRT Version: 7.1.3.0

Hi,

It seems that a buffer with batch size 3 is being fed to the pipeline at runtime.

Do you launch the pipeline with deepstream-app?
If yes, could you also share the pipeline configuration with us?

For example, in source30_1080p_dec_infer-resnet_tiled_display_int8.txt:

[application]
enable-perf-measurement=1
...

[primary-gie]
...
#property
batch-size=30

Thanks.

Thank you for the reply.

Sorry, I forgot to mention that I am using Python.

print("Adding elements to Pipeline \n")
	pipeline.add(tracker)
	pipeline.add(face_detector)
	pipeline.add(face_recogniser)
	pipeline.add(tiler)
	pipeline.add(nvvidconv)
	pipeline.add(filter1)
	pipeline.add(nvvidconv1)
	pipeline.add(nvosd)
	if is_aarch64():
		pipeline.add(transform)
	pipeline.add(sink)

	print("Linking elements in the Pipeline \n")
	streammux.link(face_detector) 
	face_detector.link(tracker)
	tracker.link(nvvidconv1)
	nvvidconv1.link(filter1)
	filter1.link(tiler)
	tiler.link(face_recogniser)
	face_recogniser.link(nvvidconv)
	nvvidconv.link(nvosd)
	if is_aarch64():
		nvosd.link(transform)
		transform.link(sink)
	else:
		nvosd.link(sink)

Hi,

Please check the sample below for multi-batch usage:
https://github.com/NVIDIA-AI-IOT/deepstream_python_apps/blob/master/apps/deepstream-test3/deepstream_test_3.py#L306

Could you try to set the streammux and pgie in a similar way to see if it works?
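
For reference, the relevant pattern from that sample looks roughly like this (paraphrased from deepstream_test_3.py; `pgie` is the sample's name for its nvinfer element):

streammux.set_property('batch-size', number_sources)
pgie.set_property('config-file-path', "dstest3_pgie_config.txt")
pgie_batch_size = pgie.get_property("batch-size")
if pgie_batch_size != number_sources:
	print("WARNING: Overriding infer-config batch-size", pgie_batch_size,
	      "with number of sources", number_sources)
	pgie.set_property("batch-size", number_sources)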

Thanks.

Thank you for the help.

It is the same error.

This is the main function code:

def main(args):
	# Check input arguments
	# if len(args) < 2:
	# 	sys.stderr.write("usage: %s <uri1> [uri2] ... [uriN] <folder to save frames>\n" % args[0])
	# 	sys.exit(1)

	for i in range(0, len(args) - 1):
		fps_streams["stream{0}".format(i)] = GETFPS(i)
	number_sources = len(args) - 1

	print("number_sources ", number_sources)
	# global folder_name
	# folder_name = args[-1]
	# if path.exists(folder_name):
	# 	sys.stderr.write("The output folder %s already exists. Please remove it first.\n" % folder_name)
	# 	sys.exit(1)

	# os.mkdir(folder_name)
	# print("Frames will be saved in ", folder_name)

	# Standard GStreamer initialization
	GObject.threads_init()
	Gst.init(None)

	# Create the Pipeline element that will form a connection of other elements
	print("Creating Pipeline \n ")
	pipeline = Gst.Pipeline()
	is_live = False

	if not pipeline:
		sys.stderr.write(" Unable to create Pipeline \n")
	print("Creating streammux \n ")

	# Create nvstreammux instance to form batches from one or more sources.
	streammux = Gst.ElementFactory.make("nvstreammux", "Stream-muxer")
	if not streammux:
		sys.stderr.write(" Unable to create NvStreamMux \n")

	pipeline.add(streammux)
	for i in range(number_sources):
		# os.mkdir(folder_name + "/stream_" + str(i))
		# frame_count["stream_" + str(i)] = 0
		# saved_count["stream_" + str(i)] = 0
		print("Creating source_bin ", i, " \n ")
		uri_name = args[i + 1]
		if uri_name.find("rtsp://") == 0:
			is_live = True
		source_bin = create_source_bin(i, uri_name)
		if not source_bin:
			sys.stderr.write("Unable to create source bin \n")
		pipeline.add(source_bin)
		padname = "sink_%u" % i
		sinkpad = streammux.get_request_pad(padname)
		if not sinkpad:
			sys.stderr.write("Unable to create sink pad bin \n")
		srcpad = source_bin.get_static_pad("src")
		if not srcpad:
			sys.stderr.write("Unable to create src pad bin \n")
		srcpad.link(sinkpad)
	print("Creating Pgie \n ")

	# Create the tracker that runs between the detector and the recogniser
	tracker = Gst.ElementFactory.make("nvtracker", "tracker")
	if not tracker:
		sys.stderr.write(" Unable to create tracker \n")

	# Use nvinfer to run inferencing on the decoder's output;
	# the behaviour of inferencing is set through the config file.
	face_detector = Gst.ElementFactory.make("nvinfer", "face-detector-inference")
	if not face_detector:
		sys.stderr.write(" Unable to create face_detector \n")

	face_recogniser = Gst.ElementFactory.make("nvinfer", "face-recogniser-inference")
	if not face_recogniser:
		sys.stderr.write(" Unable to create face_recogniser \n")

	# Add nvvidconv1 and filter1 to convert the frames to RGBA,
	# which is easier to work with in Python.
	print("Creating nvvidconv1 \n ")
	nvvidconv1 = Gst.ElementFactory.make("nvvideoconvert", "convertor1")
	if not nvvidconv1:
		sys.stderr.write(" Unable to create nvvidconv1 \n")
	print("Creating filter1 \n ")
	caps1 = Gst.Caps.from_string("video/x-raw(memory:NVMM), format=RGBA")
	filter1 = Gst.ElementFactory.make("capsfilter", "filter1")
	if not filter1:
		sys.stderr.write(" Unable to get the caps filter1 \n")
	filter1.set_property("caps", caps1)
	print("Creating tiler \n ")
	tiler = Gst.ElementFactory.make("nvmultistreamtiler", "nvtiler")
	if not tiler:
		sys.stderr.write(" Unable to create tiler \n")
	print("Creating nvvidconv \n ")
	nvvidconv = Gst.ElementFactory.make("nvvideoconvert", "convertor")
	if not nvvidconv:
		sys.stderr.write(" Unable to create nvvidconv \n")
	print("Creating nvosd \n ")
	nvosd = Gst.ElementFactory.make("nvdsosd", "onscreendisplay")
	if not nvosd:
		sys.stderr.write(" Unable to create nvosd \n")
	if is_aarch64():
		print("Creating transform \n ")
		transform = Gst.ElementFactory.make("nvegltransform", "nvegl-transform")
		if not transform:
			sys.stderr.write(" Unable to create transform \n")

	print("Creating EGLSink \n")
	sink = Gst.ElementFactory.make("nveglglessink", "nvvideo-renderer")
	if not sink:
		sys.stderr.write(" Unable to create egl sink \n")

	if is_live:
		print("At least one of the sources is live")
		streammux.set_property('live-source', 1)

	streammux.set_property('width', 1920)
	streammux.set_property('height', 1080)
	streammux.set_property('batch-size', number_sources)
	streammux.set_property('batched-push-timeout', 4000000)
	face_recogniser.set_property('config-file-path', "face_recogniser_config.txt")
	face_detector.set_property('config-file-path', "face_detector_config.txt")
	pgie_batch_size = face_detector.get_property("batch-size")

	if pgie_batch_size != number_sources:
		print("WARNING: Overriding infer-config batch-size", pgie_batch_size, "with number of sources", number_sources, "\n")
		face_detector.set_property("batch-size", number_sources)
	tiler_rows = int(math.sqrt(number_sources))
	tiler_columns = int(math.ceil((1.0 * number_sources) / tiler_rows))
	tiler.set_property("rows", tiler_rows)
	tiler.set_property("columns", tiler_columns)
	tiler.set_property("width", TILED_OUTPUT_WIDTH)
	tiler.set_property("height", TILED_OUTPUT_HEIGHT)

	sink.set_property("sync", 0)

	if not is_aarch64():
		# Use CUDA unified memory in the pipeline so frames
		# can be easily accessed on the CPU in Python.
		mem_type = int(pyds.NVBUF_MEM_CUDA_UNIFIED)
		streammux.set_property("nvbuf-memory-type", mem_type)
		nvvidconv.set_property("nvbuf-memory-type", mem_type)
		nvvidconv1.set_property("nvbuf-memory-type", mem_type)
		tiler.set_property("nvbuf-memory-type", mem_type)

	# Set properties of the tracker from its config file
	config = configparser.ConfigParser()
	config.read('dstest2_tracker_config.txt')
	config.sections()

	for key in config['tracker']:
		if key == 'tracker-width':
			tracker_width = config.getint('tracker', key)
			tracker.set_property('tracker-width', tracker_width)
		if key == 'tracker-height':
			tracker_height = config.getint('tracker', key)
			tracker.set_property('tracker-height', tracker_height)
		if key == 'gpu-id':
			tracker_gpu_id = config.getint('tracker', key)
			tracker.set_property('gpu_id', tracker_gpu_id)
		if key == 'll-lib-file':
			tracker_ll_lib_file = config.get('tracker', key)
			tracker.set_property('ll-lib-file', tracker_ll_lib_file)
		if key == 'll-config-file':
			tracker_ll_config_file = config.get('tracker', key)
			tracker.set_property('ll-config-file', tracker_ll_config_file)
		if key == 'enable-batch-process':
			tracker_enable_batch_process = config.getint('tracker', key)
			tracker.set_property('enable_batch_process', tracker_enable_batch_process)

	print("Adding elements to Pipeline \n")
	pipeline.add(tracker)
	pipeline.add(face_detector)
	pipeline.add(face_recogniser)
	pipeline.add(tiler)
	pipeline.add(nvvidconv)
	pipeline.add(filter1)
	pipeline.add(nvvidconv1)
	pipeline.add(nvosd)
	if is_aarch64():
		pipeline.add(transform)
	pipeline.add(sink)

	print("Linking elements in the Pipeline \n")
	streammux.link(face_detector)
	face_detector.link(tracker)
	tracker.link(nvvidconv1)
	nvvidconv1.link(filter1)
	filter1.link(tiler)
	tiler.link(face_recogniser)
	face_recogniser.link(nvvidconv)
	nvvidconv.link(nvosd)
	if is_aarch64():
		nvosd.link(transform)
		transform.link(sink)
	else:
		nvosd.link(sink)

	# Create an event loop and feed GStreamer bus messages to it
	loop = GObject.MainLoop()
	bus = pipeline.get_bus()
	bus.add_signal_watch()
	bus.connect("message", bus_call, loop)

	tiler_sink_pad = tiler.get_static_pad("sink")
	if not tiler_sink_pad:
		sys.stderr.write(" Unable to get sink pad of tiler \n")
	else:
		tiler_sink_pad.add_probe(Gst.PadProbeType.BUFFER, tiler_sink_pad_buffer_probe, 0)

	vidconvsinkpad = nvvidconv.get_static_pad("sink")
	if not vidconvsinkpad:
		sys.stderr.write(" Unable to get sink pad of nvvidconv \n")

	vidconvsinkpad.add_probe(Gst.PadProbeType.BUFFER, sgie_sink_pad_buffer_probe, 0)

	# List the sources
	print("Now playing...")
	for i, source in enumerate(args[:-1]):
		if i != 0:
			print(i, ": ", source)

	print("Starting pipeline \n")
	# Start playback and listen to events
	pipeline.set_state(Gst.State.PLAYING)
	try:
		loop.run()
	except:
		pass
	# cleanup
	print("Exiting app\n")
	pipeline.set_state(Gst.State.NULL)
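
For reference, the two probe callbacks attached above are not shown here. A rough sketch of how sgie_sink_pad_buffer_probe could read the raw FaceNet output enabled by output-tensor-meta=1, based on the deepstream_python_apps tensor-meta samples (the 128-dimensional embedding size is an assumption, not taken from this model):

import ctypes

import gi
gi.require_version('Gst', '1.0')
from gi.repository import Gst
import numpy as np
import pyds

EMBEDDING_SIZE = 128  # assumption: typical FaceNet embedding dimension

def sgie_sink_pad_buffer_probe(pad, info, u_data):
	gst_buffer = info.get_buffer()
	if not gst_buffer:
		return Gst.PadProbeReturn.OK
	batch_meta = pyds.gst_buffer_get_nvds_batch_meta(hash(gst_buffer))
	l_frame = batch_meta.frame_meta_list
	while l_frame is not None:
		frame_meta = pyds.NvDsFrameMeta.cast(l_frame.data)
		l_obj = frame_meta.obj_meta_list
		while l_obj is not None:
			obj_meta = pyds.NvDsObjectMeta.cast(l_obj.data)
			# The SGIE attaches its raw output to each object's user meta.
			l_user = obj_meta.obj_user_meta_list
			while l_user is not None:
				user_meta = pyds.NvDsUserMeta.cast(l_user.data)
				if user_meta.base_meta.meta_type == pyds.NvDsMetaType.NVDSINFER_TENSOR_OUTPUT_META:
					tensor_meta = pyds.NvDsInferTensorMeta.cast(user_meta.user_meta_data)
					# First (and only) output layer holds the embedding.
					layer = pyds.get_nvds_LayerInfo(tensor_meta, 0)
					ptr = ctypes.cast(pyds.get_ptr(layer.buffer), ctypes.POINTER(ctypes.c_float))
					embedding = np.ctypeslib.as_array(ptr, shape=(EMBEDDING_SIZE,))
					# ... compare the embedding against known faces here ...
				l_user = l_user.next
			l_obj = l_obj.next
		l_frame = l_frame.next
	return Gst.PadProbeReturn.OK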

Any help?

Hi,

We want to reproduce this issue in our environment.
Could you share the complete source and detailed steps with us?

Thanks.

I sent you the zip file in private.

Thanks.

Any help?

Hi,

We tried to check this issue but failed to download the zip file since we don't have access.
Could you help us enable it?

Thanks.

This is the code in a GitHub repo:
(https://github.com/xya22er/facenet_multistream.git)

And this is the ONNX model file:
facenet_onnx_b16

Any help?

Hi,

Thanks.
We can access the source and model correctly.
We will update you with more information later.

Thanks
I will be waiting for your update.

I am still waiting

Hi,

When you build the TensorRT plan, could you set the min profile to 1x3x160x160?

WARNING: Backend context bufferIdx(0) request dims:3x3x160x160 is out of range, [min: 16x3x160x160, max: 16x3x160x160]
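
For example, a trtexec invocation with a dynamic batch profile could look like this (assuming the ONNX input tensor is named `input`; please check your model's actual input name):

/usr/src/tensorrt/bin/trtexec --onnx=v2_facenet_b16.onnx \
	--minShapes=input:1x3x160x160 \
	--optShapes=input:16x3x160x160 \
	--maxShapes=input:16x3x160x160 \
	--fp16 \
	--saveEngine=v2_facenet_b16.plan

With min=1 and max=16, the engine still accepts batches up to 16, so batch-size=16 remains valid in the nvinfer config; the smaller min only additionally allows buffers with fewer objects (such as the 3-face batch in the warning above) to be enqueued.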

Thanks.

If I set it to 1x3x160x160, I cannot use it with batch-size 16.

Hi,

Sorry for the late update.
We tried to reproduce your issue with the source shared on March 8,
but met the following error. Are there any other parameters that need to be updated?

ERROR: [TRT]: ../builder/cudnnBuilderBlockChooser.cpp (117) - Assertion Error in buildMemGraph: 0 (mg.nodes[mg.regionIndices[outputRegion]].size == mg.nodes[mg.regionIndices[inputRegion]].size)
ERROR: Build engine failed from config file
ERROR: failed to build trt engine.
...

Thanks.

Hi @AastaLLL ,

What does this error mean? I see the same error when I try to convert an ONNX model to a TRT engine.

There are no errors in the ONNX model; I verified it with the ONNX checker. Despite multiple changes to my PyTorch source code, I keep getting this same error at the last layer of the network.

Error message:

[TensorRT] ERROR: …/builder/cudnnBuilderBlockChooser.cpp (117) - Assertion Error in buildMemGraph: 0 (mg.nodes[mg.regionIndices[outputRegion]].size == mg.nodes[mg.regionIndices[inputRegion]].size)

Please help me understand what exactly this error means; I can't find an explanation for it anywhere.

Hi msripooja,

Please help to open a new topic with more details. Thanks.