ERROR: nvdsinfer_backend.cpp:472 Failed to enqueue buffer in fulldims mode because binding idx: 0 with batchDims: 1x96x224x224 is not supported

Please provide the following information when requesting support.

• Hardware ( NVIDIA GeForce RTX 3060 Laptop GPU)
• Network Type (ActionRecognitionNet)

I have completed 2D and 3D action recognition model training using TAO 3.0. Later I tried to run the model on deepstream-6.0. I have edited the model path and labels.txt. When I tried to run the application, I encountered this error:
divya@divya-GF65-Thin-10UE:/opt/nvidia/deepstream/deepstream-6.1/sources/apps/sample_apps/deepstream-3d-action-recognition$ ./deepstream-3d-action-recognition -c deepstream_action_recognition_config.txt
num-sources = 2
Now playing: file:///home/divya/Downloads/ride.mp4, file:///opt/nvidia/deepstream/deepstream/samples/streams/sample_ride_bike.mov,
WARNING: …/nvdsinfer/nvdsinfer_model_builder.cpp:1482 Deserialize engine failed because file path: /opt/nvidia/deepstream/deepstream-6.1/sources/apps/sample_apps/deepstream-3d-action-recognition/./resnet18_2d_rgb_hmdb5_32.etlt_b4_gpu0_fp16.engine open error
0:00:00.761201560 4773 0x55baf3d75320 WARN nvinfer gstnvinfer.cpp:643:gst_nvinfer_logger: NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:1888> [UID = 1]: deserialize engine from file :/opt/nvidia/deepstream/deepstream-6.1/sources/apps/sample_apps/deepstream-3d-action-recognition/./resnet18_2d_rgb_hmdb5_32.etlt_b4_gpu0_fp16.engine failed
0:00:00.775030852 4773 0x55baf3d75320 WARN nvinfer gstnvinfer.cpp:643:gst_nvinfer_logger: NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:1993> [UID = 1]: deserialize backend context from engine from file :/opt/nvidia/deepstream/deepstream-6.1/sources/apps/sample_apps/deepstream-3d-action-recognition/./resnet18_2d_rgb_hmdb5_32.etlt_b4_gpu0_fp16.engine failed, try rebuild
0:00:00.775042336 4773 0x55baf3d75320 INFO nvinfer gstnvinfer.cpp:646:gst_nvinfer_logger: NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1914> [UID = 1]: Trying to create engine from model files
WARNING: [TRT]: onnx2trt_utils.cpp:366: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
0:00:17.183637061 4773 0x55baf3d75320 INFO nvinfer gstnvinfer.cpp:646:gst_nvinfer_logger: NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1946> [UID = 1]: serialize cuda engine to file: /home/divya/computer_vision/cv_samples_v1.4.0/action_recognition_net/results/export_2d/rgb_resnet18_2.etlt_b1_gpu0_fp16.engine successfully
INFO: …/nvdsinfer/nvdsinfer_model_builder.cpp:610 [Implicit Engine Info]: layers num: 2
0 INPUT kFLOAT input_rgb 9x224x224
1 OUTPUT kFLOAT fc_pred 2

0:00:17.199869270 4773 0x55baf3d75320 INFO nvinfer gstnvinfer_impl.cpp:328:notifyLoadModelStatus: [UID 1]: Load new model:config_infer_primary_2d_action.txt sucessfully
sequence_image_process.cpp:499, [INFO: CUSTOM_LIB] 2D custom sequence network shape NSHW[4, 96, 224, 224], reshaped as [N: 4, C: 3, S:32, H: 224, W:224]
sequence_image_process.cpp:522, [INFO: CUSTOM_LIB] Sequence preprocess buffer manager initialized with stride: 1, subsample: 0
sequence_image_process.cpp:526, [INFO: CUSTOM_LIB] SequenceImagePreprocess initialized successfully
Using user provided processing height = 224 and processing width = 224
Decodebin child added: source
Decodebin child added: decodebin0
Decodebin child added: source
Decodebin child added: decodebin1
Running…
Decodebin child added: qtdemux0
Decodebin child added: qtdemux1
Decodebin child added: multiqueue0
Decodebin child added: multiqueue1
Decodebin child added: h264parse0
Decodebin child added: h264parse1
Decodebin child added: capsfilter0
Decodebin child added: capsfilter1
Decodebin child added: aacparse0
Decodebin child added: aacparse1
Decodebin child added: avdec_aac0
Decodebin child added: avdec_aac1
Decodebin child added: nvv4l2decoder0
Decodebin child added: nvv4l2decoder1
In cb_newpad
In cb_newpad
In cb_newpad
In cb_newpad
WARNING: nvdsinfer_backend.cpp:157 Backend context bufferIdx(0) request dims:1x96x224x224 is out of range, [min: 1x9x224x224, max: 1x9x224x224]
ERROR: nvdsinfer_backend.cpp:472 Failed to enqueue buffer in fulldims mode because binding idx: 0 with batchDims: 1x96x224x224 is not supported
ERROR: nvdsinfer_context_impl.cpp:1711 Infer context enqueue buffer failed, nvinfer error:NVDSINFER_INVALID_PARAMS
0:00:18.131974696 4773 0x55baf1f3acc0 WARN nvinfer gstnvinfer.cpp:2009:gst_nvinfer_process_tensor_input: error: Failed to queue input batch for inferencing
ERROR from element primary-nvinference-engine: Failed to queue input batch for inferencing
Error details: gstnvinfer.cpp(2009): gst_nvinfer_process_tensor_input (): /GstPipeline:preprocess-test-pipeline/GstNvInfer:primary-nvinference-engine
Returned, stopping playback
sequence_image_process.cpp:586, [INFO: CUSTOM_LIB] SequenceImagePreprocess is deinitializing
Deleting pipeline
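
For readers following the log, the key line is the nvdsinfer_backend.cpp:157 warning: the preprocess plugin hands over a 1x96x224x224 tensor while the engine only accepts 1x9x224x224. A minimal sketch of how that warning could be checked programmatically is shown here; the helper names (parse_dims, find_out_of_range_axes) are hypothetical and not part of DeepStream, and the regular expression assumes the warning keeps exactly the wording seen in this log.

```python
import re

# Warning line copied verbatim from the log above.
WARNING = (
    "WARNING: nvdsinfer_backend.cpp:157 Backend context bufferIdx(0) "
    "request dims:1x96x224x224 is out of range, "
    "[min: 1x9x224x224, max: 1x9x224x224]"
)


def parse_dims(text):
    """Turn a string such as '1x96x224x224' into (1, 96, 224, 224)."""
    return tuple(int(part) for part in text.split("x"))


def find_out_of_range_axes(warning):
    """Return (axis, requested, min, max) for each axis outside the engine range.

    Hypothetical helper: it only understands the warning format shown above.
    """
    match = re.search(
        r"request dims:\s*([\dx]+).*?min:\s*([\dx]+),\s*max:\s*([\dx]+)",
        warning,
    )
    if match is None:
        raise ValueError("line does not look like the out-of-range warning")
    requested, low, high = (parse_dims(group) for group in match.groups())
    return [
        (axis, req, lo, hi)
        for axis, (req, lo, hi) in enumerate(zip(requested, low, high))
        if not lo <= req <= hi
    ]


for axis, req, lo, hi in find_out_of_range_axes(WARNING):
    print(f"axis {axis}: requested {req}, engine accepts {lo}..{hi}")
```

Only axis 1 (channels x sequence length) is reported, which points at the sequence length rather than the image size or batch size.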

Did you set the correct deepstream_action_recognition_config.txt?
Please follow ActionRecognitionNet — TAO Toolkit 3.22.05 documentation >> DeepStream 3D Action Recognition App — DeepStream 6.1.1 Release documentation to setup.

Hi @Morganh, thanks for your reply.
Yes I have set the correct file - deepstream_action_recognition_config.txt and also have followed this - DeepStream 3D Action Recognition App — DeepStream 6.1.1 Release documentation
I have made the specific changes to network-input-shape for the respective 2D and 3D files (config_infer_primary_2d_action.txt / config_infer_primary_3d_action.txt) and then executed these commands:
$ make
$ make install
The same error is still shown when I try to run either the 2D or the 3D model.

Why does the log show “Load new model:config_infer_primary_2d_action.txt sucessfully” when you run deepstream-3d-action-recognition?

Thanks for your reply, @Morganh.
When I run deepstream-3d-action-recognition, it shows “Load new model:config_infer_primary_2d_action.txt sucessfully” because I run this command:
./deepstream-3d-action-recognition -c deepstream_action_recognition_config.txt
and deepstream_action_recognition_config.txt contains

preprocess-config=config_preprocess_2d_custom.txt
infer-config=config_infer_primary_2d_action.txt

For reference - deepstream_action_recognition_config.txt

[action-recognition]

# stream/file source list
uri-list= file:///home/divya/Downloads/ride.mp4;file:///opt/nvidia/deepstream/deepstream/samples/streams/sample_ride_bike.mov
#file:///opt/nvidia/deepstream/deepstream/samples/streams/sample_walk.mov;
#file:///opt/nvidia/deepstream/deepstream/samples/streams/sample_ride_bike.mov;file:///opt/nvidia/deepstream/deepstream/samples/streams/sample_push.mov;file:///home/divya/Downloads/walk.mp4


# eglglessink settings
display-sync=1


# <preprocess-config> is the config file path for nvdspreprocess plugin
# <infer-config> is the config file path for nvinfer plugin

# Enable 3D preprocess and inference
#preprocess-config=config_preprocess_3d_custom.txt
#infer-config=config_infer_primary_3d_action.txt

# Uncomment to enable 2D preprocess and inference
preprocess-config=config_preprocess_2d_custom.txt
infer-config=config_infer_primary_2d_action.txt

# nvstreammux settings
muxer-height=720
muxer-width=1280

# nvstreammux batched push timeout in usec
muxer-batch-timeout=40000


# nvmultistreamtiler settings
tiler-height=720
tiler-width=1280

# Log debug level. 0: disabled. 1: debug. 2: verbose.
debug=0

# Enable fps print on screen. 0: disable. 1: enable
enable-fps=1

config_infer_primary_2d_action.txt -

[property]
gpu-id=0

tlt-encoded-model=/home/divya/computer_vision/cv_samples_v1.4.0/action_recognition_net/results/export_2d/rgb_resnet18_2.etlt
tlt-model-key=nvidia_tao
model-engine-file=/home/divya/computer_vision/cv_samples_v1.4.0/action_recognition_net/results/export_2d/rgb_resnet18_2.etlt_b1_gpu0_fp16.engine

labelfile-path=/opt/nvidia/deepstream/deepstream-6.1/sources/apps/sample_apps/deepstream-3d-action-recognition/labels.txt
batch-size=1
process-mode=1

# requires preprocess metadata input
input-tensor-from-meta=1

## 0=FP32, 1=INT8, 2=FP16 mode
network-mode=2
gie-unique-id=1

# 1: classifier, 100: custom
network-type=1

# Let application to parse the inference tensor output
output-tensor-meta=1
tensor-meta-pool-size=8

config_preprocess_2d_custom.txt -


[property]
enable=1
target-unique-ids=1

# network-input-shape: batch, channel x sequence, height, width
# 2D sequence of 64 images
#network-input-shape= 4;96;224;224

# 2D sequence of 32 images
network-input-shape= 4;96;224;224

    # 0=RGB, 1=BGR, 2=GRAY
network-color-format=0
    # 0=NCHW, 1=NHWC, 2=CUSTOM
network-input-order=2
    # 0=FP32, 1=UINT8, 2=INT8, 3=UINT32, 4=INT32, 5=FP16
tensor-data-type=0
tensor-name=input_rgb

processing-width=224
processing-height=224

    # 0=NVBUF_MEM_DEFAULT 1=NVBUF_MEM_CUDA_PINNED 2=NVBUF_MEM_CUDA_DEVICE
    # 3=NVBUF_MEM_CUDA_UNIFIED  4=NVBUF_MEM_SURFACE_ARRAY(Jetson)
scaling-pool-memory-type=0
    # 0=NvBufSurfTransformCompute_Default 1=NvBufSurfTransformCompute_GPU
    # 2=NvBufSurfTransformCompute_VIC(Jetson)
scaling-pool-compute-hw=0
    # Scaling Interpolation method
    # 0=NvBufSurfTransformInter_Nearest 1=NvBufSurfTransformInter_Bilinear 2=NvBufSurfTransformInter_Algo1
    # 3=NvBufSurfTransformInter_Algo2 4=NvBufSurfTransformInter_Algo3 5=NvBufSurfTransformInter_Algo4
    # 6=NvBufSurfTransformInter_Default
scaling-filter=0

# model input tensor pool size
tensor-buf-pool-size=8

custom-lib-path=/opt/nvidia/deepstream/deepstream/lib/libnvds_custom_sequence_preprocess.so
#custom-lib-path=./custom_sequence_preprocess/libnvds_custom_sequence_preprocess.so
custom-tensor-preparation-function=CustomSequenceTensorPreparation

# 2D conv custom params
[user-configs]
channel-scale-factors=0.007843137;0.007843137;0.007843137
channel-mean-offsets=127.5;127.5;127.5
stride=1
subsample=0

[group-0]
src-ids=0;1;2;3
process-on-roi=1
roi-params-src-0=0;0;1280;720
roi-params-src-1=0;0;1280;720
roi-params-src-2=0;0;1280;720
roi-params-src-3=0;0;1280;720
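
As a quick sanity check on the preprocess config above, the second value of network-input-shape can be split back into channels and sequence length, the same way the custom library reports it in the log (NSHW[4, 96, 224, 224] reshaped as C: 3, S: 32). This is only a sketch: read_input_shape and split_2d_shape are hypothetical helpers, and the channel count of 3 is an assumption based on network-color-format=0 (RGB).

```python
import re

# Assumption: three colour channels, because network-color-format=0 (RGB).
RGB_CHANNELS = 3


def read_input_shape(config_text):
    """Return the first uncommented network-input-shape as a list of ints."""
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith("#"):
            continue  # skip commented-out alternatives
        match = re.match(r"network-input-shape\s*=\s*([\d;\s]+)", line)
        if match:
            return [int(v) for v in match.group(1).split(";") if v.strip()]
    raise ValueError("network-input-shape not found")


def split_2d_shape(shape, channels=RGB_CHANNELS):
    """Split [N, C*S, H, W] into its N, C, S, H, W components."""
    batch, channel_x_seq, height, width = shape
    if channel_x_seq % channels:
        raise ValueError(f"{channel_x_seq} is not a multiple of {channels}")
    return {
        "N": batch,
        "C": channels,
        "S": channel_x_seq // channels,
        "H": height,
        "W": width,
    }


# Fragment of config_preprocess_2d_custom.txt as posted above.
sample = """
# 2D sequence of 32 images
network-input-shape= 4;96;224;224
"""
print(split_2d_shape(read_input_shape(sample)))
```

With the posted config this yields a sequence length of 32 frames, which is what the preprocess plugin will feed to nvinfer regardless of how the model was trained.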

Hi,
May I know whether you can run the app successfully with the official model?

And you are now running your own model, right?

Yes, I can run the app with the official model successfully.
Now I am trying to run my own model.

Can you share the training spec file?

Sure.
train_rgb_2d_finetune.yaml (762 Bytes)

Can you check all the spec files that you have changed, and then share them with us?

Sure, @Morganh
evaluate_rgb.yaml (429 Bytes)
train_rgb_2d_finetune.yaml (762 Bytes)

infer_rgb.yaml -

model_config:
  model_type: rgb
  backbone: resnet18
  rgb_seq_length: 3
  input_type: 2d
  sample_strategy: consecutive 
  dropout_ratio: 0.0
dataset_config:
  label_map:
    fall_floor: 0
    ride_bike: 1
  output_shape:
  - 224
  - 224
  batch_size: 32
  workers: 8
  augmentation_config:
    train_crop_type: no_crop
    horizontal_flip_prob: 0.0
    rgb_input_mean: [0.5]
    rgb_input_std: [0.5]
    val_center_crop: False

export_rgb.yaml -

model_config:
  model_type: rgb
  backbone: resnet18
  rgb_seq_length: 3
  input_type: 2d
  sample_strategy: consecutive 
  dropout_ratio: 0.0
dataset_config:
  label_map:
    fall_floor: 0
    ride_bike: 1
  output_shape:
  - 224
  - 224
  batch_size: 32
  workers: 8
  augmentation_config:
    train_crop_type: no_crop
    horizontal_flip_prob: 0.0
    rgb_input_mean: [0.5]
    rgb_input_std: [0.5]
    val_center_crop: False

Can you change the line below
network-input-shape= 4;96;224;224

to
network-input-shape= 4;9;224;224

and retry?

Hi @Morganh, yes, the tiler now displays with no errors, but inference is not running.
I am attaching an image for reference.
Also, could you explain why we changed 96 to 9? Thanks.

According to DeepStream 3D Action Recognition App — DeepStream 6.1.1 Release documentation

network-input-shape= 4;96;224;224

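This means max_batch_size: 4, channels: 3, sequence_len: 32, height: 224, width: 224, where 96 = channels x sequence_len.

I am afraid the sequence_len was not 32 when you trained your model. That is why I asked you to check the spec files used during training. Your spec files set rgb_seq_length: 3, so the model expects 3 x 3 = 9 channels, which matches the engine input 9x224x224 reported in your log.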
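
The relation between rgb_seq_length in the training spec and the 2D network-input-shape can be sketched as follows. This is an illustrative helper only (network_input_shape_2d is a hypothetical name, not a TAO or DeepStream function); the defaults of max batch size 4, 3 RGB channels and 224x224 input are taken from the configs posted in this topic.

```python
def network_input_shape_2d(rgb_seq_length, max_batch_size=4, channels=3,
                           height=224, width=224):
    """Build the 2D network-input-shape value for nvdspreprocess.

    For the 2D model the second dimension is channels x sequence length.
    """
    dims = (max_batch_size, channels * rgb_seq_length, height, width)
    return ";".join(str(d) for d in dims)


# Sample app default: sequence of 32 RGB frames.
print("network-input-shape=", network_input_shape_2d(32))

# Model from this topic: rgb_seq_length: 3 in the spec files.
print("network-input-shape=", network_input_shape_2d(3))
```

For rgb_seq_length: 3 this gives 4;9;224;224, the value suggested above.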

What exactly is meant by sequence length?
Also, at which stage of training do we have to specify the sequence length? Thanks

Refer to ActionRecognitionNet — TAO Toolkit 3.22.05 documentation.
rgb_seq_length: The number of RGB frames for single inference


To narrow down the inference issue, please check with the standalone sample.
https://docs.nvidia.com/tao/tao-toolkit/text/action_recognition_net.html#running-actionrecognitionnet-inference-on-the-stand-alone-sample
tao_toolkit_recipes/tao_action_recognition/tensorrt_inference at main · NVIDIA-AI-IOT/tao_toolkit_recipes · GitHub

There has been no update from you for a while, so we assume this is no longer an issue.
Hence we are closing this topic. If you need further support, please open a new one.
Thanks
