I have an ONNX model, converted to a TensorRT engine, that works fine on a Jetson Xavier NX and a Jetson TX2 NX device. When I try to run the same model on a Jetson Orin Nano 8GB device, it emits the following malloc error:
WARNING: [TRT]: Using an engine plan file across different models of devices is not recommended and is likely to affect performance or even cause errors.
0:00:28.188148599 116580 0xaaaaf09f4b30 INFO nvinfer gstnvinfer.cpp:682:gst_nvinfer_logger:<person-detector-engine> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:2092> [UID = 1]: deserialized trt engine from :../models/person_detection/person_detection_orin_fp16.engine
WARNING: [TRT]: The getMaxBatchSize() function should not be used with an engine built from a network created with NetworkDefinitionCreationFlag::kEXPLICIT_BATCH flag. This function will always return 1.
INFO: [Implicit Engine Info]: layers num: 3
0 INPUT kFLOAT input_1:0 3x544x960
1 OUTPUT kFLOAT output_cov/Sigmoid:0 1x34x60
2 OUTPUT kFLOAT output_bbox/BiasAdd:0 4x34x60
ERROR: [TRT]: 3: Cannot find binding of given name: conv2d_bbox
0:00:28.601311011 116580 0xaaaaf09f4b30 WARN nvinfer gstnvinfer.cpp:679:gst_nvinfer_logger:<person-detector-engine> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::checkBackendParams() <nvdsinfer_context_impl.cpp:2059> [UID = 1]: Could not find output layer 'conv2d_bbox' in engine
ERROR: [TRT]: 3: Cannot find binding of given name: conv2d_cov/Sigmoid.
0:00:28.601361028 116580 0xaaaaf09f4b30 WARN nvinfer gstnvinfer.cpp:679:gst_nvinfer_logger:<person-detector-engine> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::checkBackendParams() <nvdsinfer_context_impl.cpp:2059> [UID = 1]: Could not find output layer 'conv2d_cov/Sigmoid.' in engine
0:00:28.601379557 116580 0xaaaaf09f4b30 INFO nvinfer gstnvinfer.cpp:682:gst_nvinfer_logger:<person-detector-engine> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2195> [UID = 1]: Use deserialized engine model: ../models/person_detection/person_detection_orin_fp16.engine
malloc_consolidate(): unaligned fastbin chunk detected
I generated the engine plan using the docker image nvcr.io/nvidia/l4t-tensorrt:r8.6.2-runtime.
I flashed the Jetson Orin Nano device using jetson_linux_r36.2.0_aarch64.tbz2 and tegra_linux_sample-root-filesystem_r36.2.0_aarch64.tbz2.
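For context, the FP16 engine plan is generated from the ONNX model inside that container. The following is only a minimal sketch of the equivalent build using the TensorRT 8.6 Python API (the build_fp16_engine function and the paths are illustrative, and it assumes a plain, unencrypted ONNX export rather than the TAO-encoded file referenced in the config below):

import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.INFO)

def build_fp16_engine(onnx_path, engine_path):
    # Explicit-batch network definition, consistent with the kEXPLICIT_BATCH
    # warning in the log above.
    builder = trt.Builder(TRT_LOGGER)
    network = builder.create_network(
        1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
    parser = trt.OnnxParser(network, TRT_LOGGER)

    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            for i in range(parser.num_errors):
                print(parser.get_error(i))
            raise RuntimeError("Failed to parse the ONNX model")

    config = builder.create_builder_config()
    config.set_flag(trt.BuilderFlag.FP16)  # matches network-mode=2 (FP16) below

    plan = builder.build_serialized_network(network, config)
    with open(engine_path, "wb") as f:
        f.write(plan)

build_fp16_engine("person_detection.onnx", "person_detection_orin_fp16.engine")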
This is the Python code that builds the inference detector:
# PROCESSOR_TYPE resolves to "orin" on this device, so the engine file passed
# below is the person_detection_orin_fp16.engine named in the log above.
person_detector = make_elm_or_print_err("nvinfer", "person-detector-engine", "Person Detector")
person_detector.set_property("config-file-path", "../model_configs/person_detection/person_detection.txt")
person_detector.set_property("unique-id", inference_common.PERSON_DETECTOR_UID)
person_detector.set_property("model-engine-file", f"../models/person_detection/person_detection_{PROCESSOR_TYPE}_fp16.engine")
These are the contents of the config file that it loads:
[property]
# The following is generated by the Transfer Learning Toolkit (TAO) and should be replaced after updating the model
net-scale-factor=0.00392156862745098
offsets=0.0;0.0;0.0
infer-dims=3;544;960
tlt-model-key=********
network-type=0
num-detected-classes=1
model-color-format=0
maintain-aspect-ratio=0
output-tensor-meta=0
# Device ID of GPU to use for pre-processing/inference (dGPU only)
gpu-id=0
# Pixel normalization factor (ignored if input-tensor-meta enabled)
# It is unclear exactly how this should be set, but the TAO-generated value above (1/255) works.
#net-scale-factor=<this is set by TAO - see above>
# Pathname of the serialized model engine file
# In our application, this is set in code to get the proper processor type
# model-engine-file=../../models/person_detection/person_detection_xaviernx_fp16.engine
# Pathname of a text file containing the labels for the model
labelfile-path=../../models/person_detection/labels.txt
# Pathname of the TAO toolkit encoded model.
tlt-encoded-model=../../models/person_detection/person_detection.onnx
# Key for the TAO toolkit encoded model.
#tlt-model-key=<this is configured by TAO - see above>
# Pathname of the INT8 calibration file for dynamic range adjustment with an FP32 model.
#int8-calib-file=../../../../samples/models/Primary_Detector/cal_trt.bin
# When a network supports both implicit batch dimension and full dimension, force the implicit batch dimension mode.
force-implicit-batch-dim=1
# Number of frames or objects to be inferred together in a batch.
batch-size=1
# Data format to be used by inference. Integer 0: FP32 1: INT8 2: FP16.
network-mode=2
# Number of classes detected by the network
#num-detected-classes=<this is configured by TAO - see above>
# Specifies the number of consecutive batches to be skipped for inference.
interval=0
# Unique ID to be assigned to the GIE to enable the application and other elements to identify detected bounding boxes and labels.
gie-unique-id=1
# Filter out detected objects belonging to specified class-ids. Semicolon delimited integer array.
# 1;2 are bags and heads
#filter-out-class-ids=1;2
# Array of output layer names. Semicolon delimited string array.
output-blob-names=conv2d_bbox;conv2d_cov/Sigmoid.
#scaling-filter=0
#scaling-compute-hw=0
# Clustering algorithm to use. Refer to the next table for configuring the algorithm specific parameters.
# Integer 0: OpenCV groupRectangles() 1: DBSCAN 2: Non Maximum Suppression 3: DBSCAN + NMS Hybrid 4: No clustering.
cluster-mode=2
# Detection threshold
#threshold=0.01
[class-attrs-all]
# Detection threshold to be applied prior to clustering operation
pre-cluster-threshold=0.01
# Keep only the top K objects with the highest detection scores. Per the DeepStream 6.2 docs: specify the top k detection results to keep after NMS, where 0 means keep all.
topk=20
# Maximum IOU score between two proposals after which the proposal with the lower confidence will be rejected.
nms-iou-threshold=0.2
All of the above works fine on the Jetson Xavier NX and the Jetson TX2 NX. To sanity-check the binding names that the errors above complain about, I have included a small engine-inspection sketch below.
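To double-check the "Cannot find binding of given name" errors, here is a small sketch that deserializes the Orin engine with the TensorRT Python API and prints the I/O tensor names it actually exposes, so they can be compared against output-blob-names in the config:

import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)
engine_path = "../models/person_detection/person_detection_orin_fp16.engine"

runtime = trt.Runtime(TRT_LOGGER)
with open(engine_path, "rb") as f:
    engine = runtime.deserialize_cuda_engine(f.read())

# Print every I/O tensor the engine exposes so the names can be compared with
# the output-blob-names entry in the nvinfer config.
for i in range(engine.num_io_tensors):
    name = engine.get_tensor_name(i)
    mode = engine.get_tensor_mode(name)  # TensorIOMode.INPUT or TensorIOMode.OUTPUT
    print(mode, name, engine.get_tensor_shape(name))

Based on the [Implicit Engine Info] lines in the log, I would expect this to print input_1:0, output_cov/Sigmoid:0 and output_bbox/BiasAdd:0, which do not match the conv2d_bbox and conv2d_cov/Sigmoid names requested in output-blob-names.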