DeepStream 7.1 nvinferserver tensor clone error

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson / GPU)
NVIDIA RTX A4500
• DeepStream Version
7.1
• JetPack Version (valid for Jetson only)
• TensorRT Version
Default TensorRT in nvcr.io/nvidia/deepstream:7.1-triton-multiarch
• NVIDIA GPU Driver Version (valid for GPU only)
Driver Version: 535.161.08
CUDA Version: 12.2
• Issue Type( questions, new requirements, bugs)
bugs
• How to reproduce the issue ? (This is for bugs. Including which sample app is using, the configuration files content, the command line used and other details for reproducing)
Hello,
When using the test1_app from pipeline_api, if the model runs inference through nvinfer with output-tensor-meta set to true, a Probe can clone the tensor or copy it to the CPU without any problem.

  - type: nvinfer
    name: infer
    properties:
      config-file-path: /opt/nvidia/deepstream/deepstream/sources/apps/sample_apps/deepstream-test1/dstest1_pgie_config.yml
      output-tensor-meta: true

Probe code:
class Outputtensor(BatchMetadataOperator):
    def handle_metadata(self, batch_meta):
        for frame_meta in batch_meta.frame_items:
            for user_meta in frame_meta.tensor_items:
                for n, tensor in user_meta.as_tensor_output().get_layers().items():
                    print(f"tensor name: {n}")
                    print(f"tensor object: {tensor}")
                    # operations on tensors:
                    # torch_tensor = torch.utils.dlpack.from_dlpack(tensor).to('cpu')
                    torch_tensor = torch.utils.dlpack.from_dlpack(tensor.clone())
                    print(torch_tensor)

pipeline:
Pipeline(PIPELINE_NAME, file_path).attach("infer1", Probe("output", Outputtensor())).start().wait()

However, when using the nvinferserver approach and setting output_control {output_tensor_meta: true} in the configuration file, an error occurs when using the Probe method to clone the tensor.
dstest1_config.yaml:

  - type: nvinferserver
    name: infer
    properties:
      config-file-path: /workspace/DeepStream/yolo_triton_dp/config_infer_triton_yolov7.txt

config_infer_triton_yolov7.txt:

infer_config {
  unique_id: 1
  gpu_ids: [0]
  max_batch_size: 1
  backend {
    triton {
      model_name: "yolov7"
      version: -1
      model_repo {
        root: "/workspace/DeepStream/yolo_triton_dp/models"
      }
    }
  }
  preprocess {
    network_format: IMAGE_FORMAT_RGB
    tensor_order: TENSOR_ORDER_LINEAR
    maintain_aspect_ratio: 1
    frame_scaling_hw: FRAME_SCALING_HW_DEFAULT
    frame_scaling_filter: 1
    normalize {
      scale_factor: 0.0039215697906911373
    }
  }
  postprocess {
    labelfile_path: "/workspace/DeepStream/yolo_triton_dp/classes.txt"
    detection {
      num_detected_classes: 80
      custom_parse_bbox_func: "NvDsInferParseCustomEfficientNMS"
      nms {
        confidence_threshold: 0.5
        topk: 300
        iou_threshold: 0.45
      }
    }
  }
  extra {
    copy_input_to_host_buffers: false
  }
  custom_lib {
    path: "/workspace/DeepStream/yolo_triton_dp/lib/libnvds_infercustomparser.so"
  }
}
input_control {
  process_mode: PROCESS_MODE_FULL_FRAME
  interval: 0
}
output_control {
  output_tensor_meta: true
}

error info:

tensor name: num_dets
tensor object: <pyservicemaker._pydeepstream.Tensor object at 0x7f1fc4246d70>
CUDA memcpy failed
Segmentation fault (core dumped)
OR:
{'822': <pyservicemaker._pydeepstream.Tensor object at 0x7f16386e41f0>, '823': <pyservicemaker._pydeepstream.Tensor object at 0x7f16386e4270>, 'output0': <pyservicemaker._pydeepstream.Tensor object at 0x7f16386e42f0>}
<pyservicemaker._pydeepstream.Tensor object at 0x7f16386e41f0> <pyservicemaker._pydeepstream.Tensor object at 0x7f16386e4270>
ERROR: QueueThread:GstInferServImpl internal unexpected error, may cause stop

The two aforementioned error messages occurred in different scenarios: the first appeared in a pipeline built with my own configuration file, while the second occurred in a pipeline built with the configuration file from pipeline_api's test1_app. However, both errors seem to point to torch_tensor = torch.utils.dlpack.from_dlpack(tensor.clone()) or torch_tensor = torch.utils.dlpack.from_dlpack(tensor).to('cpu').
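For reference, here is a consolidated, self-contained sketch of the probe described above (the pyservicemaker import names are assumptions based on the pipeline_api sample, and the pipeline name is made up); both conversion paths in it only fail when the upstream element is nvinferserver:

```python
# Minimal probe sketch; import names and the commented-out pipeline line are
# assumptions based on the pipeline_api test1 sample, not verified API references.
import torch
from pyservicemaker import Pipeline, Probe, BatchMetadataOperator

class Outputtensor(BatchMetadataOperator):
    def handle_metadata(self, batch_meta):
        for frame_meta in batch_meta.frame_items:
            for user_meta in frame_meta.tensor_items:
                for name, tensor in user_meta.as_tensor_output().get_layers().items():
                    # Path 1: clone on the device, then hand the copy to torch via DLPack
                    torch_tensor = torch.utils.dlpack.from_dlpack(tensor.clone())
                    # Path 2: wrap via DLPack first, then copy to the CPU
                    # torch_tensor = torch.utils.dlpack.from_dlpack(tensor).to('cpu')
                    print(name, torch_tensor.shape)

# Hypothetical usage, attaching the probe to the element named "infer" in the YAML:
# Pipeline("deepstream-test1", "dstest1_config.yaml").attach(
#     "infer", Probe("output", Outputtensor())).start().wait()
```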

Could there be an issue with my configuration file? Or am I using the method incorrectly? I would greatly appreciate your help in addressing these questions. Thank you very much!
• Requirement details( This is for new requirement. Including the module name-for which plugin or for which sample application, the function description)

If only the following modification is added, does "clone the tensor or copy it to the CPU" still work fine?

  - type: nvinferserver
    name: infer
    properties:
      config-file-path: /opt/nvidia/deepstream/deepstream/sources/apps/sample_apps/deepstream-test1/dstest1_pgie_nvinferserver_config.txt

dstest1_pgie_nvinferserver_config.txt:

infer_config {
  unique_id: 1
  gpu_ids: [0]
  max_batch_size: 30
  backend {
    inputs: [ {
      name: "input_1:0"
    }]
    outputs: [
      {name: "output_cov/Sigmoid:0"},
      {name: "output_bbox/BiasAdd:0"}
    ]
    triton {
      model_name: "Primary_Detector"
      version: -1
      model_repo {
        root: "../../../../samples/triton_model_repo"
        strict_model_config: true
      }
    }
  }
  preprocess {
    network_format: MEDIA_FORMAT_NONE
    tensor_order: TENSOR_ORDER_LINEAR
    tensor_name: "input_1:0"
    maintain_aspect_ratio: 0
    frame_scaling_hw: FRAME_SCALING_HW_DEFAULT
    frame_scaling_filter: 1
    normalize {
      scale_factor: 0.00392156862745098
      channel_offsets: [0, 0, 0]
    }
  }
  postprocess {
    labelfile_path: "../../../../samples/models/Primary_Detector/labels.txt"
    detection {
      num_detected_classes: 4
      per_class_params {
        key: 0
        value { pre_threshold: 0.4 }
      }
      nms {
        confidence_threshold: 0.2
        topk: 20
        iou_threshold: 0.5
      }
    }
  }
  extra {
    copy_input_to_host_buffers: false
    output_buffer_pool_size: 2
  }
}
input_control {
  process_mode: PROCESS_MODE_FULL_FRAME
  operate_on_gie_id: -1
  interval: 0
}
output_control {
  output_tensor_meta: true
}

dstest1_config.yaml:

deepstream:
  nodes:
  - type: filesrc
    name: filesrc
    properties:
      location: /opt/nvidia/deepstream/deepstream/samples/streams/sample_720p.h264
  - type: h264parse
    name: h264parse
  - type: nvv4l2decoder
    name: nvv4l2decoder
  - type: nvstreammux
    name: nvstreammux
    properties:
      batch-size: 1
      width: 1280
      height: 720
  #- type: nvinfer
  #  name: infer
  #  properties:
  #    config-file-path: /opt/nvidia/deepstream/deepstream/sources/apps/sample_apps/deepstream-test1/dstest1_pgie_config.yml
  #    output-tensor-meta: true
  - type: nvinferserver
    name: infer
    properties:
      config-file-path: /opt/nvidia/deepstream/deepstream/sources/apps/sample_apps/deepstream-test1/dstest1_pgie_nvinferserver_config.txt
  - type: nvvideoconvert
    name: nvvideoconvert
  - type: nvdsosd
    name: nvdsosd
  - type: nvrtspoutsinkbin
    name: sink
    properties:
      sync: false
  - type: sample_video_probe.sample_video_probe
    name: samplevideoprobe
    properties:
      font-size: 15
  edges:
    filesrc: h264parse
    h264parse: nvv4l2decoder
    nvv4l2decoder: nvstreammux
    nvstreammux: infer
    infer: nvvideoconvert
    nvvideoconvert: nvdsosd
    nvdsosd: sink

Probe code:

print(output_layers)
bbox_data = output_layers.pop('output_bbox/BiasAdd:0', None)
conv_data = output_layers.pop('output_cov/Sigmoid:0', None)
print('aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa', bbox_data, conv_data)
n_classes = conv_data.shape[0]
if bbox_data and conv_data:
    bbox_tensor = torch.utils.dlpack.from_dlpack(bbox_data).to('cpu')
    conv_tensor = torch.utils.dlpack.from_dlpack(conv_data).to('cpu')
    print(bbox_tensor, conv_tensor)
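
For context, output_layers in the fragment above is obtained from the tensor user meta in the same way as in the earlier Outputtensor example; here is a minimal sketch of the assumed surrounding code (the class and the process_layers helper are hypothetical names standing in for my converter):

```python
from pyservicemaker import BatchMetadataOperator  # import name assumed from the sample

class TensorLayerProbe(BatchMetadataOperator):
    """Hypothetical wrapper showing where output_layers comes from."""
    def handle_metadata(self, batch_meta):
        for frame_meta in batch_meta.frame_items:
            for user_meta in frame_meta.tensor_items:
                # mapping of layer name -> pyservicemaker Tensor, as shown by print(output_layers)
                output_layers = user_meta.as_tensor_output().get_layers()
                self.process_layers(output_layers)

    def process_layers(self, output_layers):
        # The fragment above (popping 'output_bbox/BiasAdd:0' and 'output_cov/Sigmoid:0'
        # and converting them via DLPack) would go here.
        pass
```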

Run the pipeline:

GST_DEBUG=3 python3 deepstream_test1.py dstest1_config.yaml

Now, I have created the pipeline based on the configuration file you mentioned and used the above code to print the information. However, the same error still occurred:

{'output_cov/Sigmoid:0': <pyservicemaker._pydeepstream.Tensor object at 0x7f20abbf48b0>, 'output_bbox/BiasAdd:0': <pyservicemaker._pydeepstream.Tensor object at 0x7f20abbf48f0>}
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa <pyservicemaker._pydeepstream.Tensor object at 0x7f20abbf48f0> <pyservicemaker._pydeepstream.Tensor object at 0x7f20abbf48b0>
0:00:00.749916441 500 0x7f2084000d80 WARN v4l2bufferpool gstv4l2bufferpool.c:1130:gst_v4l2_buffer_pool_start:<sink-video_encoder:pool:src> Uncertain or not enough buffers, enabling copy threshold
ERROR: QueueThread:GstInferServImpl internal unexpected error, may cause stop
{'output_cov/Sigmoid:0': <pyservicemaker._pydeepstream.Tensor object at 0x7f20abbf48f0>, 'output_bbox/BiasAdd:0': <pyservicemaker._pydeepstream.Tensor object at 0x7f20abbf5570>}
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa <pyservicemaker._pydeepstream.Tensor object at 0x7f20abbf5570> <pyservicemaker._pydeepstream.Tensor object at 0x7f20abbf48f0>
ERROR: QueueThread:GstInferServImpl internal unexpected error, may cause stop

Thanks for sharing! After adding the following code, I am not able to reproduce this issue on dGPU with DS 7.1.
configurations:

  - type: nvinferserver
    name: infer
    properties:
      config-file-path: /opt/nvidia/deepstream/deepstream/sources/apps/sample_apps/deepstream-test1/dstest1_pgie_nvinferserver_config.txt

code:

In ResnetDetectorConverter:
 bbox_tensor = torch.utils.dlpack.from_dlpack(bbox_data).to('cpu')
 torch_tensor = torch.utils.dlpack.from_dlpack(bbox_data.clone())

log-1124.txt (4.1 KB)

Thank you for your response! I think the key to reproducing this issue lies in the fact that I manually modified the code in the main function:

def main(file_path):
    file_ext = os.path.splitext(file_path)[1]
    if file_ext in [".yaml", ".yml"]:
        pipeline = Pipeline(PIPELINE_NAME, file_path)
        # output_tensor_meta = pipeline["infer"].get("output-tensor-meta")
        output_tensor_meta = True
        print('zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz', output_tensor_meta)
        if output_tensor_meta:
            # disabling object meta from nvinfer plugin for using customized tensor converter
            pipeline["infer"].set({"filter-out-class-ids": "0;1;2;3"})
        pipeline.attach("infer", Probe("counter", ObjectCounterMarker(output_tensor_meta))).start().wait()

Since the nvinferserver element does not have an output-tensor-meta property, I hard-coded output_tensor_meta = True so that the code enters the data-analysis branch:

class ObjectCounterMarker(BatchMetadataOperator):
    def __init__(self, output_tensor_meta):
        super().__init__()
        self._use_tensor = output_tensor_meta
        self._postprocessing = ResnetDetectorConverter() if output_tensor_meta else None

    def handle_metadata(self, batch_meta):
        for frame_meta in batch_meta.frame_items:
            vehicle_count = 0
            person_count = 0
            if self._use_tensor:
                for user_meta in frame_meta.tensor_items:

If output-tensor-meta is not set to True in the main function, the line:

output_tensor_meta = pipeline["infer"].get("output-tensor-meta")
# output_tensor_meta = None

will result in output_tensor_meta being None, which prevents the code from reaching the branch where:

bbox_tensor = torch.utils.dlpack.from_dlpack(bbox_data).to('cpu')
torch_tensor = torch.utils.dlpack.from_dlpack(bbox_data.clone())

I apologize for not mentioning this crucial point in the previous discussion!

Based on my last comment, after adding the following modifications, I am able to reproduce this "internal unexpected error, may cause stop" error now. We are investigating.

# in dstest1_pgie_nvinferserver_config.txt
output_control {
  output_tensor_meta: true
}
# in def main of deepstream_test1.py
output_tensor_meta = True

Thank you!

Sorry for the late reply! Here is a workaround.

  1. Enter /opt/nvidia/deepstream/deepstream/sources/gst-plugins/gst-nvinferserver. In attachTensorOutputMeta of gstnvinferserver_meta_utils.cpp, add the following modification, so that the attached tensor meta carries a valid device buffer pointer instead of nullptr.
        meta->out_buf_ptrs_host[i] = bufPtr;
        //meta->out_buf_ptrs_dev[i] = nullptr;   // line 500
        meta->out_buf_ptrs_dev[i] = bufPtr;  // new code
  2. Execute make to generate libnvdsgst_inferserver.so, then run the following commands.
mv /opt/nvidia/deepstream/deepstream/lib/gst-plugins/libnvdsgst_inferserver.so /opt/nvidia/deepstream/deepstream/lib/gst-plugins/libnvdsgst_inferserver.so_ori && \
mv libnvdsgst_inferserver.so /opt/nvidia/deepstream/deepstream/lib/gst-plugins/libnvdsgst_inferserver.so
  3. Run the app again. Here is my test log log-1124.txt (4.1 KB).

Thank you for your work!

I have another question unrelated to the current topic, and I wonder if you can help me with it.
I am using the Python backend in Triton Server, and after processing I return custom JSON data. Now I need to retrieve that JSON data in the Probe. Is this approach feasible, or do I need to modify the nvinferserver source code to accept custom results? Or can I modify the source code of custom_lib to accept custom data?

data = {
    "video_has_board": "ok",
    "video_distance": self.video_distance,
    "video_angle_ok": self.video_angle_ok,
    "focal_length": self.focal_length,
    "normal_vector": self.normal_vector.tolist(),
    "D": self.D.tolist()
}
result = json.dumps(data, ensure_ascii=False, separators=(',', ':'))
out = np.array(result)
out_tensor = pb_utils.Tensor('results', out.astype(output_dtype))
inference_response = pb_utils.InferenceResponse(output_tensors=[out_tensor])
responses.append(inference_response)
return responses
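
For context, here is a minimal sketch of the Triton Python backend model.py around the snippet above (the "results" output name follows my snippet; the model config.pbtxt is assumed to declare it, e.g. as TYPE_STRING, and the JSON payload is reduced to a placeholder):

```python
# Minimal Triton Python backend sketch; the "results" output and its dtype handling
# mirror the snippet above and are assumptions about my model, not a reference implementation.
import json
import numpy as np
import triton_python_backend_utils as pb_utils

class TritonPythonModel:
    def initialize(self, args):
        # Look up the numpy dtype declared for the "results" output in config.pbtxt.
        model_config = json.loads(args["model_config"])
        out_cfg = pb_utils.get_output_config_by_name(model_config, "results")
        self.output_dtype = pb_utils.triton_string_to_numpy(out_cfg["data_type"])

    def execute(self, requests):
        responses = []
        for request in requests:
            data = {"video_has_board": "ok"}  # placeholder for the real JSON payload
            result = json.dumps(data, ensure_ascii=False, separators=(',', ':'))
            out = np.array(result)
            out_tensor = pb_utils.Tensor("results", out.astype(self.output_dtype))
            responses.append(pb_utils.InferenceResponse(output_tensors=[out_tensor]))
        return responses
```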

Pipeline error:

INFO: TritonGrpcBackend id:1 initialized for model: blackboard_main
Segmentation fault (core dumped)

Could you open a new topic? Thanks! Let's focus on the original issue in one topic.

OK, thank you!