DeepStream Python works with gRPC but gets stuck when using model_repo

Please provide complete information as applicable to your setup.

• Hardware Platform: GPU
• DeepStream Version: 6.2
• TensorRT Version: 8.5.2-1+cuda11.8
• NVIDIA GPU Driver Version (valid for GPU only): 525.105.17
• Issue Type (questions, new requirements, bugs): questions/bugs

I have a DeepStream Python app (with a pgie and one sgie) that works well when the config is:

backend {
  triton {
    model_name: "yolov8_nms_tensorrt"
    version: -1
    grpc {
      url: "127.0.0.1:8001"
      enable_cuda_buffer_sharing: true
    }
  }
}

but freezes after a few frames when the config is:

backend {
  triton {
    model_name: "yolov8_nms_tensorrt"
    version: -1
    model_repo {
      root: "/opt/nvidia/deepstream/deepstream-6.2/people-app/model_repo"
      log_level: 2
      strict_model_config: 1
    }
  }
}

Am I doing something wrong here, or missing something?
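(For context: in model_repo/Triton CAPI mode with strict_model_config: 1, Triton expects an explicit config.pbtxt under <root>/yolov8_nms_tensorrt/, with the engine at <root>/yolov8_nms_tensorrt/1/model.plan. A minimal sketch matching the tensor names and shapes that appear in the Triton logs later in this thread; the actual file contents here are an assumption, not the poster's configuration:

name: "yolov8_nms_tensorrt"
platform: "tensorrt_plan"
max_batch_size: 1
input [
  { name: "images", data_type: TYPE_FP32, dims: [ 3, 640, 640 ] }
]
output [
  { name: "num_dets", data_type: TYPE_INT32, dims: [ 1 ] },
  { name: "bboxes", data_type: TYPE_FP32, dims: [ 100, 4 ] },
  { name: "scores", data_type: TYPE_FP32, dims: [ 100 ] },
  { name: "labels", data_type: TYPE_INT32, dims: [ 100 ] }
]
)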

To narrow down this issue, can you do the following checks:

  1. Can the application run fine using only the pgie?
  2. About "but freezes after a few frames": can you see the output result?
  3. Could you share more logs? Please run "export GST_DEBUG=6" first to raise GStreamer's log level, then run again and redirect the logs to a file (see the sketch after this list).
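A minimal sketch of capturing that trace (the entry point deepstream_app.py is an assumption; substitute your actual launch command):

export GST_DEBUG=6
python3 deepstream_app.py > gst_debug.log 2>&1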

@fanzh

  1. No, it cannot.
  2. I set log_level to 3 in:

backend {
  triton {
    model_name: "yolov8_nms_tensorrt"
    version: -1
    model_repo {
      root: "/opt/nvidia/deepstream/deepstream-6.2/people-app/model_repo"
      log_level: 3
      strict_model_config: 1
    }
  }
}

The logs are as follows:

I0530 06:21:06.526065 52 tensorrt.cc:5711] model yolov8_nms_tensorrt, instance yolov8_nms_tensorrt, executing 1 requests
I0530 06:21:06.526095 52 tensorrt.cc:1736] TRITONBACKEND_ModelExecute: Issuing yolov8_nms_tensorrt with 1 requests
I0530 06:21:06.526107 52 tensorrt.cc:1795] TRITONBACKEND_ModelExecute: Running yolov8_nms_tensorrt with 1 requests
I0530 06:21:06.526132 52 tensorrt.cc:2925] Optimization profile default [0] is selected for yolov8_nms_tensorrt
I0530 06:21:06.526210 52 tensorrt.cc:2299] Context with profile default [0] is being executed for yolov8_nms_tensorrt
I0530 06:21:06.527609 52 infer_response.cc:167] add response output: output: num_dets, type: INT32, shape: [1,1]
I0530 06:21:06.527662 52 infer_response.cc:167] add response output: output: bboxes, type: FP32, shape: [1,100,4]
I0530 06:21:06.527691 52 infer_response.cc:167] add response output: output: scores, type: FP32, shape: [1,100]
I0530 06:21:06.527716 52 infer_response.cc:167] add response output: output: labels, type: INT32, shape: [1,100]
I0530 06:21:06.529633 52 tensorrt.cc:2782] TRITONBACKEND_ModelExecute: model yolov8_nms_tensorrt released 1 requests
I0530 06:21:06.567000 52 infer_request.cc:713] [request id: 1444] prepared: [0x0x7efbdc00a960] request id: 1444, model: yolov8_nms_tensorrt, requested version: -1, actual version: 1, flags: 0x0, correlation id: 0, batch size: 1, priority: 0, timeout (us): 0
original inputs:
[0x0x7efbdc009118] input: images, type: FP32, original shape: [1,3,640,640], batch + shape: [1,3,640,640], shape: [3,640,640]
override inputs:
inputs:
[0x0x7efbdc009118] input: images, type: FP32, original shape: [1,3,640,640], batch + shape: [1,3,640,640], shape: [3,640,640]
original requested outputs:
bboxes
labels
num_dets
scores
requested outputs:
bboxes
labels
num_dets
scores

I0530 06:21:06.567060 52 tensorrt.cc:5711] model yolov8_nms_tensorrt, instance yolov8_nms_tensorrt, executing 1 requests
I0530 06:21:06.567083 52 tensorrt.cc:1736] TRITONBACKEND_ModelExecute: Issuing yolov8_nms_tensorrt with 1 requests
I0530 06:21:06.567095 52 tensorrt.cc:1795] TRITONBACKEND_ModelExecute: Running yolov8_nms_tensorrt with 1 requests
I0530 06:21:06.567110 52 tensorrt.cc:2925] Optimization profile default [0] is selected for yolov8_nms_tensorrt
I0530 06:21:06.567170 52 tensorrt.cc:2299] Context with profile default [0] is being executed for yolov8_nms_tensorrt
I0530 06:21:06.570395 52 infer_response.cc:167] add response output: output: num_dets, type: INT32, shape: [1,1]
I0530 06:21:06.570442 52 infer_response.cc:167] add response output: output: bboxes, type: FP32, shape: [1,100,4]
I0530 06:21:06.570469 52 infer_response.cc:167] add response output: output: scores, type: FP32, shape: [1,100]
I0530 06:21:06.570488 52 infer_response.cc:167] add response output: output: labels, type: INT32, shape: [1,100]

**PERF: {'stream0': 5.19}

**PERF: {'stream0': 0.0}

**PERF: {'stream0': 0.0}

**PERF: {'stream0': 0.0}

**PERF: {'stream0': 0.0}

After this everything just freezes and I just repeatedly get **PERF: {'stream0': 0.0} (see the probe sketch after this list).

  3. Will share in a while.
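(To localize where the pipeline stalls once **PERF drops to 0.0, a buffer probe just downstream of the pgie can show whether frames are still moving. A minimal sketch for the Python app; the element name "primary-inference" is an assumption, so use whatever name the pipeline actually assigns:

import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst

frame_count = {"n": 0}

def pgie_src_probe(pad, info, user_data):
    # Count buffers leaving the inference element; if this stops increasing
    # while **PERF reports 0.0, the stall is at or before this element.
    frame_count["n"] += 1
    if frame_count["n"] % 30 == 0:
        print("buffers past pgie src pad:", frame_count["n"])
    return Gst.PadProbeReturn.OK

# pgie = pipeline.get_by_name("primary-inference")  # element name is an assumption
# pgie.get_static_pad("src").add_probe(Gst.PadProbeType.BUFFER, pgie_src_probe, None)
)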

@fanzh

  3. It generates a huge file, so I am unable to upload it here, but the last 20 lines are as follows. It seems like there is an issue with the RTSP output when using model_repo?

0:05:32.601294471 24 0x2f3eaa0 DEBUG rtspsrc gstrtspsrc.c:5617:gst_rtspsrc_loop_udp: timeout, sending keep-alive
0:05:32.601308870 24 0x2f3eaa0 DEBUG rtspsrc gstrtspsrc.c:5190:gst_rtspsrc_send_keep_alive: creating server keep-alive
0:05:32.601393168 24 0x2f3eaa0 DEBUG rtspsrc gstrtspsrc.c:5597:gst_rtspsrc_loop_udp: doing receive with timeout 54 seconds
0:05:32.626421092 24 0x2f3eaa0 DEBUG rtspsrc gstrtspsrc.c:5610:gst_rtspsrc_loop_udp: we received a server message
0:05:32.626442992 24 0x2f3eaa0 DEBUG rtspsrc gstrtspsrc.c:5653:gst_rtspsrc_loop_udp: ignoring response message
0:05:32.626449492 24 0x2f3eaa0 LOG rtspsrc gstrtspsrc.c:9408:gst_rtspsrc_print_rtsp_message: --------------------------------------------
0:05:32.626456691 24 0x2f3eaa0 LOG rtspsrc gstrtspsrc.c:9430:gst_rtspsrc_print_rtsp_message: RTSP response message 0x7faa0905cda0
0:05:32.626478791 24 0x2f3eaa0 LOG rtspsrc gstrtspsrc.c:9431:gst_rtspsrc_print_rtsp_message: status line:
0:05:32.626485490 24 0x2f3eaa0 LOG rtspsrc gstrtspsrc.c:9432:gst_rtspsrc_print_rtsp_message: code: '200'
0:05:32.626491090 24 0x2f3eaa0 LOG rtspsrc gstrtspsrc.c:9433:gst_rtspsrc_print_rtsp_message: reason: 'OK'
0:05:32.626498490 24 0x2f3eaa0 LOG rtspsrc gstrtspsrc.c:9434:gst_rtspsrc_print_rtsp_message: version: '1.0
0:05:32.626503090 24 0x2f3eaa0 LOG rtspsrc gstrtspsrc.c:9436:gst_rtspsrc_print_rtsp_message: headers:
0:05:32.626511490 24 0x2f3eaa0 LOG rtspsrc gstrtspsrc.c:9391:dump_key_value: key: 'CSeq', value: '10'
0:05:32.626518489 24 0x2f3eaa0 LOG rtspsrc gstrtspsrc.c:9391:dump_key_value: key: 'Date', value: 'Tue, May 30 2023 11:45:52 GMT'
0:05:32.626524489 24 0x2f3eaa0 LOG rtspsrc gstrtspsrc.c:9391:dump_key_value: key: 'Session', value: '1269BF1E'
0:05:32.626532289 24 0x2f3eaa0 LOG rtspsrc gstrtspsrc.c:9391:dump_key_value: key: 'Content-Length', value: '10'
0:05:32.626537989 24 0x2f3eaa0 LOG rtspsrc gstrtspsrc.c:9439:gst_rtspsrc_print_rtsp_message: body: length 11
0:05:32.626544189 24 0x2f3eaa0 LOG rtspsrc gstrtspsrc.c:9442:gst_rtspsrc_print_rtsp_message: 2014.02.04(11)
0:05:32.626555288 24 0x2f3eaa0 LOG rtspsrc gstrtspsrc.c:9500:gst_rtspsrc_print_rtsp_message: --------------------------------------------
0:05:32.626561588 24 0x2f3eaa0 DEBUG rtspsrc gstrtspsrc.c:5597:gst_rtspsrc_loop_udp: doing receive with timeout 54 seconds

  1. I replaced my pgie with PeopleNet and still the same thing happens:

I0530 08:32:17.149744 439 tensorrt.cc:5711] model peoplenet_tao, instance peoplenet_tao, executing 1 requests
I0530 08:32:17.149782 439 tensorrt.cc:1736] TRITONBACKEND_ModelExecute: Issuing peoplenet_tao with 1 requests
I0530 08:32:17.149796 439 tensorrt.cc:1795] TRITONBACKEND_ModelExecute: Running peoplenet_tao with 1 requests
I0530 08:32:17.149820 439 tensorrt.cc:2925] Optimization profile default [0] is selected for peoplenet_tao
I0530 08:32:17.149897 439 tensorrt.cc:2299] Context with profile default [0] is being executed for peoplenet_tao
I0530 08:32:17.150334 439 infer_response.cc:167] add response output: output: output_bbox/BiasAdd, type: FP32, shape: [4,12,34,60]
I0530 08:32:17.150383 439 infer_response.cc:167] add response output: output: output_cov/Sigmoid, type: FP32, shape: [4,3,34,60]
I0530 08:32:17.154146 439 tensorrt.cc:2782] TRITONBACKEND_ModelExecute: model peoplenet_tao released 1 requests
I0530 08:32:17.276134 439 infer_request.cc:713] [request id: 305] prepared: [0x0x7fca74008ed0] request id: 305, model: peoplenet_tao, requested version: -1, actual version: 1, flags: 0x0, correlation id: 0, batch size: 4, priority: 0, timeout (us): 0
original inputs:
[0x0x7fca740089f8] input: input_1, type: FP32, original shape: [4,3,544,960], batch + shape: [4,3,544,960], shape: [3,544,960]
override inputs:
inputs:
[0x0x7fca740089f8] input: input_1, type: FP32, original shape: [4,3,544,960], batch + shape: [4,3,544,960], shape: [3,544,960]
original requested outputs:
output_bbox/BiasAdd
output_cov/Sigmoid
requested outputs:
output_bbox/BiasAdd
output_cov/Sigmoid

I0530 08:32:17.276238 439 tensorrt.cc:5711] model peoplenet_tao, instance peoplenet_tao, executing 1 requests
I0530 08:32:17.276262 439 tensorrt.cc:1736] TRITONBACKEND_ModelExecute: Issuing peoplenet_tao with 1 requests
I0530 08:32:17.276273 439 tensorrt.cc:1795] TRITONBACKEND_ModelExecute: Running peoplenet_tao with 1 requests
I0530 08:32:17.276287 439 tensorrt.cc:2925] Optimization profile default [0] is selected for peoplenet_tao
I0530 08:32:17.276348 439 tensorrt.cc:2299] Context with profile default [0] is being executed for peoplenet_tao
I0530 08:32:17.280725 439 infer_response.cc:167] add response output: output: output_bbox/BiasAdd, type: FP32, shape: [4,12,34,60]

There has been no update from you for a while, so we are assuming this is no longer an issue.
Hence we are closing this topic. If you need further support, please open a new one.
Thanks

  1. Can you reproduce the hang using the DeepStream sample source4_1080p_dec_infer-resnet_tracker_sgie_tiled_display_int8.txt without modification?
  2. The log shows no error output; please share the whole logs. You can upload a zip file.
  3. Can you use gdb to get a full call stack? (A sketch follows this list.)
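For point 3, a minimal sketch of dumping every thread's stack from the hung process (the pgrep pattern deepstream_app.py is an assumption; use the actual PID of your app):

gdb -batch -p $(pgrep -f deepstream_app.py) \
    -ex "set pagination off" \
    -ex "thread apply all bt" > gdb_full_bt.log 2>&1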

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.