Pipelines fails to push into python appsink after introduction of tensorrt/nvinfer

What video decoder are you using for the pipeline? Can you post the pipeline graph here?

@Fiona.Chen
My bad, I messed up the log creation above
I’ll attach you a (almost full, but slightly redacted) log, with the GST_DEBUG you wanted + some indications from when Python was invoked, both the pad and the appsink.
Note that here I’ve made sure that the appsink is fully blocking, so no async processing with appsink or tensorrt

out.txt (842.1 KB)

The “e pulled” logs are the “sample pulled” logs from inside the appsink and note that the end of the logs go
“end of python” - last appsink exiting
“hello pad” - a pad probe I’ve added for debugging after this issue appears succeeds
but then not the expected "sample pulled for going into the appsink, instead the decoder hits again

The log looks fine. I don’t know what "but then not the expected “sample pulled for going into the appsink, instead the decoder hits again” mean?

Can you tell me which lines in the log shows "but then not the expected “sample pulled for going into the appsink, instead the decoder hits again”?

Sure, Here are the last couple of iterations in the pipeline. I’m numbering the appsinks according to iteration number (logs start at the end of appsink 0). I’ve CAPITALISED all of my comments, and they’rall on python print statements

end of python - END OF ONE APPSINK 
 1452e[00m 0x7f6650028050 e[37mDEBUG  e[00m e[00m        v4l2videodec gstv4l2videodec.c:1277:gst_v4l2_video_dec_handle_frame:<nvv4l2decoder0>e[00m Handling frame 443
0:00:55.114987577 e[331m 1452e[00m 0x7f662d65cd90 e[33;01mLOG    e[00m e[00m        v4l2videodec gstv4l2videodec.c:1124:gst_v4l2_video_dec_loop:<nvv4l2decoder0>e[00m Process output buffer
0:00:55.115342670 e[331m 1452e[00m 0x7f662d65cd90 e[33;01mLOG    e[00m e[00m         nvstreammux gstnvstreammux.c:362:gst_nvstreammux_chain:<Stream-muxer>e[00m Got buffer 0x7f662c0c8180 from source 0
0:00:55.115465645 e[331m 1452e[00m 0x7f662d65cd90 e[33;01mLOG    e[00m e[00m        v4l2videodec gstv4l2videodec.c:1102:gst_v4l2_video_dec_loop:<nvv4l2decoder0>e[00m Allocate output buffer
0:00:55.115475643 e[331m 1452e[00m 0x7f662c054590 e[37mTRACE  e[00m e[00m         nvstreammux gstnvstreammux.c:1572:gst_nvstreammux_src_collect_buffers:<Stream-muxer>e[00m SETTING CUDA DEVICE = 0 in nvstreammux  func=gst_nvstreammux_src_collect_buffers

0:00:55.115529802 e[331m 1452e[00m 0x7f662c054590 e[37mDEBUG  e[00m e[00m         nvstreammux gstnvstreammux.c:2196:gst_nvstreammux_src_push_loop:<Stream-muxer>e[00m Pushing buffer 0x7f662c0c84b0, batch size 1, PTS 0:00:35.526529880
0:00:55.115540112 e[331m 1452e[00m 0x7f662c054590 e[37mDEBUG  e[00m e[00m         nvstreammux gstnvstreammux.c:2218:gst_nvstreammux_src_push_loop:<Stream-muxer>e[00m STREAMMUX OUT BUFFER attached timestamp 0:00:35.526529880
hello pad - A PAD PROBE TEST ON THE YOLO SINK PAD
Inside pull sample - APPSINK ENTRYPOINT
sample pulled - APPSINK SUCCESSFULLY OBTAINED GST_BUFFER
2020-11-19 10:10:24,000 root         DEBUG    Gstreamer PTS: 0:00:35.283822
2020-11-19 10:10:24,001 root         DEBUG    Num frames in nvstreammux:1
0:00:55.183927826 e[331m 1452e[00m 0x7f6650028050 e[37mDEBUG  e[00m e[00m        v4l2videodec gstv4l2videodec.c:1277:gst_v4l2_video_dec_handle_frame:<nvv4l2decoder0>e[00m Handling frame 444
0:00:55.186687589 e[331m 1452e[00m 0x7f662d65cd90 e[33;01mLOG    e[00m e[00m        v4l2videodec gstv4l2videodec.c:1124:gst_v4l2_video_dec_loop:<nvv4l2decoder0>e[00m Process output buffer
0:00:55.187012020 e[331m 1452e[00m 0x7f662d65cd90 e[33;01mLOG    e[00m e[00m         nvstreammux gstnvstreammux.c:362:gst_nvstreammux_chain:<Stream-muxer>e[00m Got buffer 0x7f662d66620:00:55.206716258 e[331m 1452e[00m 0x7f662d65cd90 e[33;01mLOG    e[00m e[00m        v4l2videodec gstv4l2videodec.c:1102:gst_v4l2_video_dec_loop:<nvv4l2decoder0>e[00m Allocate output buffer
end of python -END OF THE SECOND APPSINK
0:00:55.207905338 e[331m 1452e[00m 0x7f662c054590 e[37mTRACE  e[00m e[00m         nvstreammux gstnvstreammux.c:1572:gst_nvstreammux_src_collect_buffers:<Stream-muxer>e[00m SETTING CUDA DEVICE = 0 in nvstreammux  func=gst_nvstreammux_src_collect_buffers

0:00:55.207956737 e[331m 1452e[00m 0x7f662c054590 e[37mDEBUG  e[00m e[00m         nvstreammux gstnvstreammux.c:2196:gst_nvstreammux_src_push_loop:<Stream-muxer>e[00m Pushing buffer 0x7f6650054360, batch size 1, PTS 0:00:35.618957833
0:00:55.207966810 e[331m 1452e[00m 0x7f662c054590 e[37mDEBUG  e[00m e[00m         nvstreammux gstnvstreammux.c:2218:gst_nvstreammux_src_push_loop:<Stream-muxer>e[00m STREAMMUX OUT BUFFER attached timestamp 0:00:35.618957833
hello pad - SAME TEST PAD AS ABOVE
sample pulled - INSIDE APPSINK, successfull pull (I seem to have accidentally deleted previous log line)
2020-11-19 10:10:24,097 root         DEBUG    Gstreamer PTS: 0:00:35.361247
2020-11-19 10:10:24,098 root         DEBUG    Num frames in nvstreammux:1
0:00:55.264765882 e[331m 1452e[00m 0x7f6650028050 e[37mDEBUG  e[00m e[00m        v4l2videodec gstv4l2videodec.c:1277:gst_v4l2_video_dec_handle_frame:<nvv4l2decoder0>e[00m Handling frame 445
0:00:55.265107193 e[331m 1452e[00m 0x7f662d65cd90 e[33;01mLOG    e[00m e[00m        v4l2videodec gstv4l2videodec.c:1124:gst_v4l2_video_dec_loop:<nvv4l2decoder0>e[00m Process output buffer
0:00:55.265243073 e[331m 1452e[00m 0x7f662d65cd90 e[33;01mLOG    e[00m e[00m         nvstreammux gstnvstreammux.c:362:gst_nvstreammux_chain:<Stream-muxer>e[00m Got buffer 0x7f66500a230:00:55.304396534 e[331m 1452e[00m 0x7f662c054590 e[37mTRACE  e[00m e[00m         nvstreammux gstnvstreammux.c:1572:gst_nvstreammux_src_collect_buffers:<Stream-muxer>e[00m SETTING CUDA DEVICE = 0 in nvstreammux  func=gst_nvstreammux_src_collect_buffers

0:00:55.304558162 e[331m 1452e[00m 0x7f662c054590 e[37mTRACE  e[00m e[00m         nvstreammux gstnvstreammux.c:1572:gst_nvstreammux_src_collect_buffers:<Stream-muxer>e[00m SETTING CUDA DEVICE = 0 in nvstreammux  func=gst_nvstreammux_src_collect_buffers

0:00:55.304569642 e[331m 1452e[00m 0x7f662d65cd90 e[33;01mLOG    e[00m e[00m        v4l2videodec gstv4l2videodec.c:1102:gst_v4l2_video_dec_loop:<nvv4l2decoder0>e[00m Allocate output buffer
0:00:55.304612953 e[331m 1452e[00m 0x7f662c054590 e[37mDEBUG  e[00m e[00m         nvstreammux gstnvstreammux.c:2196:gst_nvstreammux_src_push_loop:<Stream-muxer>e[00m Pushing buffer 0x7f662c0c86d0, batch size 1, PTS 0:00:35.715613419
0:00:55.304622075 e[331m 1452e[00m 0x7f662c054590 e[37mDEBUG  e[00m e[00m         nvstreammux gstnvstreammux.c:2218:gst_nvstreammux_src_push_loop:<Stream-muxer>e[00m STREAMMUX OUT BUFFER attached timestamp 0:00:35.715613419
end of python - END OF THE THIRD APPSINK
hello pad - PAD PROBE TEST AGAIN ON YOLO SINK PAD
 1452e[00m 0x7f6650028050 e[37mDEBUG  e[00m e[00m        v4l2videodec gstv4l2videodec.c:1277:gst_v4l2_video_dec_handle_frame:<nvv4l2decoder0>e[00m Handling frame 446
0:00:55.346068823 e[331m 1452e[00m 0x7f662d65cd90 e[33;01mLOG    e[00m e[00m        v4l2videodec gstv4l2videodec.c:1124:gst_v4l2_video_dec_loop:<nvv4l2decoder0>e[00m Process output buffer
0:00:55.346185086 e[331m 1452e[00m 0x7f662d65cd90 e[33;01mLOG    e[00m e[00m         nvstreammux gstnvstreammux.c:362:gst_nvstreammux_chain:<Stream-muxer>e[00m Got buffer 0x7f662c0c8b10 from source 0

Now, after this, I’d expect to re-enter the appsink. But it’s not happening, that’s what I mean.

So the last bit is (copied from above)

end of python - END OF THE THIRD APPSINK
hello pad - PAD PROBE TEST AGAIN ON YOLO SINK PAD
 1452e[00m 0x7f6650028050 e[37mDEBUG  e[00m e[00m        v4l2videodec gstv4l2videodec.c:1277:gst_v4l2_video_dec_handle_frame:<nvv4l2decoder0>e[00m Handling frame 446
0:00:55.346068823 e[331m 1452e[00m 0x7f662d65cd90 e[33;01mLOG    e[00m e[00m        v4l2videodec gstv4l2videodec.c:1124:gst_v4l2_video_dec_loop:<nvv4l2decoder0>e[00m Process output buffer
0:00:55.346185086 e[331m 1452e[00m 0x7f662d65cd90 e[33;01mLOG    e[00m e[00m         nvstreammux gstnvstreammux.c:362:gst_nvstreammux_chain:<Stream-muxer>e[00m Got buffer 0x7f662c0c8b10 from source 0

where you see after the “hello pad” line we get another log from v4l2videodec, which is what I meant by (video decoder hits again).

The expected log here would be what happens in the beginning of the log:
end of python - END OF ONE APPSINK
1452e[00m 0x7f6650028050 e[37mDEBUG e[00m e[00m v4l2videodec gstv4l2videodec.c:1277:gst_v4l2_video_dec_handle_frame:e[00m Handling frame 443
0:00:55.114987577 e[331m 1452e[00m 0x7f662d65cd90 e[33;01mLOG e[00m e[00m v4l2videodec gstv4l2videodec.c:1124:gst_v4l2_video_dec_loop:e[00m Process output buffer
0:00:55.115342670 e[331m 1452e[00m 0x7f662d65cd90 e[33;01mLOG e[00m e[00m nvstreammux gstnvstreammux.c:362:gst_nvstreammux_chain:e[00m Got buffer 0x7f662c0c8180 from source 0
0:00:55.115465645 e[331m 1452e[00m 0x7f662d65cd90 e[33;01mLOG e[00m e[00m v4l2videodec gstv4l2videodec.c:1102:gst_v4l2_video_dec_loop:e[00m Allocate output buffer
0:00:55.115475643 e[331m 1452e[00m 0x7f662c054590 e[37mTRACE e[00m e[00m nvstreammux gstnvstreammux.c:1572:gst_nvstreammux_src_collect_buffers:e[00m SETTING CUDA DEVICE = 0 in nvstreammux func=gst_nvstreammux_src_collect_buffers

0:00:55.115529802 e[331m 1452e[00m 0x7f662c054590 e[37mDEBUG  e[00m e[00m         nvstreammux gstnvstreammux.c:2196:gst_nvstreammux_src_push_loop:<Stream-muxer>e[00m Pushing buffer 0x7f662c0c84b0, batch size 1, PTS 0:00:35.526529880
0:00:55.115540112 e[331m 1452e[00m 0x7f662c054590 e[37mDEBUG  e[00m e[00m         nvstreammux gstnvstreammux.c:2218:gst_nvstreammux_src_push_loop:<Stream-muxer>e[00m STREAMMUX OUT BUFFER attached timestamp 0:00:35.526529880
hello pad - A PAD PROBE TEST ON THE YOLO SINK PAD
Inside pull sample - APPSINK ENTRYPOINT

ALl these logs between end of python might be missing, and also you cansee that we go from “hello pad” to “inside pull sample”, i.e. the pad probe of the yolo model is just before the entrypoint of the appsink, which is what we’d expect, since the pipeline end with yolo-model -> appsink

I see it. What happens after 0:00:55.346185086? Is nvstreammux log still output?

No logs from nvstreammux, or the decoder after this point (nor GST_BUFFER!)

only logs after this point are from rtspsrc and gst_memory at DEBUG level if you chose to log them.
As I said before, it allocates memory, but does not free it. I.e. i grabs rtsp packets but it doesnt send it further in the pipeline.

To be even clearer: All the logs I’ve posted stop at the point I stopped showing you the logs.
There are no lines of logging after the ones I’ve posted

So that means no frame data from rtsp source after that. Have you checked rtspsrc log? Please only enable rtspsrc log with “GST_DEBUG=rtspsrc:5” and check the log.

No, you misunderstand me. There is rtsp data after that (and my pipeline without tensorrt works fine).

I can record that log for you but I’ve got no good place to stop it because it will not stop, and it records a lot of logs…

Can you set “interval=10” for your nvinfer plugin and try?

Eventually it still fails, just the python logs ends up being:

end of python
hello pad
Inside pull sample
sample pulled
2020-11-19 11:40:47,576 root         DEBUG    Gstreamer PTS: 0:00:20.135236
2020-11-19 11:40:47,577 root         DEBUG    Num frames in nvstreammux:1
end of python
hello pad
Inside pull sample
sample pulled
2020-11-19 11:40:47,656 root         DEBUG    Gstreamer PTS: 0:00:20.215272
2020-11-19 11:40:47,657 root         DEBUG    Num frames in nvstreammux:1
end of python
hello pad
hello pad
hello pad
hello pad

What are you doing with the sample buffer in appsink? Will it take long time to do the processing? Will this happen with a file source(a long video)?

Can you share your model and script for us to reproduce the problem?

We’re running pytorch models, extracting some data from tensorrt and sending it off to a different server.
But the issue is reproduced even with a dummy appsink (it doesnt even have to access the sample).

I’ve not tried on a T4 with more than maybe 5 minutes of video.

You just have to make an appisnk, say
appsink = Gst.ElementFactory.make(“appsink”, “appsink”)
appsink.set_property(‘emit-signals’, True)
appsink.set_property(‘enable-last-sample’, False)
appsink.set_property(‘max-buffers’, 10)
appsink.set_property(‘sync’, False)
appsink.connect(‘new-sample’, appsink_function)

where appsink_function is:

def appsink_function(buffer):
print(“inside appsink”)
return Gst.FlowReturn.OK

I’ll upload a full script in a few minutes to reproduce

Any preferred channel for uploading the model?
It’s really just the standard yolov4 model (we also have a custom one, but standard reproduces) from the Nvidia github repo I wrote somewhere in the first post.

Is Github OK for you?

@Fiona.Chen

Sorry, got a bit delayed

I’ve got the code here, I just need to find a way of uploading the engine since it seems to be to big for my github.

ok, they’re in the github repo now over git lfs

Hi @Fiona.Chen

I just wanted to confirm that you had reproduced the issue with what I provided, or if I have to provide any more info.

Thanks

I can’t use your engine on my T4.

I’ve uploaded the onnx model used, and a script with the trtexec options used, if you could try that.