Intermediate outputs on demand


Say I have a classical encoder-decoder segmentation network, U-Net style. Also assume that the model has two outputs: one low-res output and one high-res output. The nature of this model implies that the high-res output depends on the low-res output and therefore has a higher latency.

In PyTorch it is possible to query the low-res output before the high-res output has been computed. I was wondering if this is possible in TensorRT, so you don’t have to wait for the entire computational graph to finish if you are only interested in the low-res output.
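For reference, here is a minimal PyTorch sketch of what I mean. The two-head architecture (`TwoHeadNet`, `forward_low`, `forward_high`) is a toy example, not my actual model: by splitting the forward pass, the low-res output is available before any of the high-res work has run.

```python
import torch
import torch.nn as nn

class TwoHeadNet(nn.Module):
    """Toy encoder-decoder with a low-res and a high-res head
    (hypothetical architecture, for illustration only)."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Conv2d(3, 8, 3, stride=2, padding=1)  # downsample 2x
        self.low_head = nn.Conv2d(8, 1, 1)                      # low-res logits
        self.upsample = nn.ConvTranspose2d(8, 8, 2, stride=2)   # back to full res
        self.high_head = nn.Conv2d(8, 1, 1)                     # high-res logits

    def forward_low(self, x):
        # Run only the part of the graph needed for the low-res output.
        feats = self.encoder(x)
        return feats, self.low_head(feats)

    def forward_high(self, feats):
        # Continue from the cached features to get the high-res output.
        return self.high_head(self.upsample(feats))

net = TwoHeadNet().eval()
x = torch.randn(1, 3, 64, 64)
with torch.no_grad():
    feats, low = net.forward_low(x)   # low-res result available here
    # ... act on `low` immediately ...
    high = net.forward_high(feats)    # high-res computed only if/when needed
```

Here `low` has shape `(1, 1, 32, 32)` and `high` has shape `(1, 1, 64, 64)`; the point is simply that the caller controls when the second half of the graph runs.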


There is no such built-in functionality in TensorRT, but you can try a plugin-based approach and see if it works for you.

Create a plugin that records a CUDA event once the low-res output is ready, and have another stream wait on that event.
Something like:
A -> out1 -> B -> out2  becomes  A -> out1 -> plugin -> B -> out2
I haven’t tried this approach, but it might work.