NSight Queue/Submit delta keeps growing on Pascal GPU

Hi there,

I have some existing code for processing live video that I’m testing on a new GPU, a GTX 1050 Ti.

I’m currently looking at performance in the NSight timeline, and the “Queue / Submit delta” seems to have changed drastically compared with the old GPU (a GTX 960; the code has also run on a GTX 650 Ti).

On the old GPU, the queue length and submit depth spiked at the beginning of each cycle, and quickly returned to zero as commands were consumed.

On the new GPU, the submit depth rises continually for the duration of the process (see image). After about 30 seconds, the submit depth is around 900. There’s a small amount of variation in both the queue length and submit depth superimposed on this continual increase.

I’m not sure what to make of this. Has the meaning of the Queue / Submit delta changed on the more recent GPU models? Is it any cause for concern that the submit depth just keeps on rising? Does this mean that some commands aren’t being executed and just sit in a queue forever?