I have a Quadro P4000. When I launch the same workload in both driver modes, WDDM mode shows better performance than TCC mode.
I checked the profiler, and driver latency appears to be greatly reduced in TCC mode; however, kernel execution times are longer in TCC mode for the same kernels.
I checked the SM and memory clocks, and they seem to be at their maximum in both modes.
What could be the reason for the slowdown when using TCC mode?
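
For anyone who wants to reproduce the clock comparison, here is a minimal sketch of how it could be scripted. It assumes `nvidia-smi` is on the PATH and that the script runs while the workload is executing, since idle clocks say little. The helper names `parse_clock_line` and `read_clocks` are hypothetical and only for illustration; this is not necessarily how the numbers above were collected.

```python
import subprocess

# Fields requested from nvidia-smi, in this order.
FIELDS = [
    "driver_model.current",  # "WDDM" or "TCC" on Windows
    "clocks.sm",
    "clocks.max.sm",
    "clocks.mem",
    "clocks.max.mem",
]


def parse_clock_line(line):
    """Parse one CSV line (no header, no units) into a dict.

    Raises ValueError if the field count is wrong or a clock is not an
    integer (for example when the driver reports a clock as unavailable).
    """
    parts = [p.strip() for p in line.split(",")]
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}")
    sm, sm_max, mem, mem_max = (int(p) for p in parts[1:])
    return {
        "driver_model": parts[0],
        "sm_mhz": sm,
        "sm_max_mhz": sm_max,
        "mem_mhz": mem,
        "mem_max_mhz": mem_max,
        "sm_at_max": sm >= sm_max,
        "mem_at_max": mem >= mem_max,
    }


def read_clocks(gpu_index=0):
    """Query the current and maximum clocks of one GPU via nvidia-smi."""
    result = subprocess.run(
        [
            "nvidia-smi",
            f"--id={gpu_index}",
            "--query-gpu=" + ",".join(FIELDS),
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_clock_line(result.stdout.strip().splitlines()[0])


if __name__ == "__main__":
    # Run once in WDDM mode and once in TCC mode, under load, and compare.
    print(read_clocks())
```

Sampling this a few times during a run in each mode would show whether the clocks really stay at maximum for the whole kernel, or only at the moment they were checked.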