Hello,
I am curious when horizontal fusion happens during the graph optimization step of TensorRT. I merged two models into one larger model, and the ops from each submodel are independent of each other. So in theory, identical ops from different submodels should be candidates for horizontal fusion. (My original assumption was that TensorRT topologically sorts the ops and checks every possible horizontal fusion opportunity.)
However, based on the profiling results, the execution plan doesn’t include any horizontal fusion, even when the two submodels are identical. I then checked what happens when the two submodels share one input and found that the first layers of the two submodels do get fused. Therefore, my guess is that TensorRT only considers horizontal fusion when a single op's output is consumed by multiple ops (in other words, when there is a branch). Am I right? Thanks!
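
To be clear about what I mean by horizontal fusion in the shared-input case, here is a minimal pure-Python sketch. It is not TensorRT code and says nothing about how TensorRT implements fusion internally; the helper names `matmul` and `concat_columns` are my own, made up for illustration. It only shows the arithmetic: two matrix multiplies that read the same input give the same results as one wider multiply whose weights are concatenated column-wise.

```python
def matmul(x, w):
    """Multiply a row vector x (length n) by a matrix w (n rows, m columns)."""
    return [sum(x[i] * w[i][j] for i in range(len(x))) for j in range(len(w[0]))]


def concat_columns(w1, w2):
    """Place the columns of w2 to the right of the columns of w1 (same row count)."""
    return [row1 + row2 for row1, row2 in zip(w1, w2)]


# One shared input feeding the first layer of two submodels (illustrative values).
x = [1.0, 2.0]
w1 = [[1.0, 0.0], [0.0, 1.0]]  # first-layer weights of submodel 1
w2 = [[2.0, 3.0], [4.0, 5.0]]  # first-layer weights of submodel 2

# Unfused: two separate multiplies over the same input.
y1 = matmul(x, w1)
y2 = matmul(x, w2)

# "Horizontally fused": one wider multiply, then split the output per submodel.
fused = matmul(x, concat_columns(w1, w2))
f1, f2 = fused[:len(w1[0])], fused[len(w1[0]):]

assert y1 == f1 and y2 == f2
```

This rewrite depends on both multiplies reading the same `x`. With two separate inputs the same concatenation does not apply directly, which may be related to what I observed, but that part is only my guess.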