When does horizontal fusion happen for TensorRT's graph optimization?


I am curious about when horizontal fusion happens during the graph optimization step of TensorRT. I tried to merge two models into a bigger model, where the ops from each model are independent. In theory, identical ops from the different submodels should be candidates for horizontal fusion. (My original assumption was that TensorRT would topologically sort the ops and check all possible horizontal fusion opportunities.)

But based on the profiling results, the execution plan doesn’t include any horizontal fusion, even when the two submodels are identical. I then checked what happens if the two submodels share one input, and found that the first layer from each model does get fused. Therefore, I guess TensorRT only considers horizontal fusion when one op has multiple consumers (in other words, when there is a branch). Am I right? Thanks!
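The observation can be sketched in plain Python (not TensorRT API calls; the op names and tensor names below are hypothetical): grouping ops by the tensor they read shows why two disjoint submodels expose no candidates, while a shared input creates a branch that can be fused.

```python
from collections import defaultdict

def fusion_candidates(ops):
    """Group ops by the tensor they read; only groups of size >= 2
    (i.e. a branch point) could be horizontally fused."""
    by_input = defaultdict(list)
    for name, tensor in ops:
        by_input[tensor].append(name)
    return {t: names for t, names in by_input.items() if len(names) >= 2}

# Two disjoint submodels: each first layer reads its own input tensor.
disjoint = [("A/conv1", "inputA"), ("B/conv1", "inputB")]
# The shared-input experiment: both first layers read the same tensor.
shared = [("A/conv1", "input"), ("B/conv1", "input")]

print(fusion_candidates(disjoint))  # {} -> no branch, no candidates
print(fusion_candidates(shared))    # {'input': ['A/conv1', 'B/conv1']}
```

This matches the profiling result: without a shared tensor there is no branch point, so nothing qualifies for horizontal fusion.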


Currently, horizontal fusion only happens when two Conv ops or two Gemm ops share the same input, and the two ops have mergeable parameters (e.g., same kernel size, same activation, etc.).
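The rule stated above can be modeled as a simple pairwise check. This is a minimal sketch, not TensorRT's actual internals; the `ConvOp` type, field names, and parameter set are assumptions for illustration:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ConvOp:
    input_tensor: str   # name of the tensor this op reads
    kernel_size: tuple  # e.g. (3, 3)
    activation: str     # e.g. "relu"

def can_fuse_horizontally(a: ConvOp, b: ConvOp) -> bool:
    """Two Conv ops are fusion candidates only if they share an input
    tensor AND their parameters are mergeable (per the rule above)."""
    return (a.input_tensor == b.input_tensor      # shared input (a branch)
            and a.kernel_size == b.kernel_size    # mergeable parameters
            and a.activation == b.activation)

branch_a = ConvOp("input0", (3, 3), "relu")
branch_b = ConvOp("input0", (3, 3), "relu")
other    = ConvOp("input1", (3, 3), "relu")  # separate submodel input

print(can_fuse_horizontally(branch_a, branch_b))  # True
print(can_fuse_horizontally(branch_a, other))     # False
```

So even identical submodels are not fused unless they branch from a common tensor, which is consistent with what you observed.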

Thank you.

