Does cuDNN support operator fusion or graph fusion?

I want to improve memory access efficiency when training a network. Is there an API that supports operator fusion to reduce data transfer?


cuDNN 7.6.x has limited fused-ops support.
Please refer to the link below for more details:


Hi, thanks for your answer. I have figured it out.

One more question: do you have a code example for a nonlinear network (e.g. ResNet, GoogLeNet) written in C++ or CUDA from scratch? I am looking into how to implement nonlinear blocks. Thanks.
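
For what it's worth, here is a minimal sketch of the idea, assuming "nonlinear block" means a block with a skip connection as in ResNet, where the output is relu(F(x) + x). It is plain CPU C++ rather than cuDNN or CUDA, and the function names (add_then_relu_unfused, add_relu_fused) are made up for illustration, not part of any library. The unfused version writes the sum to an intermediate buffer and reads it back for the ReLU; the fused version does both in one pass, which is the kind of memory traffic saving that operator fusion is after.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Unfused: elementwise add, then ReLU as a second pass.
// The intermediate buffer `sum` is written once and read once more.
// `branch` stands for F(x), the output of the block's conv path (assumed given).
std::vector<float> add_then_relu_unfused(const std::vector<float>& branch,
                                         const std::vector<float>& skip) {
    assert(branch.size() == skip.size());
    std::vector<float> sum(branch.size());
    for (std::size_t i = 0; i < branch.size(); ++i) {
        sum[i] = branch[i] + skip[i];
    }
    std::vector<float> out(sum.size());
    for (std::size_t i = 0; i < sum.size(); ++i) {
        out[i] = std::max(sum[i], 0.0f);
    }
    return out;
}

// Fused: add and ReLU in a single pass, no intermediate buffer.
// In CUDA, this loop body would roughly become the body of one kernel,
// with `i` derived from the thread and block indices.
std::vector<float> add_relu_fused(const std::vector<float>& branch,
                                  const std::vector<float>& skip) {
    assert(branch.size() == skip.size());
    std::vector<float> out(branch.size());
    for (std::size_t i = 0; i < branch.size(); ++i) {
        out[i] = std::max(branch[i] + skip[i], 0.0f);
    }
    return out;
}
```

Both functions should return the same values for the same inputs; the difference is only in how many times the data passes through memory. A real residual block would compute `branch` with convolution and normalization layers first, which this sketch leaves out.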