How to utilize Tensor Cores when designing a network

Hi,

I am doing channel pruning on some networks, but I found that the pruned network was not accelerated by TensorRT.

After profiling with nvprof, I found that most of the convolution layers did not use Tensor Cores. I have made the channel counts divisible by 8 or 16, but the layers still can't use Tensor Cores.
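
To make the channel rounding concrete, here is a minimal sketch of aligning pruned channel counts to a multiple of 8. The helper name round_channels and the example channel counts are hypothetical, for illustration only, and are not taken from any pruning library or from TensorRT.

```python
def round_channels(channels: int, multiple: int = 8) -> int:
    """Round a pruned channel count up to the nearest multiple.

    Hypothetical helper: the name and the default multiple of 8 are
    assumptions chosen for this illustration.
    """
    if channels <= 0 or multiple <= 0:
        raise ValueError("channels and multiple must be positive")
    # Integer ceiling division, then scale back up to the multiple.
    return ((channels + multiple - 1) // multiple) * multiple


# Made-up per-layer channel counts left after pruning.
pruned = [37, 64, 100, 250]
aligned = [round_channels(c, 8) for c in pruned]
print(aligned)  # [40, 64, 104, 256]
```

Passing multiple=16 gives the 16-aligned variant. Rounding up keeps every channel the pruning step decided to retain, at the cost of a few extra channels per layer.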

This raises a question: how can I design my network to fully utilize Tensor Cores in TensorRT? I did not find any guidelines, and TensorRT's auto-tuning is really a black box.

My other question is: do I really need to make the channel counts divisible by 8 or 16 when converting the model to TensorRT?

Hi,

Please refer to the following documentation, which may help you.

Thank you.