Hi,
I am doing channel pruning on some networks, but I found that the pruned network is not accelerated by TensorRT.
After profiling with nvprof, I found that most of the convolution layers did not use Tensor Cores. I have made the channel counts divisible by 8 or 16, but the layers still do not use Tensor Cores.
This raises a question: how can I design my network to fully utilize Tensor Cores in TensorRT? I did not find any guidelines, and TensorRT's auto-tuning is really a black box.
My other question: do I really need to make the channel counts divisible by 8 or 16 when converting the model to TensorRT?
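
To make the channel constraint concrete, here is a minimal sketch of what I mean by making the channel counts divisible by 8 or 16 after pruning. The helper name `round_channels` and the example channel counts are illustrative assumptions, not my actual pruning code, and aligning the counts this way does not by itself guarantee that TensorRT picks a Tensor Core kernel.

```python
def round_channels(channels: int, multiple: int = 8) -> int:
    """Round a channel count up to the nearest multiple of `multiple`.

    Hypothetical helper for illustration only.
    """
    if channels <= 0 or multiple <= 0:
        raise ValueError("channels and multiple must be positive")
    # Integer ceiling division, then scale back up to the multiple.
    return ((channels + multiple - 1) // multiple) * multiple


# Made-up channel counts, as a pruning step might leave them.
pruned = [37, 64, 100, 250]

aligned_8 = [round_channels(c, 8) for c in pruned]
aligned_16 = [round_channels(c, 16) for c in pruned]

print(aligned_8)
print(aligned_16)
```

Rounding up keeps a few extra channels compared with the raw pruning result, so it trades a little of the pruning ratio for alignment.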