I constructed a simple network as below:
0: (Unnamed Layer* 0) [Convolution], 0
1: (Unnamed Layer* 1) [Convolution], 0
These are the first two convolution layers of MobileNetV2.
I set the input size to 384x768 and the batch size to 1, and enabled INT8 mode.
The measured inference time is:
3.750966ms on DLA
1.342228ms on GPU
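
To put the gap in numbers, here is a small sketch of the arithmetic in Python. The variable names are just mine for illustration; the two timings are the ones listed above.

```python
# Timings copied from the measurements above, in milliseconds.
dla_ms = 3.750966
gpu_ms = 1.342228

# How many times slower DLA is than the GPU for this two-layer network.
slowdown = dla_ms / gpu_ms

# Absolute extra latency per inference when running on DLA.
extra_ms = dla_ms - gpu_ms

print(f"DLA is {slowdown:.2f}x slower than GPU ({extra_ms:.3f} ms extra per inference)")
```

So DLA takes roughly 2.8 times as long as the GPU for the same two layers.
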
I am very confused by this result. Could you please tell me why DLA is so much slower than the GPU here?
Thanks.