Why is kHALF mode slower than kFLOAT on TX2?

I compared the inference time of kHALF and kFLOAT modes on the cifar10 and bvlc models. The results are:

Model      kHALF        kFLOAT
cifar10    0.2291 ms    0.2238 ms
bvlc       0.5356 ms    0.3762 ms

kHALF mode is slower than kFLOAT mode on both models. Could anybody help me understand why?
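
To put the gap in numbers, here is a minimal sketch that computes the relative slowdown from the timings in the table above. The helper name `slowdown_percent` and the dictionary layout are illustrative assumptions, not part of TensorRT; only the four timing values come from the measurements.

```python
# Timings in milliseconds, copied from the table above.
timings_ms = {
    "cifar10": {"kHALF": 0.2291, "kFLOAT": 0.2238},
    "bvlc": {"kHALF": 0.5356, "kFLOAT": 0.3762},
}


def slowdown_percent(half_ms, float_ms):
    """Return how much slower kHALF is than kFLOAT, as a percentage."""
    # Hypothetical helper for illustration; a positive value means kHALF is slower.
    return (half_ms / float_ms - 1.0) * 100.0


for model, t in timings_ms.items():
    pct = slowdown_percent(t["kHALF"], t["kFLOAT"])
    print(f"{model}: kHALF is {pct:.1f}% slower than kFLOAT")
# cifar10: kHALF is 2.4% slower than kFLOAT
# bvlc: kHALF is 42.4% slower than kFLOAT
```

The cifar10 difference is small enough that it may be within measurement noise, while the bvlc difference is much larger.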

Any help will be appreciated!



I need your help~

Can anybody help me?


Please remember to maximize your GPU/CPU clock frequencies before benchmarking:

sudo ./jetson_clocks.sh

For your reference, here is some profiling data from when TensorRT 2.1 launched:

GoogLeNet, Max-P    5.6 ms
ResNet-50, Max-P    12.2 ms