How To Set FP16 Mode When Platform Does Not Support Fast FP16?

Hi all,
I set TensorRT's IBuilder->setFp16Mode(true), but the builder still uses the kFLOAT DataType (I set a breakpoint in one of my plugins and found that the DataType is still kFLOAT).

My GPU is a TITAN X, which does not support fast FP16. But I just want to debug my code; I will eventually run it on a Jetson TX2 (JetPack 3.3).

TensorRT: 4.0
CUDA: 9.0

Hmm, I come here to answer my own question.
After some exploration, I found that because the float kernels are faster than the half kernels on my TITAN X, TensorRT selects the float version. So I forced FP16 by returning false from supportsFormat when the type is kFLOAT.


As you said, TensorRT will select the best format automatically even though you have configured it otherwise (set HALF mode but it still uses FLOAT)? And do you know why FLOAT mode is faster than HALF? I ran into this problem too.
Thanks a lot.

Because there is no fast half-precision arithmetic hardware in the TITAN X's SMs.