When I run a model with trtexec in --fp16 mode, the log shows precision: fp16+fp32. Is that because the inputs and outputs are in FP32, or will some nodes actually run in FP32?
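Enabling FP16 allows TensorRT to use FP16 kernels but does not force them. For each layer, TensorRT internally picks the faster kernel among FP16 and FP32. If TRT observes that FP16 is slower than FP32 for some layers, it falls back to FP32 for those layers.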
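To make the selection rule concrete, here is a toy sketch in plain Python. It is not TensorRT code and does not call any TensorRT API; the function name `pick_precisions`, the layer names, and the timings are all made up for illustration. It only shows the idea of "time each allowed precision per layer and keep the fastest".

```python
# Toy model of per-layer kernel selection. NOT TensorRT code.
# Layer names and timings are hypothetical illustration values.

def pick_precisions(timings_ms, allowed=("fp16", "fp32")):
    """Return {layer: fastest allowed precision} given measured timings."""
    chosen = {}
    for layer, per_precision in timings_ms.items():
        # Keep only the precisions the builder is allowed to use.
        candidates = {p: t for p, t in per_precision.items() if p in allowed}
        # Pick the precision with the smallest measured time.
        chosen[layer] = min(candidates, key=candidates.get)
    return chosen

# Hypothetical measurements: FP16 wins for conv1, FP32 wins for softmax.
timings_ms = {
    "conv1": {"fp16": 0.10, "fp32": 0.25},
    "softmax": {"fp16": 0.08, "fp32": 0.05},
}

chosen = pick_precisions(timings_ms)
print(chosen)
```

With these made-up numbers, conv1 ends up in FP16 and softmax in FP32, so the resulting engine is mixed precision even though FP16 was enabled.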
What do I have to do if I want to run all nodes in FP16?