Hi all, I want to know how to convert a trained model's weights from FP32 to FP16.
I have been using the sample code from the directory
/usr/src/tensorrt/samples/python/network_api_pytorch_mnist/
The files used are sample.py and model.py
In both files the model's weights are in FP32. To see the effect of reducing the weights to FP16, I now want to convert them from FP32 to FP16 for inference.
I want to know how I can do that. Your early reply will definitely help me.
Dear @trivedi.nagaraj,
Could you try setting the builder config to FP16 mode, i.e. config.set_flag(trt.BuilderFlag.FP16) in build_engine()? You may check the efficientdet and uff_ssd samples for reference.
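For reference, a minimal sketch of what build_engine() could look like with the flag set (the populate_network helper and the overall structure only loosely follow the MNIST sample and are illustrative, not the exact sample code):

import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

def build_engine(weights):
    builder = trt.Builder(TRT_LOGGER)
    network = builder.create_network(
        1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
    config = builder.create_builder_config()
    # Request FP16 kernels; the builder casts the FP32 weights internally.
    if builder.platform_has_fast_fp16:
        config.set_flag(trt.BuilderFlag.FP16)
    populate_network(network, weights)  # same network definition as before
    return builder.build_engine(network, config)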
Hi SivaRamaKrishnan, I have checked the uff_ssd sample files. Yes, the FP16 engine configuration is present there.
But I have another doubt. Referring to the sample program in the directory
/usr/src/tensorrt/samples/python/network_api_pytorch_mnist/
which loads the trained weights in FP32 format: apart from calling
config.set_flag(trt.BuilderFlag.FP16), do we also need to convert the trained weights from FP32 to FP16 before they are fed to the layers of the network? For example,
conv1_w = weights['conv1.weight'].numpy()
conv1_b = weights['conv1.bias'].numpy()
Do the above weights need to be converted to FP16 before being assigned to conv1_w and conv1_b? Please clarify.
Dear @trivedi.nagaraj,
Do we need to convert the above weights to FP16 before being assigned to conv1_w and conv1_b?
No. config.set_flag(trt.BuilderFlag.FP16) generates an FP16 TRT model.
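To make this concrete: in the MNIST sample you can leave the weight-loading lines exactly as they are and only change the builder config. A hedged sketch follows; the layer parameters below just mirror the sample's first convolution and are illustrative:

# Weights stay as FP32 numpy arrays; TensorRT casts them when building the FP16 engine.
conv1_w = weights['conv1.weight'].numpy()
conv1_b = weights['conv1.bias'].numpy()
conv1 = network.add_convolution(input=input_tensor, num_output_maps=20,
                                kernel_shape=(5, 5), kernel=conv1_w, bias=conv1_b)
conv1.stride = (1, 1)
# ... rest of the network as in the sample ...
config.set_flag(trt.BuilderFlag.FP16)  # the only FP16-related change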
OK, thank you. In that case, how about this line:
input_tensor = network.add_input(name=ModelData.INPUT_NAME, dtype=ModelData.DTYPE, shape=ModelData.INPUT_SHAPE)
Here, ModelData.DTYPE is defined as DTYPE = trt.float32.
Do I need to change this to trt.float16 as well? Please clarify.