How to convert trained model weights from FP32 to FP16

Hi all, I want to know how to convert a trained model's weights from FP32 to FP16.
I have been using the sample code from the directory
/usr/src/tensorrt/samples/python/network_api_pytorch_mnist/
The files used are sample.py and model.py.
In both files the model's weights are in FP32. To see the effect of reduced precision, I now want to convert them from FP32 to FP16 for inference.
I want to know how I can do that. An early reply would definitely help me.

Thanks and Regards

Nagaraj Trivedi

Dear @trivedi.nagaraj,
Could you try setting the builder config to FP16 mode with config.set_flag(trt.BuilderFlag.FP16) in build_engine()? You may check the efficientdet and uff_ssd samples for reference.
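For reference, a minimal sketch of what build_engine() could look like with the FP16 flag enabled. This is not the sample verbatim: populate_network() is the network-definition helper from the MNIST sample, and the builder API differs slightly across TensorRT versions (newer releases return a serialized engine via build_serialized_network instead).

import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

def build_engine(weights):
    builder = trt.Builder(TRT_LOGGER)
    network = builder.create_network(
        1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
    config = builder.create_builder_config()
    # Let the builder choose FP16 kernels where the platform supports them.
    if builder.platform_has_fast_fp16:
        config.set_flag(trt.BuilderFlag.FP16)
    populate_network(network, weights)  # define the layers, as in the sample
    return builder.build_engine(network, config)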

OK, thank you. I will refer to them and let you know if I have any further doubts.

Thanks and Regards

Nagaraj Trivedi

Hi SivaRamaKrishnan, I have verified the uff_ssd sample files. Yes, the engine is configured for FP16 there.
But I have another doubt. Referring to the sample program in the directory
/usr/src/tensorrt/samples/python/network_api_pytorch_mnist/
which loads the trained weights in FP32 format: apart from calling
config.set_flag(trt.BuilderFlag.FP16), do we need to convert the trained weights from FP32 to FP16 before they are fed to the layers of the network? For example,
conv1_w = weights['conv1.weight'].numpy()
conv1_b = weights['conv1.bias'].numpy()

Do we need to convert the above weights to FP16 before they are assigned to conv1_w and conv1_b? Please clarify.

Thanks and Regards

Nagaraj Trivedi

Dear @trivedi.nagaraj,
Do we need to convert the above weights to FP16 before they are assigned to conv1_w and conv1_b?
No. config.set_flag(trt.BuilderFlag.FP16) generates an FP16 TRT model; TensorRT converts the FP32 weights itself while building the engine.
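That means the weight extraction in the sample stays untouched. A sketch, where weights, network, and input_tensor are as defined in the sample, and the convolution parameters follow its first layer:

# No .astype(np.float16) is needed -- the weights stay FP32 numpy arrays.
conv1_w = weights['conv1.weight'].numpy()
conv1_b = weights['conv1.bias'].numpy()

# With the FP16 builder flag set on the config, TensorRT casts these
# weights to FP16 internally while it builds the engine.
conv1 = network.add_convolution(input=input_tensor, num_output_maps=20,
                                kernel_shape=(5, 5),
                                kernel=conv1_w, bias=conv1_b)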

OK, thank you. In that case, what about this line:
input_tensor = network.add_input(name=ModelData.INPUT_NAME, dtype=ModelData.DTYPE, shape=ModelData.INPUT_SHAPE)

Here ModelData.DTYPE is trt.float32.

Do I also need to change this to trt.float16? Please clarify.

Thanks and Regards

Nagaraj Trivedi

No. The input and output will remain FP32. The rest of the layers get converted to FP16 automatically.
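So the input definition from the sample can be left exactly as it is:

# ModelData.DTYPE stays trt.float32 -- the engine's input and output
# bindings remain FP32, so the host-side buffers stay np.float32 as well.
input_tensor = network.add_input(name=ModelData.INPUT_NAME,
                                 dtype=ModelData.DTYPE,  # trt.float32
                                 shape=ModelData.INPUT_SHAPE)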
