Using NVIDIA Apex on a model ensemble

I’m trying to use Apex automatic mixed precision (AMP) on an ensemble of two models connected serially.
I’m testing opt_level = O2, and I do observe that the model’s input and output are converted to half precision internally.
However, after the forward step, the input/output tensors used to calculate the loss are converted back to single precision.
I expected the data tensors and the model weights to stay in half precision all the way through, so that the optimizer works on 16-bit tensors. Is this the correct behavior?

Thank you,

Certain aspects of model training will still be done in FP32.
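To see why weight updates in particular are usually kept in FP32, here is a toy sketch in plain Python. It is not Apex code: `to_fp16`, `to_fp32` and `train` are hypothetical helpers made up for this illustration, and the starting weight and update size are arbitrary assumptions. It only simulates rounding with the standard `struct` module to show that a small update can vanish when added to an FP16 weight, while an FP32 master copy accumulates it.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def to_fp32(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def train(steps: int, update: float, master_in_fp32: bool) -> float:
    """Apply the same small update `steps` times and return the FP16 weight.

    Hypothetical toy loop: `update` stands in for learning_rate * gradient.
    """
    weight = to_fp16(1.0)   # the half-precision weight the model would use
    master = to_fp32(1.0)   # optional single-precision master copy
    for _ in range(steps):
        if master_in_fp32:
            # Accumulate in FP32, then cast a copy down to FP16.
            master = to_fp32(master + update)
            weight = to_fp16(master)
        else:
            # Accumulate directly in FP16; the result is rounded every step.
            weight = to_fp16(weight + update)
    return weight


if __name__ == "__main__":
    print("FP16-only weight:  ", train(1000, 1e-4, master_in_fp32=False))
    print("FP32 master weight:", train(1000, 1e-4, master_in_fp32=True))
```

Near 1.0, neighbouring FP16 values are about 0.001 apart, so an update of 1e-4 is rounded away on every step in the FP16-only loop, whereas the FP32 master copy moves by roughly 0.1 over the 1000 steps. As far as I understand, this is the motivation for keeping an FP32 copy of the weights for the optimizer step under O2.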

You may wish to review this blog:

Quoting from there:

In brief, the methodology is:

  1. Ensuring that weight updates are carried out in FP32.