Hello,
I’m trying to use Apex automatic mixed precision (amp) on an ensemble of 2 models connected in series.
I’m testing opt_level = O2, and I do observe that the input and output of the model are internally converted to half precision.
However, after the forward step, the input/output tensors used to calculate the loss are converted back to single precision.
I expected the data tensors and the weights of the model to stay in half precision all the way, so that the optimizer works on 16-bit tensors. Is the behavior I observe the correct one?
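For reference, the dtypes in question can be inspected with a short check like the one below. This is only a minimal sketch: the helper name `dtype_summary` and the two small `nn.Linear` layers are hypothetical stand-ins for the real ensemble, and plain `.half()` is used only to produce FP16 parameters to inspect. It is not equivalent to what amp does, and it runs without Apex or a GPU.

```python
from collections import Counter

import torch
import torch.nn as nn


def dtype_summary(module):
    """Count the parameters of `module` by dtype (hypothetical helper)."""
    return Counter(p.dtype for p in module.parameters())


# Two small models connected in series, standing in for the real ensemble.
model_a = nn.Linear(8, 4)
model_b = nn.Linear(4, 2)
ensemble = nn.Sequential(model_a, model_b)

# Freshly created parameters use the default dtype (float32).
fp32_summary = dtype_summary(ensemble)

# Cast the parameters in place to float16. This is NOT amp; it only gives
# us half-precision weights so the check has something to report.
ensemble.half()
fp16_summary = dtype_summary(ensemble)

print("before cast:", fp32_summary)
print("after cast: ", fp16_summary)
```

The same check can be applied to whatever model and optimizer the mixed-precision setup returns, and `tensor.dtype` can be printed for the tensors passed to the loss, to see exactly where the precision changes.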
Thank you,
Alex.