Automatic Mixed Precision for NVIDIA Tensor Core Architecture in TensorFlow

Originally published at:

Whether to employ mixed precision to train your TensorFlow models is no longer a tough decision. NVIDIA’s Automatic Mixed Precision (AMP) feature for TensorFlow, announced at GTC 2019, enables mixed-precision training by making all the required model and optimizer adjustments internally, with minimal programmer intervention. Performance increases using automatic mixed precision…
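As a hedged sketch of how AMP is typically switched on in the TF 1.14+ era NGC builds the article describes: the graph rewrite can be enabled with a single environment variable, without touching model code. The script name `train.py` below is only a placeholder for your own training script.

```shell
# Enable NVIDIA's automatic mixed precision graph rewrite
# (available in TF 1.14+ / NGC TensorFlow containers).
export TF_ENABLE_AUTO_MIXED_PRECISION=1
python train.py  # placeholder for your own training script
```

The same rewrite can also be enabled programmatically by wrapping the optimizer with `tf.train.experimental.enable_mixed_precision_graph_rewrite`, which additionally inserts automatic loss scaling.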

I tried it and it failed. I posted in the developer forums; I'm not sure if this is the best place.

If it doesn't work, try making your layer dimensions divisible by 8. In our case that helped a lot.
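The divisible-by-8 advice comes from Tensor Core requirements: fp16 GEMMs hit the fast path when the matrix dimensions are multiples of 8. A small helper like the one below (the function name is my own, not from any library) can round layer sizes up accordingly:

```python
def pad_to_multiple(n, multiple=8):
    """Round n up to the next multiple of `multiple`.

    Useful for picking Tensor Core friendly layer widths,
    e.g. hidden sizes and vocabulary sizes in fp16 training.
    """
    return ((n + multiple - 1) // multiple) * multiple

# Example: a hidden size of 1000 stays 1000, but 1001 becomes 1008.
hidden = pad_to_multiple(1001)
```
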

Hi @jwitsoe

I ran AMP using both Python and C++.
I found that the number of converted nodes differs between the two: Python converts more nodes than C++, and the Python AMP run is correspondingly faster.
Why does this happen?


Hi @nha.tuan84,

Can you provide more details about your use case? What version of TensorFlow are you using, are you using AMP for training or inference? In the C++ case, are you building up the model from scratch in C++, or executing a graph saved from python? If you are training in the C++ API, have you manually implemented loss scaling? And what APIs are you using to enable AMP?
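On the manual loss-scaling point: when training through the C++ API the automatic loss scaling that the Python optimizer wrapper inserts is not applied for you. A toy sketch of the dynamic loss-scaling policy (halve the scale on overflow, grow it after a run of clean steps) is below; the class name and parameter defaults are illustrative, not an NVIDIA API.

```python
class DynamicLossScaler:
    """Toy dynamic loss scaler: multiply the loss by `scale` before
    backprop, divide the gradients by it afterward, and adjust the
    scale based on whether the gradients overflowed to inf/NaN."""

    def __init__(self, init_scale=2.0 ** 15, growth_interval=2000):
        self.scale = init_scale
        self.growth_interval = growth_interval
        self._good_steps = 0

    def update(self, found_overflow):
        if found_overflow:
            # Skip the weight update and back off the scale.
            self.scale /= 2.0
            self._good_steps = 0
        else:
            self._good_steps += 1
            if self._good_steps >= self.growth_interval:
                # A long run of finite gradients: try a larger scale.
                self.scale *= 2.0
                self._good_steps = 0
```

In Python, `enable_mixed_precision_graph_rewrite` applies an equivalent policy automatically, which is one reason a hand-rolled C++ training loop behaves differently out of the box.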



Thanks @nluehr

I am rebuilding TF for Xavier to check this.