Just to narrow this down: you mentioned that multi-GPU training was working fine earlier with your dataset but no longer behaves the same, and last week there was a blocking issue (see Chmod: cannot access '/opt/ngccli/ngc': No such file or directory - #2 by Morganh). I am not sure whether that issue could be causing your current NaN loss.
So, if possible, please run the default Jupyter notebook (which trains against the KITTI dataset) again to check whether it still works.
If it works, the NaN loss is not related to the above-mentioned issue, and the next step is to look more closely at your training dataset or parameters.
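If it does come down to the dataset, one quick thing to rule out is malformed bounding boxes, which are a common source of NaN loss in detection training. Below is a minimal sketch, assuming your labels are KITTI-format text files (one object per line, with the box as left, top, right, bottom in fields 5 to 8). The function name `find_bad_kitti_boxes` and the label directory are hypothetical, not part of any toolkit, so adjust it if your labels use a different format.

```python
import math
from pathlib import Path


def find_bad_kitti_boxes(label_dir):
    """Return (file name, line number, reason) for each suspicious label line.

    Assumes KITTI-style labels: class, truncated, occluded, alpha,
    then left, top, right, bottom (fields at index 4..7).
    """
    problems = []
    for path in sorted(Path(label_dir).glob("*.txt")):
        lines = path.read_text().splitlines()
        for lineno, line in enumerate(lines, start=1):
            fields = line.split()
            if not fields:
                continue  # skip blank lines
            if len(fields) < 8:
                problems.append((path.name, lineno, "too few fields"))
                continue
            try:
                left, top, right, bottom = (float(v) for v in fields[4:8])
            except ValueError:
                problems.append((path.name, lineno, "non-numeric bbox"))
                continue
            if not all(math.isfinite(v) for v in (left, top, right, bottom)):
                problems.append((path.name, lineno, "non-finite bbox"))
            elif right <= left or bottom <= top:
                problems.append((path.name, lineno, "zero or negative box size"))
    return problems
```

Calling `find_bad_kitti_boxes("/path/to/your/labels")` returns an empty list if nothing suspicious was found. Any entries it does return are worth fixing or removing before retraining.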