TAO training on multiple GPUs failed

There has been no update from you for a while, so we are assuming this is no longer an issue and are closing this topic. If you need further support, please open a new one. Thanks.

You can "docker pull" one of the dockers below to narrow down the issue.

Then use the command below to start a shell inside the docker.
$ docker run --runtime=nvidia -it --rm <docker name> /bin/bash
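
Once you have a shell inside the container, it may help to confirm that all GPUs are visible there before rerunning the multi-GPU training. The following is only a minimal sketch, assuming nvidia-smi is available inside the image; the helper name count_gpus is hypothetical and is not part of TAO.

```shell
#!/bin/sh
# Hypothetical helper: count the GPU entries in `nvidia-smi -L` style output
# read from stdin. Each GPU is listed on a line starting with "GPU <index>:".
count_gpus() {
    # grep -c prints the number of matching lines; it exits non-zero when
    # there are no matches, so "|| true" keeps the helper from failing.
    grep -c '^GPU [0-9]' || true
}

# Only query the driver if nvidia-smi is actually present (assumption:
# the image ships it; otherwise just report that it is missing).
if command -v nvidia-smi >/dev/null 2>&1; then
    echo "GPUs visible in this container: $(nvidia-smi -L | count_gpus)"
else
    echo "nvidia-smi not found in this environment" >&2
fi
```

If the count is lower than the number of GPUs you pass to the training command, the problem is likely in the container or driver setup rather than in the training itself.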