This is a known intermittent issue with faster_rcnn.
For faster_rcnn, please use the 22.05-tf1.15.4 docker instead of the 22.05-tf1.15.5 docker.
Please open a terminal and run the following command:
$ docker run --runtime=nvidia -it --rm --entrypoint "" nvcr.io/nvidia/tao/tao-toolkit-tf:v3.22.05-tf1.15.4-py3 /bin/bash
Then, inside the docker, run the training command. NOTE: you do not need the “tao” prefix inside the docker.
# faster_rcnn train xxx
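If you switch between image versions often, it may help to keep the tag in one place so the tf1.15.5 image is not started by mistake. The following is only a sketch: `TAO_TAG` and `tao_shell_cmd` are hypothetical names chosen for illustration and are not part of TAO. It only prints the docker command shown earlier, with the tag filled in, and does not run it.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of TAO): prints the docker command for a given
# tao-toolkit-tf image tag so the tag is pinned in a single place.
# TAO_TAG is an illustrative variable name; it defaults to the tf1.15.4 tag.
TAO_TAG="${TAO_TAG:-v3.22.05-tf1.15.4-py3}"

tao_shell_cmd() {
  # Same flags as the manual command; only the image tag is parameterized.
  printf 'docker run --runtime=nvidia -it --rm --entrypoint "" nvcr.io/nvidia/tao/tao-toolkit-tf:%s /bin/bash\n' "$1"
}

# Print the command for review; copy it into the terminal to start the docker.
tao_shell_cmd "$TAO_TAG"
```

Printing instead of executing is a deliberate choice here, so you can check the tag before starting the docker.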
Similar topic: TAO crash after driver update - #3 by dbrazey