In Azure, I am running the gen 2 version of NVIDIA Image for AI using GPUs on an ND96asr_v4 with 8x A100s. I am training models on this single node only, inside an NGC container. When I run the HPC pre-flight-check container, everything checks out except that the nv_peer_mem version cannot be determined. I assume this is because nv_peer_mem is not installed. Is it supposed to be included in this image by default? Does peer memory help with GPUs within a single node, or is it only useful for multi-node training?