I have a very simple question. I recently got access to a multi-node machine and need to run some NCCL tests. The README says:
If CUDA is not installed in /usr/local/cuda, you may specify CUDA_HOME. Similarly, if NCCL is not installed in /usr, you may specify NCCL_HOME.
I can see that CUDA is installed, but how can I tell whether NCCL is installed, and where?
I ran
find /usr -name "libnccl.so*" 2>/dev/null
and it found the library. However, when I ran
find /usr -name "nccl.h" 2>/dev/null
nothing was found. Without the header, I obviously could not build even the simplest test program:
#include <stdio.h>
#include <nccl.h>
int main() {
    printf("NCCL version: %d\n", NCCL_VERSION_CODE);
    return 0;
}
(Btw, I think the OS is CentOS)