Hello Community,
I am trying to use NVSHMEM on a cluster that I work on. I installed it with Spack using the following command:
$ spack install nvshmem +gpu_initiated_support +cuda +gdrcopy +ucx +mpi ^openmpi +cuda fabrics=ucx schedulers=slurm ^ucx +cuda +gdrcopy +dm +thread_multiple
To test whether it works, I am using the example Hello World code. It compiles successfully with:
nvcc -rdc=true -ccbin g++ -gencode=$NVCC_GENCODE -I $NVSHMEM_HOME/include nvshmemHelloWorld.cu -o nvshmemHelloWorld.out -L $NVSHMEM_HOME/lib -lnvshmem -lnvidia-ml -lcuda -lcudart -L $UCX_HOME/lib -lucs -lucp -lmlx5
When I try to run it, I get the following error:
/dvs/p4/build/sw/rel/gpgpu/toolkit/r12.0/main_nvshmem/src/modules/bootstrap/pmi/bootstrap_pmi.cpp:nvshmemi_bootstrap_plugin_init:362: PMI bootstrap version (20800) is not compatible with NVSHMEM version (20700)
I think this means that I should be using a different version of PMI, but changing PMI_HOME had no effect. How do I change where the system looks for PMI? Alternatively, is this not the issue, and am I misunderstanding something?