I downloaded the LINPACK benchmark software from
https://developer.nvidia.com/rdp/assets/cuda-accelerated-linpack-linux64
and tried to run it on our Tesla P100 machine (3 GPUs on board).
I can successfully run the benchmark software as a single process. But when I use mpirun -np 2 ./run_linpack
it hangs (even with -np 1). I found that one process takes 100% of a CPU and the other is almost idle. I installed Open MPI on Ubuntu 16.04.
Maybe I missed something. I appreciate your suggestions.