NVIDIA-provided LINPACK benchmarking software

I downloaded LINPACK benchmark software from

and tried to run it on our Tesla P100 machine (3 GPUs on board).

I can run the benchmark successfully as a single process. But when I launch it with mpirun -np 2 ./run_linpack, it hangs (this happens even with -np 1). One process takes 100% of the CPU and the other is almost idle. I installed Open MPI on Ubuntu 16.04.
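To separate an MPI launch problem from a problem in the benchmark itself, something like the following per-rank check could be started under mpirun in place of ./run_linpack. This is only a minimal sketch: it assumes Open MPI exports OMPI_COMM_WORLD_RANK and OMPI_COMM_WORLD_LOCAL_RANK to each process it starts, and the helper names gpu_for_rank and describe_rank, the file name check_rank.py, and the round-robin mapping over 3 GPUs are illustrative, not taken from the NVIDIA package.

```python
import os

NUM_GPUS = 3  # assumption: the 3 GPUs on this machine


def gpu_for_rank(local_rank, num_gpus=NUM_GPUS):
    """Hypothetical helper: map a node-local MPI rank to a GPU index (round-robin)."""
    if num_gpus <= 0:
        raise ValueError("num_gpus must be positive")
    return local_rank % num_gpus


def describe_rank(env=os.environ):
    """Report which rank this process is and which GPU it would be paired with."""
    # Open MPI sets these variables for every process started by mpirun.
    # Defaulting to "0" lets the script also run without mpirun.
    rank = int(env.get("OMPI_COMM_WORLD_RANK", "0"))
    local_rank = int(env.get("OMPI_COMM_WORLD_LOCAL_RANK", "0"))
    return "rank {} (local {}) -> GPU {}".format(
        rank, local_rank, gpu_for_rank(local_rank)
    )


if __name__ == "__main__":
    print(describe_rank())
```

Saved as check_rank.py (a hypothetical name) and started with mpirun -np 2 python3 check_rank.py, it should print one line per process if the MPI launch itself works. If this also hangs, the problem is more likely in the Open MPI setup than in the benchmark.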

Maybe I missed something. I would appreciate any suggestions.