Hello everyone,
I’m struggling to run the CUDA accelerated Linpack benchmark on my university’s cluster. I got the benchmark from here: https://developer.nvidia.com/rdp/assets/cuda-accelerated-linpack-linux64
The build seems to work; I do not get any errors there. But when I try to run the benchmark with
mpirun -np 4 ./run_linpack
I get the following output and error:
================================================================================
HPLinpack 2.0 -- High-Performance Linpack benchmark -- September 10, 2008
Written by A. Petitet and R. Clint Whaley, Innovative Computing Laboratory, UTK
Modified by Piotr Luszczek, Innovative Computing Laboratory, UTK
Modified by Julien Langou, University of Colorado Denver
================================================================================
An explanation of the input/output parameters follows:
T/V : Wall time / encoded variant.
N : The order of the coefficient matrix A.
NB : The partitioning blocking factor.
P : The number of process rows.
Q : The number of process columns.
Time : Time in seconds to solve the linear system.
Gflops : Rate of execution for solving the linear system.
The following parameter values will be used:
N : 25000
NB : 768
PMAP : Row-major process mapping
P : 2
Q : 2
PFACT : Left
NBMIN : 2
NDIV : 2
RFACT : Left
BCAST : 1ring
DEPTH : 1
SWAP : Spread-roll (long)
L1 : no-transposed form
U : no-transposed form
EQUIL : yes
ALIGN : 8 double precision words
--------------------------------------------------------------------------------
- The matrix A is randomly generated for each test.
- The following scaled residual check will be computed:
||Ax-b||_oo / ( eps * ( || x ||_oo * || A ||_oo + || b ||_oo ) * N )
- The relative machine precision (eps) is taken to be 1.110223e-16
- Computational tests pass if scaled residuals are less than 16.0
/cluster/lib/openblas/lib/libopenblas.so.0: undefined symbol: dtrsm
/cluster/lib/openblas/lib/libopenblas.so.0: undefined symbol: dtrsm
-------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code.. Per user-direction, the job has been aborted.
-------------------------------------------------------
There seems to be a problem with the OpenBLAS library (undefined symbol: dtrsm). I have not been able to fix it and have not found any help online so far. I hope someone here has an idea of what I should try next.
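One way to narrow down the "undefined symbol: dtrsm" message is to check which spelling of the routine the library actually exports. OpenBLAS normally exposes the Fortran-style BLAS routines with a trailing underscore (dtrsm_) and the C interface as cblas_dtrsm, so a runtime lookup of the plain name dtrsm may simply not match anything. That the benchmark looks the symbol up by name at run time is an assumption based only on the error text above. The following is a minimal sketch, not a fix; the helper name exported_symbols is hypothetical and only for illustration.

```python
import ctypes


def exported_symbols(lib_path, names):
    """Return {name: bool} saying whether each name resolves in the library.

    lib_path is passed to ctypes.CDLL (None means the running process).
    Hypothetical helper for illustration; it only tests symbol lookup
    and never calls the routines.
    """
    lib = ctypes.CDLL(lib_path)
    # ctypes raises AttributeError when the symbol lookup fails,
    # so hasattr() reports whether the symbol can be resolved.
    return {name: hasattr(lib, name) for name in names}


# Example call, using the library path from the error message above:
# print(exported_symbols("/cluster/lib/openblas/lib/libopenblas.so.0",
#                        ["dtrsm", "dtrsm_", "cblas_dtrsm"]))
```

If dtrsm_ resolves but dtrsm does not, the mismatch is probably in the name the benchmark asks for rather than in the OpenBLAS installation itself.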
If you need more information to assist me, I will gladly provide it.
Kind regards
Lukas