I am trying to build Linpack (HPL) for Fermi on CentOS Linux.
I followed the installation guide and set up:
1) MPI 1.4.2
2) Intel MKL 10.X
3) CUDA 3.0
4) Tesla Fermi cards
Then I took the Make.CUDA_pinned file and edited it as follows:
# This is just a sample Make.
# The user may need to edit:
#   1.) TOPdir
#   2.) MPI variables (MPdir,MPinc,MPlib)
#   3.) MKL BLAS variables (LAdir, LAinc, LAlib)
#   4.) The Compiler and Compiler/Linker Options (CC,CCFLAGS)
#
# -- High Performance Computing Linpack Benchmark (HPL)
#    HPL - 1.0a - January 20, 2004
#    Antoine P. Petitet
#    University of Tennessee, Knoxville
#    Innovative Computing Laboratories
#    (C) Copyright 2000-2004 All Rights Reserved
#
# -- Copyright notice and Licensing terms:
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
#
# 1. Redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer.
#
# 2. Redistributions in binary form must reproduce the above copyright
# notice, this list of conditions, and the following disclaimer in the
# documentation and/or other materials provided with the distribution.
#
# 3. All advertising materials mentioning features or use of this
# software must display the following acknowledgement:
# This product includes software developed at the University of
# Tennessee, Knoxville, Innovative Computing Laboratories.
#
# 4. The name of the University, the name of the Laboratory, or the
# names of its contributors may not be used to endorse or promote
# products derived from this software without specific written
# permission.
#
# -- Disclaimer:
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
# ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE UNIVERSITY
# OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
# DATA OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
##########
# ----------------------------------------------------------------------
# - shell --------------------------------------------------------------
# ----------------------------------------------------------------------
SHELL = /bin/sh
CD = cd
CP = cp
LN_S = ln -fs
MKDIR = mkdir -p
RM = /bin/rm -f
TOUCH = touch
# ----------------------------------------------------------------------
# - Platform identifier ------------------------------------------------
# ----------------------------------------------------------------------
ARCH = CUDA_pinned
# ----------------------------------------------------------------------
# - HPL Directory Structure / HPL library ------------------------------
# ----------------------------------------------------------------------
# Set TOPdir to the location where this is being built
ifndef TOPdir
#TOPdir = pwd
TOPdir = /home/hpl-2.0_FERMI_v04
endif
INCdir = $(TOPdir)/include
BINdir = $(TOPdir)/bin/$(ARCH)
LIBdir = $(TOPdir)/lib/$(ARCH)
HPLlib = $(LIBdir)/libhpl.a
# ----------------------------------------------------------------------
# - Message Passing library (MPI) --------------------------------------
# ----------------------------------------------------------------------
# MPinc tells the C compiler where to find the Message Passing library
# header files, MPlib is defined to be the name of the library to be
# used. The variable MPdir is only used for defining MPinc and MPlib.
MPdir = /usr/local/openmpi
MPinc = -I$(MPdir)/include
MPlib = $(MPdir)/lib/libvt.mpi.a
#MPlib = $(MPdir)/lib64/libmpich.a
# ----------------------------------------------------------------------
# - Linear Algebra library (BLAS) -----------------------------
# ----------------------------------------------------------------------
# LAinc tells the C compiler where to find the Linear Algebra library
# header files, LAlib is defined to be the name of the library to be
# used. The variable LAdir is only used for defining LAinc and LAlib.
#LAdir = $(TOPdir)/…/…/lib/em64t
LAdir = /opt/intel/mkl/10.2.5.035/lib/em64t
#LAdir = /share/apps/intel/mkl/10.1.0.99/lib/em64t
#LAdir = /share/apps/intel/mkl/10.0.4.023/lib/em64t
#LAdir = /share/apps/intel/mkl/10.2.4.032/libem64t
LAinc = -I /opt/intel/mkl/10.2.5.035/include
# CUDA
#LAlib = -L /home/cuda/Fortran_Cuda_Blas -ldgemm -L/usr/local/cuda/lib -lcublas -L$(LAdir) -lmkl -lguide -lpthread
LAlib = -L /opt/intel/mkl/10.2.5.035/lib/em64t -lmkl
#LAlib = -L$(LAdir) -lmkl -liomp5
#LAlib = -L$(LAdir) -lmkl $(LAdir)/libguide.a -lpthread
# ----------------------------------------------------------------------
# - F77 / C interface --------------------------------------------------
# ----------------------------------------------------------------------
# You can skip this section if and only if you are not planning to use
# a BLAS library featuring a Fortran 77 interface. Otherwise, it is
# necessary to fill out the F2CDEFS variable with the appropriate
# options. One and only one option should be chosen in each of
# the 3 following categories:
#
# 1) name space (How C calls a Fortran 77 routine)
#
# -DAdd_              : all lower case and a suffixed underscore (Suns,
#                       Intel, ...),                          [default]
# -DNoChange          : all lower case (IBM RS6000),
# -DUpCase            : all upper case (Cray),
# -DAdd__             : the FORTRAN compiler in use is f2c.
#
# 2) C and Fortran 77 integer mapping
#
# -DF77_INTEGER=int   : Fortran 77 INTEGER is a C int,        [default]
# -DF77_INTEGER=long  : Fortran 77 INTEGER is a C long,
# -DF77_INTEGER=short : Fortran 77 INTEGER is a C short.
#
# 3) Fortran 77 string handling
#
# -DStringSunStyle    : The string address is passed at the string
#                       location on the stack, and the string length is
#                       then passed as an F77_INTEGER after all explicit
#                       stack arguments,                      [default]
# -DStringStructPtr   : The address of a structure is passed by a
#                       Fortran 77 string, and the structure is of the
#                       form: struct {char *cp; F77_INTEGER len;},
# -DStringStructVal   : A structure is passed by value for each Fortran
#                       77 string, and the structure is of the form:
#                       struct {char *cp; F77_INTEGER len;},
# -DStringCrayStyle   : Special option for Cray machines, which uses
#                       Cray fcd (fortran character descriptor) for
#                       interoperation.
#
F2CDEFS = -DAdd__ -DF77_INTEGER=int -DStringSunStyle
# ----------------------------------------------------------------------
# - HPL includes / libraries / specifics -------------------------------
# ----------------------------------------------------------------------
HPL_INCLUDES = -I$(INCdir) -I$(INCdir)/$(ARCH) $(LAinc) $(MPinc) -I/usr/local/cuda/include
HPL_LIBS = $(HPLlib) $(LAlib) $(MPlib)
#
# - Compile time options -----------------------------------------------
#
# -DHPL_COPY_L           force the copy of the panel L before bcast;
# -DHPL_CALL_CBLAS       call the cblas interface;
# -DHPL_DETAILED_TIMING  enable detailed timers;
# -DASYOUGO              enable timing information as you go (nonintrusive)
# -DASYOUGO2             slightly intrusive timing information
# -DASYOUGO2_DISPLAY     display detailed DGEMM information
# -DENDEARLY             end the problem early
# -DFASTSWAP             insert to use DLASWP instead of HPL code
#
# By default HPL will:
#    *) not copy L before broadcast,
#    *) call the BLAS Fortran 77 interface,
#    *) not display detailed timing information.
#
HPL_OPTS = -DCUDA_PINNED
# ----------------------------------------------------------------------
HPL_DEFS = $(F2CDEFS) $(HPL_OPTS) $(HPL_INCLUDES)
# ----------------------------------------------------------------------
# - Compilers / linkers - Optimization flags ---------------------------
# ----------------------------------------------------------------------
# next two lines for GNU Compilers:
CC = mpicc
CCFLAGS = $(HPL_DEFS) -fomit-frame-pointer -O3 -funroll-loops -W -Wall
# next two lines for Intel Compilers:
CC = mpicc
CCFLAGS = $(HPL_DEFS) -O3 -axS -w -fomit-frame-pointer -funroll-loops
CCNOOPT = $(HPL_DEFS) -O0 -w
# On some platforms, it is necessary to use the Fortran linker to find
# the Fortran internals used in the BLAS library.
LINKER = $(CC)
#LINKFLAGS = $(CCFLAGS) -static_mpi
LINKFLAGS = $(CCFLAGS)
ARCHIVER = ar
ARFLAGS = r
RANLIB = echo
# ----------------------------------------------------------------------
MAKE = make TOPdir=$(TOPdir)
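Since the file hard-codes several locations, a quick sanity check that they exist may be worthwhile before building. This is only a minimal sketch: the helper name check_paths is made up for illustration, and the paths are simply the ones copied from the Makefile above.

```shell
#!/bin/sh
# check_paths (hypothetical helper): report which of the given paths exist.
# Returns non-zero if at least one path is missing.
check_paths() {
    missing=0
    for p in "$@"; do
        if [ -e "$p" ]; then
            echo "ok       $p"
        else
            echo "MISSING  $p"
            missing=1
        fi
    done
    return "$missing"
}

# Paths copied from the Makefile above (TOPdir, MPinc, MPlib, LAdir, LAinc, CUDA).
check_paths \
    /home/hpl-2.0_FERMI_v04 \
    /usr/local/openmpi/include \
    /usr/local/openmpi/lib/libvt.mpi.a \
    /opt/intel/mkl/10.2.5.035/lib/em64t \
    /opt/intel/mkl/10.2.5.035/include \
    /usr/local/cuda/include || echo "some paths are missing"
```

This only confirms that the files and directories are present; it says nothing about whether they are the right libraries to link against.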
Then I compiled with the command "make arch=CUDA_pinned", but the link step fails:
HPL_pdtest.o: In function `HPL_pdtest':
HPL_pdtest.c:(.text+0x117): undefined reference to `assignDeviceToProcess'
HPL_pdtest.c:(.text+0x148): undefined reference to `cudaMallocHost'
HPL_pdtest.c:(.text+0x770): undefined reference to `cudaFreeHost'
/home/hpl-2.0_FERMI_v04/lib/CUDA_pinned/libhpl.a(HPL_pdpanel_init.o): In function `HPL_pdpanel_init':
HPL_pdpanel_init.c:(.text+0x2b9): undefined reference to `cudaMallocHost'
HPL_pdpanel_init.c:(.text+0x38b): undefined reference to `cudaMallocHost'
HPL_pdpanel_init.c:(.text+0x480): undefined reference to `cudaMallocHost'
/home/hpl-2.0_FERMI_v04/lib/CUDA_pinned/libhpl.a(HPL_pdpanel_free.o): In function `HPL_pdpanel_free':
HPL_pdpanel_free.c:(.text+0x28): undefined reference to `cudaFreeHost'
HPL_pdpanel_free.c:(.text+0x36): undefined reference to `cudaFreeHost'
/usr/local/openmpi/lib/libvt.mpi.a(libvt_mpi_a-vt_otf_gen.o): In function `VTGen_flush':
vt_otf_gen.c:(.text+0x49c): undefined reference to `OTF_WStream_writeDefProcess'
vt_otf_gen.c:(.text+0x4f3): undefined reference to `OTF_WStream_writeDefProcessGroup'
vt_otf_gen.c:(.text+0x51e): undefined reference to `OTF_WStream_writeDefinitionComment'
vt_otf_gen.c:(.text+0x556): undefined reference to `OTF_WStream_writeEventComment'
vt_otf_gen.c:(.text+0x591): undefined reference to `OTF_WStream_writeCounter'
vt_otf_gen.c:(.text+0x5e1): undefined reference to `OTF_WStream_writeFileOperation'
vt_otf_gen.c:(.text+0x624): undefined reference to `OTF_WStream_writeCounter'
vt_otf_gen.c:(.text+0x659): undefined reference to `OTF_WStream_writeLeave'
vt_otf_gen.c:(.text+0x68b): undefined reference to `OTF_WStream_writeEnter'
vt_otf_gen.c:(.text+0x6d1): undefined reference to `OTF_WStream_writeCounter'
vt_otf_gen.c:(.text+0x6f6): undefined reference to `OTF_WStream_writeDefProcessGroup'
vt_otf_gen.c:(.text+0x729): undefined reference to `OTF_WStream_writeDefCounter'
vt_otf_gen.c:(.text+0x751): undefined reference to `OTF_WStream_writeDefCounterGroup'
vt_otf_gen.c:(.text+0x798): undefined reference to `OTF_WStream_writeFunctionSummary'
vt_otf_gen.c:(.text+0x7ee): undefined reference to `OTF_WStream_writeCollectiveOperation'
vt_otf_gen.c:(.text+0x824): undefined reference to `OTF_WStream_writeRecvMsg'
vt_otf_gen.c:(.text+0x85a): undefined reference to `OTF_WStream_writeSendMsg'
vt_otf_gen.c:(.text+0x8bc): undefined reference to `OTF_WStream_writeFileOperationSummary'
vt_otf_gen.c:(.text+0x915): undefined reference to `OTF_WStream_writeMessageSummary'
vt_otf_gen.c:(.text+0x92e): undefined reference to `OTF_WStream_writeDefCollectiveOperation'
vt_otf_gen.c:(.text+0x954): undefined reference to `OTF_WStream_writeDefFunction'
vt_otf_gen.c:(.text+0x973): undefined reference to `OTF_WStream_writeDefFunctionGroup'
vt_otf_gen.c:(.text+0x995): undefined reference to `OTF_WStream_writeDefFile'
vt_otf_gen.c:(.text+0x9b4): undefined reference to `OTF_WStream_writeDefFileGroup'
vt_otf_gen.c:(.text+0x9d5): undefined reference to `OTF_WStream_writeDefScl'
vt_otf_gen.c:(.text+0x9eb): undefined reference to `OTF_WStream_writeDefSclFile'
vt_otf_gen.c:(.text+0xa3a): undefined reference to `OTF_WStream_writeOtfVersion'
vt_otf_gen.c:(.text+0xa47): undefined reference to `OTF_WStream_writeDefCreator'
vt_otf_gen.c:(.text+0xa54): undefined reference to `OTF_WStream_writeDefTimerResolution'
/usr/local/openmpi/lib/libvt.mpi.a(libvt_mpi_a-vt_otf_gen.o): In function `VTGen_close':
vt_otf_gen.c:(.text+0x2e00): undefined reference to `OTF_WStream_close'
/usr/local/openmpi/lib/libvt.mpi.a(libvt_mpi_a-vt_otf_gen.o): In function `VTGen_open':
vt_otf_gen.c:(.text+0x2ec8): undefined reference to `OTF_FileManager_open'
vt_otf_gen.c:(.text+0x2ed9): undefined reference to `OTF_WStream_open'
vt_otf_gen.c:(.text+0x303d): undefined reference to `OTF_WStream_setCompression'
/usr/local/openmpi/lib/libvt.mpi.a(libvt_mpi_a-vt_otf_gen.o): In function `VTGen_delete':
vt_otf_gen.c:(.text+0x30e9): undefined reference to `OTF_getFilename'
vt_otf_gen.c:(.text+0x310d): undefined reference to `OTF_getFilename'
vt_otf_gen.c:(.text+0x3132): undefined reference to `OTF_getFilename'
vt_otf_gen.c:(.text+0x31c1): undefined reference to `OTF_FileManager_close'
/usr/local/openmpi/lib/libvt.mpi.a(libvt_mpi_a-vt_otf_gen.o): In function `VTGen_get_statname':
vt_otf_gen.c:(.text+0x269): undefined reference to `OTF_getFilename'
/usr/local/openmpi/lib/libvt.mpi.a(libvt_mpi_a-vt_otf_gen.o): In function `VTGen_get_eventname':
vt_otf_gen.c:(.text+0x289): undefined reference to `OTF_getFilename'
/usr/local/openmpi/lib/libvt.mpi.a(libvt_mpi_a-vt_otf_gen.o): In function `VTGen_get_defname':
vt_otf_gen.c:(.text+0x2a9): undefined reference to `OTF_getFilename'
/opt/intel/mkl/10.2.5.035/lib/em64t/libmkl_intel_thread.so: undefined reference to `__kmpc_ok_to_fork'
/opt/intel/mkl/10.2.5.035/lib/em64t/libmkl_intel_thread.so: undefined reference to `__kmpc_end_single'
/opt/intel/mkl/10.2.5.035/lib/em64t/libmkl_intel_thread.so: undefined reference to `__kmpc_ordered'
/opt/intel/mkl/10.2.5.035/lib/em64t/libmkl_intel_thread.so: undefined reference to `__kmpc_for_static_init_8'
/opt/intel/mkl/10.2.5.035/lib/em64t/libmkl_intel_thread.so: undefined reference to `omp_get_thread_num'
/opt/intel/mkl/10.2.5.035/lib/em64t/libmkl_intel_thread.so: undefined reference to `__kmpc_barrier'
/opt/intel/mkl/10.2.5.035/lib/em64t/libmkl_intel_thread.so: undefined reference to `omp_get_num_threads'
/opt/intel/mkl/10.2.5.035/lib/em64t/libmkl_intel_thread.so: undefined reference to `omp_get_num_procs'
/opt/intel/mkl/10.2.5.035/lib/em64t/libmkl_intel_thread.so: undefined reference to `__kmpc_dispatch_next_4'
/opt/intel/mkl/10.2.5.035/lib/em64t/libmkl_intel_thread.so: undefined reference to `__kmpc_end_reduce_nowait'
/opt/intel/mkl/10.2.5.035/lib/em64t/libmkl_intel_thread.so: undefined reference to `__kmpc_critical'
/opt/intel/mkl/10.2.5.035/lib/em64t/libmkl_intel_thread.so: undefined reference to `__kmpc_dispatch_fini_8'
/opt/intel/mkl/10.2.5.035/lib/em64t/libmkl_intel_thread.so: undefined reference to `__kmpc_serialized_parallel'
/opt/intel/mkl/10.2.5.035/lib/em64t/libmkl_intel_thread.so: undefined reference to `__kmpc_end_critical'
/opt/intel/mkl/10.2.5.035/lib/em64t/libmkl_intel_thread.so: undefined reference to `__kmpc_dispatch_init_8'
/opt/intel/mkl/10.2.5.035/lib/em64t/libmkl_intel_thread.so: undefined reference to `ompc_set_nested'
/opt/intel/mkl/10.2.5.035/lib/em64t/libmkl_intel_thread.so: undefined reference to `omp_get_nested'
/opt/intel/mkl/10.2.5.035/lib/em64t/libmkl_intel_thread.so: undefined reference to `__kmpc_dispatch_fini_4'
/opt/intel/mkl/10.2.5.035/lib/em64t/libmkl_intel_thread.so: undefined reference to `omp_in_parallel'
/opt/intel/mkl/10.2.5.035/lib/em64t/libmkl_intel_thread.so: undefined reference to `__kmpc_push_num_threads'
/opt/intel/mkl/10.2.5.035/lib/em64t/libmkl_intel_thread.so: undefined reference to `__kmpc_reduce_nowait'
/opt/intel/mkl/10.2.5.035/lib/em64t/libmkl_intel_thread.so: undefined reference to `omp_get_max_threads'
/opt/intel/mkl/10.2.5.035/lib/em64t/libmkl_intel_thread.so: undefined reference to `__kmpc_for_static_init_4'
/opt/intel/mkl/10.2.5.035/lib/em64t/libmkl_intel_thread.so: undefined reference to `__kmpc_end_serialized_parallel'
/opt/intel/mkl/10.2.5.035/lib/em64t/libmkl_intel_thread.so: undefined reference to `__kmpc_flush'
/opt/intel/mkl/10.2.5.035/lib/em64t/libmkl_intel_thread.so: undefined reference to `__kmpc_single'
/opt/intel/mkl/10.2.5.035/lib/em64t/libmkl_intel_thread.so: undefined reference to `__kmpc_dispatch_next_8'
/opt/intel/mkl/10.2.5.035/lib/em64t/libmkl_intel_thread.so: undefined reference to `__kmpc_dispatch_init_4'
/opt/intel/mkl/10.2.5.035/lib/em64t/libmkl_intel_thread.so: undefined reference to `__kmpc_global_thread_num'
/opt/intel/mkl/10.2.5.035/lib/em64t/libmkl_intel_thread.so: undefined reference to `__kmpc_end_ordered'
/opt/intel/mkl/10.2.5.035/lib/em64t/libmkl_intel_thread.so: undefined reference to `__kmpc_fork_call'
/opt/intel/mkl/10.2.5.035/lib/em64t/libmkl_intel_thread.so: undefined reference to `__kmpc_atomic_fixed8_add'
/opt/intel/mkl/10.2.5.035/lib/em64t/libmkl_intel_thread.so: undefined reference to `__kmpc_for_static_fini'
collect2: ld returned 1 exit status
make[2]: *** [dexe.grd] Error 1
make[2]: Leaving directory `/home/hpl-2.0_FERMI_v04/testing/ptest/CUDA_pinned'
make[1]: *** [build_tst] Error 2
make[1]: Leaving directory `/home/hpl-2.0_FERMI_v04'
make: *** [build] Error 2
Could you tell me what's wrong with it?
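The log is long but repetitive: the missing symbols seem to fall into a few families (cuda*, OTF_*, and __kmpc_*/omp_*). To make that easier to see, the linker output can be collapsed to a list of unique undefined symbols. This is a minimal sketch, assuming the output of make was saved to a file; the name build.log and the helper list_undefined are made up for illustration.

```shell
#!/bin/sh
# list_undefined (hypothetical helper): print each undefined symbol found in
# a saved linker log, prefixed by how many times it was reported.
list_undefined() {
    # Keep only the "undefined reference to" lines, drop everything up to the
    # symbol name, strip the quote characters around it, then count duplicates.
    grep 'undefined reference to' "$1" \
        | sed -e 's/.*undefined reference to //' -e "s/[\`']//g" \
        | sort | uniq -c | sort -rn
}

# build.log is an assumed file name, e.g. from:
#   make arch=CUDA_pinned > build.log 2>&1
if [ -f build.log ]; then
    list_undefined build.log
fi
```

Each family of symbols in the resulting list should correspond to one library that is missing from, or wrong in, the link line.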