Using MAGMA

As part of a project that uses cudafor, I need to use MAGMA-1.4.1. Using gcc and gfortran I can generate the libmagma.a file. However, if I use pgcc and pgfortran I encounter problems. The first is that pgcc does not accept files ending in .cpp. If I switch to pgcpp, things are even worse.

An obvious question: is PGI name mangling compatible with that of the two GNU compilers?

Another question is, has anyone successfully compiled MAGMA with the PGI compilers? If so, would they mind sharing how they did it?

Thanks.

Malcolm

Malcolm,

I have not tried to build the MAGMA framework with PGI compilers.

I can tell you that PGI C++ is not name-mangling compatible with GNU C++ or any other C++ compiler.

Hope this helps.

Best regards,

+chris

Chris, thanks for the response. What about the PGI C and Fortran compilers: is their name mangling the same as that of gcc and gfortran?

Thanks.

Malcolm

Malcolm,

For C, generally yes. However, objects compiled with PGI C may carry dependencies on the PGI runtime libraries, which will require you to link them into your application when creating an executable.
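To make the C versus C++ distinction concrete, here is a minimal C++ sketch. The function names are hypothetical and not taken from MAGMA. It assumes a Linux/ELF toolchain, where a function with C linkage keeps its plain name in the object file, while a function with C++ linkage gets a compiler-specific mangled name.

```cpp
#include <cassert>

// Hypothetical function names, for illustration only.

// C linkage: on Linux/ELF the object-file symbol is the plain name "scale_c",
// so code built by a different compiler can refer to it by that name.
extern "C" double scale_c(double x, double factor) {
    return x * factor;
}

// C++ linkage: the symbol name encodes the argument types, and the encoding
// is chosen by the compiler vendor, so it may not match across compilers.
double scale_cpp(double x, double factor) {
    return x * factor;
}

// Small self-check: both functions behave identically when called from the
// same translation unit; only their symbol names differ.
void linkage_self_check() {
    assert(scale_c(2.0, 3.0) == 6.0);
    assert(scale_cpp(2.0, 3.0) == 6.0);
}
```

Running nm on the resulting object file should show the difference between the two symbol names. This is why C++ libraries that need to be called from other compilers usually expose an extern "C" interface.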

For F77, the same is also generally true. Consider this example:

cparrott@galaxy ~ $ gfortran -c functest.f
cparrott@galaxy ~ $ nm functest.o
U _gfortran_st_write
U _gfortran_st_write_done
U _gfortran_transfer_character
0000000000000000 T functest

cparrott@galaxy ~ $ pgf77 -c functest.f
cparrott@galaxy ~ $ nm functest.o
000000000000000a d .C1_307
0000000000000000 d .C1_310
0000000000000058 t _functest_END
U fio_ldw
U fio_ldw_end
U fio_ldw_init
U fio_src_info
0000000000000000 T functest



(Note the additional unresolved references to the GNU Fortran and PGI Fortran runtime libraries, respectively, in each of these cases.)
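For calling such a Fortran routine from C or C++, the usual trick is to hide the symbol-naming convention behind a macro. The sketch below is an assumption-laden illustration: FTN_NAME and the routine "twice" are hypothetical, the macro is modelled on the ADD_ convention (lowercase name plus a trailing underscore) that the -DADD_ flag in the make.inc files later in this thread selects, and the "Fortran" routine is a C++ stand-in so that the example builds without a Fortran compiler.

```cpp
#include <cassert>

// Hypothetical macro modelled on the ADD_ / NOCHANGE / UPCASE conventions.
#if defined(UPCASE)
#  define FTN_NAME(lc, UC) UC
#elif defined(NOCHANGE)
#  define FTN_NAME(lc, UC) lc
#else  // ADD_ convention assumed by default: lowercase name plus "_"
#  define FTN_NAME(lc, UC) lc##_
#endif

// Stand-in for a routine that would normally live in a Fortran object file
// (something like "subroutine twice(n)"). Fortran passes arguments by
// reference, so the C++ prototype takes a pointer.
extern "C" void FTN_NAME(twice, TWICE)(int* n) {
    assert(n != nullptr);
    *n *= 2;
}

// C++ caller: it names the routine only through the macro, so the same
// source works whichever convention the Fortran compiler uses.
int call_twice(int value) {
    FTN_NAME(twice, TWICE)(&value);
    return value;
}
```

With the default branch, the symbol defined and called here is twice_. Note that this only covers the symbol name; the runtime-library dependencies shown in the nm output still have to be satisfied at link time.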

With F90 or later, things begin to get tricky when you try to intermix objects. Code using modules isn’t interchangeable. For example:

cparrott@galaxy ~ $ pgf90 -c triangle_operations.f90
cparrott@galaxy ~ $ nm triangle_operations.o
0000000000000000 A …Dm_triangle_operations
0000000000000000 d .C2_290
U __fss_sin_vex
U __mth_i_acos
0000000000000084 t triangle_operations_area_END
U pgf90_compiled
0000000000000000 T triangle_operations

0000000000000010 T triangle_operations_area



cparrott@galaxy ~ $ gfortran -c triangle_operations.f90
cparrott@galaxy ~ $ nm triangle_operations.o
0000000000000000 T __triangle_operations_MOD_area
U acosf
U sinf


So the bottom line is that you can sometimes get away with mixing objects from different compilers, but it’s generally a good idea to stick with one compiler as much as possible.

Hope this helps,

+chris

Chris, thanks for the thoughtful reply.

Malcolm

I seem to have managed to build MAGMA (1.5.0-beta3) with the PGI compilers (14.6). I haven’t tested the library yet (I will ASAP), but the compilation seems to have gone well enough. Use the make.inc.mkl-pgc++ below (or adapt as necessary to use something other than MKL) as make.inc:

-85-$ cat make.inc.mkl-pgc++ 
#//////////////////////////////////////////////////////////////////////////////
#   -- MAGMA (version 1.5.0-beta3) --
#      Univ. of Tennessee, Knoxville
#      Univ. of California, Berkeley
#      Univ. of Colorado, Denver
#      @date July 2014
#//////////////////////////////////////////////////////////////////////////////

# GPU_TARGET contains one or more of Tesla, Fermi, or Kepler,
# to specify for which GPUs you want to compile MAGMA:
#     Tesla  - NVIDIA compute capability 1.x cards
#     Fermi  - NVIDIA compute capability 2.x cards
#     Kepler - NVIDIA compute capability 3.x cards
# The default is all, "Tesla Fermi Kepler".
# See http://developer.nvidia.com/cuda-gpus
#
#GPU_TARGET ?= Tesla Fermi Kepler

CC        = pgc++
NVCC      = nvcc
FORT      = pgf90

ARCH      = ar
ARCHFLAGS = cr
RANLIB    = ranlib

OPTS      = -fast -O3 -DADD_ -mp -DMAGMA_WITH_MKL -DMAGMA_SETAFFINITY
F77OPTS   = -fast -O3 -DADD_
FOPTS     = -fast -O3 -DADD_
NVOPTS    = -O3 -DADD_ -Xcompiler -fno-strict-aliasing
LDOPTS    = -mp

# old MKL
#LIB       = -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_lapack -lmkl_core -lguide -lpthread -lcublas -lcudart -lstdc++ -lm

# see MKL Link Advisor at http://software.intel.com/sites/products/mkl/
# icc with MKL 10.3
LIB       = -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -lpthread -lcublas -lcudart -lstdc++ -lm -liomp5 -pgf90libs

# define library directories preferably in your environment, or here.
# for MKL run, e.g.: source /opt/intel/composerxe/mkl/bin/mklvars.sh intel64
#MKLROOT ?= /opt/intel/composerxe/mkl
#CUDADIR ?= /usr/local/cuda
-include make.check-mkl
-include make.check-cuda

LIBDIR    = -L$(MKLROOT)/lib/intel64 \
            -L$(MKLROOT)/../compiler/lib/intel64 \
            -L$(CUDADIR)/lib64

INC       = -I$(CUDADIR)/include -I$(MKLROOT)/include

Cheers,
Kyle

Hi All,

I’ve run the test cases (in the $MAGMA_ROOT/testing directory) on a K20c and all seems well. Hopefully somebody finds this useful.

Cheers,
Kyle

Hi Kyle,

Thanks so much for the follow-up. I was just testing MAGMA 1.5.0 beta3 here with PGI 14.7 yesterday, and had similar success. I used ACML instead of MKL here, though, as ACML ships with PGI.

Best regards,

+chris

Chris, good to know that compiling MAGMA is possible with 14.7, as I already have it.

Are there any plans to include compiled MAGMA in future releases - just as you include LAPACK and BLAS?

Malcolm

Hi Chris,

Can you post your make.inc file for building MAGMA 1.5.0 beta3 with PGI 14.7 using ACML?

Thanks!

Brandon

Sure, here it is. Note there are issues with MAGMA with PGI 14.7 that are under investigation at the moment.

Best regards,

+chris


#//////////////////////////////////////////////////////////////////////////////
#   -- MAGMA (version 1.5.0-beta3) --
#      Univ. of Tennessee, Knoxville
#      Univ. of California, Berkeley
#      Univ. of Colorado, Denver
#      @date July 2014
#//////////////////////////////////////////////////////////////////////////////

# GPU_TARGET contains one or more of Tesla, Fermi, or Kepler,
# to specify for which GPUs you want to compile MAGMA:
#     Tesla  - NVIDIA compute capability 1.x cards
#     Fermi  - NVIDIA compute capability 2.x cards
#     Kepler - NVIDIA compute capability 3.x cards
# The default is all, "Tesla Fermi Kepler".
# See http://developer.nvidia.com/cuda-gpus
#
#GPU_TARGET ?= Tesla Fermi Kepler

CC        = pgc++
NVCC      = nvcc
FORT      = pgfortran

ARCH      = ar
ARCHFLAGS = cr
RANLIB    = ranlib

OPTS      = -O2 -DADD_ -mp -DMAGMA_SETAFFINITY -DMAGMA_WITH_ACML -DCUBLAS_GFORTRAN
F77OPTS   = -O2 -DADD_
FOPTS     = -O2 -DADD_
NVOPTS    = -O3 -DADD_ -Xcompiler -fno-strict-aliasing
LDOPTS    = -mp

LIB       = -pgf90libs -lacml_mp -lcblas -lcublas -lcudart -lm

# define library directories here or in your environment
#ACMLDIR  ?= /opt/acml
#CBLASDIR ?= /opt/CBLAS
#CUDADIR  ?= /usr/local/cuda
ACMLDIR = (path to ACML)
CBLASDIR = (path to CBLAS)
CUDADIR = (path to CUDA)
-include make.check-acml
-include make.check-cuda

LIBDIR    = -L$(ACMLDIR)/pgi64_mp/lib \
            -L$(CBLASDIR)/lib \
            -L$(CUDADIR)/lib64

INC       = -I$(CUDADIR)/include

Thanks! I tried the same make.inc (setting the path to ACMLDIR, CBLASDIR, and CUDADIR) and when MAGMA tries to build its test cases I get the following error:

pgc++ -O2 -DADD_ -DMAGMA_SETAFFINITY -DMAGMA_WITH_ACML -DCUBLAS_GFORTRAN -DMIN_CUDA_ARCH=300 -I/usr/local/cuda/include -I…/include -I…/control -c testing_dgeev.cpp -o testing_dgeev.o
pgc++ -mp testing_dgeev.o -o testing_dgeev
libtest.a lin/liblapacktest.a -L…/lib -lmagma
-L/opt/pgi/linux86-64/14.7/lib -L/home/shipmanb1/CBLAS/lib -L/usr/local/cuda/lib64
-pgf90libs -lacml_mp -lcblas -lcublas -lcudart -lm
/home/shipmanb1/CBLAS/lib/libcblas.a(dznrm2sub.o): In function `dznrm2sub_':
/home/shipmanb1/CBLAS/src/./dznrm2sub.f:13: undefined reference to `dznrm2_'
make: *** [testing_dgeev] Error 2

If I try to run any of the tests that were built, they return “Segmentation Fault”.

Is this related to the issues with MAGMA with PGI 14.7 that you mention are under investigation? Is there a forum thread discussing those? It sounds like you were able to run the test cases and complete a build with ACML, but I can’t figure out what I’m doing wrong.

Any suggestions for how to build MAGMA with PGI and ACML will be greatly appreciated.

Hi,

First of all, just to ensure I had the latest and greatest ACML, I installed it separately outside of the PGI tree. I used ACML 5.3.1 for this test, and set my ACML directory in the make.inc file accordingly. Not sure if this matters.

Here is the snippet of my build log for the same file:

pgc++ -O2 -DADD_ -mp -DMAGMA_SETAFFINITY -DMAGMA_WITH_ACML -DCUBLAS_GFORTRAN -DMIN_CUDA_ARCH=100 -I/home/sw/cuda/hammer-6.0/linux86-64/include -I../include -I../control -c testing_dgeev.cpp -o testing_dgeev.o


pgc++ -mp  testing_dgeev.o -o testing_dgeev \
        libtest.a lin/liblapacktest.a -L../lib -lmagma \
        -L/scratch/cparrott/acml5.3.1/pgi64_mp/lib -L/scratch/cparrott/CBLAS/lib -L/home/sw/cuda/hammer-6.0/linux86-64/lib64 \
        -pgf90libs -lacml_mp -lcblas -lcublas -lcudart -lm

Anyway, yes, there are some segmentation fault issues under investigation. I have filed a report on at least some of them, but do not have any more details at this time.

One other thing - if you are using a recent version of ACML, you will need to make a small change to the MAGMA source code. This has to do with the fact that recent versions of ACML added another parameter to the acmlversion() function, but the MAGMA source code has not been updated to reflect this. So, you will need to change the interface_cuda/interface.cpp file as follows:

At around line 22:

#if defined(MAGMA_WITH_ACML)
// header conflicts with magma's lapack prototypes, so declare function directly
// #include <acml.h>
extern "C"
void acmlversion(int *major, int *minor, int *patch);
#endif

Change this to:

#if defined(MAGMA_WITH_ACML)
// header conflicts with magma's lapack prototypes, so declare function directly
// #include <acml.h>
extern "C"
void acmlversion(int *major, int *minor, int *patch, int *build);
#endif

Then at around line 117, change this:

#if defined(MAGMA_WITH_ACML)
    int acml_major, acml_minor, acml_patch;
    acmlversion( &acml_major, &acml_minor, &acml_patch );
    printf( "ACML %d.%d.%d. ", acml_major, acml_minor, acml_patch );
#endif

To this:

#if defined(MAGMA_WITH_ACML)
    int acml_major, acml_minor, acml_patch, acml_build;
    acmlversion( &acml_major, &acml_minor, &acml_patch, &acml_build );
    printf( "ACML %d.%d.%d.%d. ", acml_major, acml_minor, acml_patch, acml_build );
#endif

Without this change, a recent version of ACML will very likely segfault here: acmlversion() writes through its new fourth pointer argument, which the old three-argument call never supplies.
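The failure mode can be sketched without ACML. In the block below, fake_version is a hypothetical stand-in for a version query with four output parameters (it is not the real acmlversion(), and the version numbers it returns are arbitrary). The point is only that the callee writes through every pointer it is declared to receive, so the caller has to pass a valid address for each one.

```cpp
#include <cassert>

// Hypothetical stand-in for a four-output version query; the real
// acmlversion() lives in the ACML library. Values are arbitrary.
extern "C" void fake_version(int* ver_major, int* ver_minor,
                             int* ver_patch, int* ver_build) {
    assert(ver_major && ver_minor && ver_patch && ver_build);
    *ver_major = 5;
    *ver_minor = 3;
    *ver_patch = 1;
    *ver_build = 0;  // written unconditionally: the caller must supply this address
}

struct Version {
    int ver_major;
    int ver_minor;
    int ver_patch;
    int ver_build;
};

// Correct caller: declares four outputs and passes four addresses.
// A caller compiled against a three-argument prototype would pass nothing in
// the fourth argument slot, and the callee would write through whatever
// value happened to be there.
Version query_version() {
    Version v{0, 0, 0, 0};
    fake_version(&v.ver_major, &v.ver_minor, &v.ver_patch, &v.ver_build);
    return v;
}
```

Because the mismatch is between a hand-written extern "C" declaration and the library, neither the compiler nor the linker can catch it; C linkage carries no argument information in the symbol name.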

However, this does not get rid of all the segfaults. Some of the tests will run after this change, though.

Hope this helps.

Best regards,

+chris

Thanks, Chris. Your post was very helpful and I was able to get MAGMA with ACML (and PGI) to build. Unfortunately, as you point out, not all of the tests run successfully. For instance, the main one I was interested in (testing_zgetrf_f) fails with this error message:

[shipmanb1@beehive testing]$ ./testing_zgetrf_f 
Error in magma_getdevice_arch: MAGMA not initialized (call magma_init() first) or bad device

I followed Kyle’s post to build MAGMA with MKL (and PGI) and had the same error message occur.

Since testing_zgesv did work (a C++ call to a similar MAGMA routine) I tried to use my own C++ code as a wrapper to call some CUDA-Fortran. Unfortunately that also didn’t work, but since I was able to reduce that to a smaller example that appears to be independent of MAGMA I have made a separate Forum posting about it here:

If anyone has more updates in fixing how PGI (14.7) builds MAGMA (1.5.0-beta3), please post the solution here as I’ll continue to monitor this thread.

Thanks again for your help.

Hi Brandon,

It’s just complaining that MAGMA hasn’t been initialized. This would “break” with all compilers (not just PGI). The following modifications (marked !!!NEW LINE!!!) to /MY_PATH_TO/magma-1.5.0-beta3/testing/testing_zgetrf_f.f90 get rid of the error:

      real(kind=8)                  :: flops, t, tstart, tend

      PARAMETER          ( nrhs = 1, zone = 1., mzone = -1. )

      call cublas_init()
      call magmaf_init()!!!NEW LINE!!!

      n   = 2048
      lda = n

!------ Allocate CPU memory

AND

!---- Free CPU memory
      deallocate(A, A2, B, X, ipiv, work)

!---- Free GPU memory
      call magmaf_finalize()!!!NEW LINE!!!
      call cublas_shutdown()

 105  format((a35,es10.3))

      end
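For anyone calling MAGMA from C++ rather than Fortran, the same init/finalize pairing can be enforced with a small scope guard. This is only a sketch under stated assumptions: ScopedLibrary, fake_init and fake_finalize are hypothetical names, and the two fake functions stand in for the library's own initialize and finalize routines so that the example builds without MAGMA installed.

```cpp
#include <cassert>

// Generic scope guard: calls init() on construction and finalize() on
// destruction, so the two calls cannot get out of step.
template <typename Init, typename Finalize>
class ScopedLibrary {
public:
    ScopedLibrary(Init init, Finalize finalize) : finalize_(finalize) { init(); }
    ~ScopedLibrary() { finalize_(); }

    // Non-copyable: a copy would run finalize() twice.
    ScopedLibrary(const ScopedLibrary&) = delete;
    ScopedLibrary& operator=(const ScopedLibrary&) = delete;

private:
    Finalize finalize_;
};

// Hypothetical stand-ins for the library's init/finalize routines.
// g_live counts how many initializations are currently active.
static int g_live = 0;
inline void fake_init() { ++g_live; }
inline void fake_finalize() { --g_live; }

// Returns the number of active initializations seen inside the guarded
// scope; the guard's destructor runs when the function returns.
int run_with_library() {
    ScopedLibrary<void (*)(), void (*)()> guard(fake_init, fake_finalize);
    return g_live;
}
```

In real code one would pass callables that wrap the actual initialize and finalize calls, and all library work would happen while the guard is in scope.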

Cheers,
Kyle

Thanks, Kyle! That fix worked for me to enable testing_zgetrf_f to build and run.

Thanks, Kyle. I will make similar changes in my MAGMA test tree here at PGI, too. It looks like most of the C/C++ tests in MAGMA already do this via macros in the testings.h header file.

My suspicion is that MAGMA, still being a beta code, has a few bugs itself like this. That makes it a bit more of a challenge to sort out what is a compiler bug versus what is a bug in the code itself. Having said that, I am aware of at least one issue with testing_zunmqr_gpu.cpp with PGI, and this issue may show up in other tests as well. Our developer is looking at it right now, and hopefully things will run better with PGI once this issue is fixed.

Best regards,

+chris

Hi Chris. Any idea when we will see a “good” combo of PGI Fortran and MAGMA?

Thanks.

Malcolm

Malcolm,

Well, we only control half of that equation here at PGI currently… :-)

I assure you it is on our radar, and we hope to have MAGMA fully working in the near future.

Best regards,

+chris

Chris, good to know that the issues with MAGMA and PGI will be resolved soon. If the release is pre-packaged like LAPACK and BLAS, that will be great. Otherwise, please make a Makefile available at the same time!

Thanks

Malcolm