Nvlink error CUDA 11.8, GNU 10.4.0

Hello,

I am getting an nvlink error in which a function appears to have “different sizes” in two object files:

make -f coreComponents/CMakeFiles/geosx_core.dir/build.make coreComponents/CMakeFiles/geosx_core.dir/build
make[2]: Entering directory `/dev/shm/mtml/src/GEOS/GEOS/build-GPU-Hypre-GCC-CUDA_11.8-ompi_hpcx-OMP-relwithdebinfo'
[ 56%] Linking CUDA device code CMakeFiles/geosx_core.dir/cmake_device_link.o
cd /dev/shm/mtml/src/GEOS/GEOS/build-GPU-Hypre-GCC-CUDA_11.8-ompi_hpcx-OMP-relwithdebinfo/coreComponents && /data/saet/mtml/software/x86_64/cmake-3.24.1-linux-x86_64/bin/cmake -E cmake_link_script CMakeFiles/geosx_core.dir/dlink.txt --verbose=1
/usr/local/cuda-11.8/bin/nvcc -forward-unknown-to-host-compiler -ccbin=/data/saet/mtml/software/x86_64/RHEL7/hpcx-v2.14-gcc-MLNX_OFED_LINUX-5-redhat7-cuda11-gdrcopy2-nccl2.16-x86_64/ompi/bin/mpic++ -restrict -arch sm_80 --expt-extended-lambda --expt-relaxed-constexpr -Werror cross-execution-space-call,reorder,deprecated-declarations -g -lineinfo -restrict -arch sm_80 --expt-extended-lambda --expt-relaxed-constexpr -Werror cross-execution-space-call,reorder,deprecated-declarations -O3 -DNDEBUG -Xcompiler -DNDEBUG -Xcompiler -Ofast --generate-code=arch=compute_80,code=[compute_80,sm_80] -Xcompiler=-fopenmp -Xcompiler=-L/usr/local/cuda-11.8/lib64 -Xlinker=-rpath -Xlinker=/data/saet/mtml/software/x86_64/RHEL7/hpcx-v2.14-gcc-MLNX_OFED_LINUX-5-redhat7-cuda11-gdrcopy2-nccl2.16-x86_64/ompi/lib -Xlinker=--enable-new-dtags -Xcompiler=-pthread -Xcompiler=-fPIC -Wno-deprecated-gpu-targets -shared -dlink CMakeFiles/geosx_core.dir/__/cmake/blt/tests/internal/src/combine_static_library_test/dummy.cpp.o -o CMakeFiles/geosx_core.dir/cmake_device_link.o -L/usr/local/cuda-11.8/targets/x86_64-linux/lib/stubs -L/usr/local/cuda-11.8/targets/x86_64-linux/lib ../lib/libcommon.a ../lib/libcodingUtilities.a ../lib/libdataRepository.a ../lib/libschema.a ../lib/libfunctions.a ../lib/libconstitutive.a ../lib/libmesh.a ../lib/libdenseLinearAlgebra.a ../lib/liblinearAlgebra.a ../lib/libfieldSpecification.a ../lib/libfiniteElement.a ../lib/libfiniteVolume.a ../lib/libdiscretizationMethods.a ../lib/libfileIO.a ../lib/libphysicsSolvers.a ../lib/libevents.a
../lib/libmainInterface.a /data/saet/mtml/software/x86_64/RHEL7/GEOSTPL/0.2.0/install-GPU-Hypre-GCC-CUDA_11.8-ompi_hpcx-OMP-relwithdebinfo/pugixml/lib64/libpugixml.a /data/saet/mtml/software/x86_64/RHEL7/GEOSTPL/0.2.0/install-GPU-Hypre-GCC-CUDA_11.8-ompi_hpcx-OMP-relwithdebinfo/mathpresso/lib/libmathpresso.a /data/saet/mtml/software/x86_64/RHEL7/GEOSTPL/0.2.0/install-GPU-Hypre-GCC-CUDA_11.8-ompi_hpcx-OMP-relwithdebinfo/parmetis/lib/libparmetis.a /data/saet/mtml/software/x86_64/RHEL7/GEOSTPL/0.2.0/install-GPU-Hypre-GCC-CUDA_11.8-ompi_hpcx-OMP-relwithdebinfo/metis/lib/libmetis.a /data/saet/mtml/software/x86_64/RHEL7/GEOSTPL/0.2.0/install-GPU-Hypre-GCC-CUDA_11.8-ompi_hpcx-OMP-relwithdebinfo/scotch/lib/libptscotch.a /data/saet/mtml/software/x86_64/RHEL7/GEOSTPL/0.2.0/install-GPU-Hypre-GCC-CUDA_11.8-ompi_hpcx-OMP-relwithdebinfo/scotch/lib/libptscotcherr.a /data/saet/mtml/software/x86_64/RHEL7/GEOSTPL/0.2.0/install-GPU-Hypre-GCC-CUDA_11.8-ompi_hpcx-OMP-relwithdebinfo/scotch/lib/libscotch.a /data/saet/mtml/software/x86_64/RHEL7/GEOSTPL/0.2.0/install-GPU-Hypre-GCC-CUDA_11.8-ompi_hpcx-OMP-relwithdebinfo/scotch/lib/libscotcherr.a /data/saet/mtml/software/x86_64/RHEL7/GEOSTPL/0.2.0/install-GPU-Hypre-GCC-CUDA_11.8-ompi_hpcx-OMP-relwithdebinfo/hypre/lib/libHYPRE.a /data/saet/mtml/software/x86_64/RHEL7/GEOSTPL/0.2.0/install-GPU-Hypre-GCC-CUDA_11.8-ompi_hpcx-OMP-relwithdebinfo/silo/lib/libsiloh5.a ../lib/libPVTPackage.a ../lib/libhdf5_interface.a ../lib/liblvarray.a /data/saet/mtml/software/x86_64/RHEL7/GEOSTPL/0.2.0/install-GPU-Hypre-GCC-CUDA_11.8-ompi_hpcx-OMP-relwithdebinfo/chai/lib/libchai.a /data/saet/mtml/software/x86_64/RHEL7/GEOSTPL/0.2.0/install-GPU-Hypre-GCC-CUDA_11.8-ompi_hpcx-OMP-relwithdebinfo/raja/lib/libRAJA.a /data/saet/mtml/software/x86_64/RHEL7/GEOSTPL/0.2.0/install-GPU-Hypre-GCC-CUDA_11.8-ompi_hpcx-OMP-relwithdebinfo/chai/lib/libumpire.a 
/data/saet/mtml/software/x86_64/RHEL7/GEOSTPL/0.2.0/install-GPU-Hypre-GCC-CUDA_11.8-ompi_hpcx-OMP-relwithdebinfo/raja/lib/libcamp.a /data/saet/mtml/software/x86_64/RHEL7/GEOSTPL/0.2.0/install-GPU-Hypre-GCC-CUDA_11.8-ompi_hpcx-OMP-relwithdebinfo/conduit/lib/libconduit_relay.a -lrt -lz -ldl -lm /data/saet/mtml/software/x86_64/RHEL7/GEOSTPL/0.2.0/install-GPU-Hypre-GCC-CUDA_11.8-ompi_hpcx-OMP-relwithdebinfo/conduit/lib/libconduit_blueprint.a /data/saet/mtml/software/x86_64/RHEL7/GEOSTPL/0.2.0/install-GPU-Hypre-GCC-CUDA_11.8-ompi_hpcx-OMP-relwithdebinfo/conduit/lib/libconduit.a /data/saet/mtml/software/x86_64/RHEL7/GEOSTPL/0.2.0/install-GPU-Hypre-GCC-CUDA_11.8-ompi_hpcx-OMP-relwithdebinfo/fmt/lib64/libfmt.a /data/saet/mtml/software/x86_64/RHEL7/GEOSTPL/0.2.0/install-GPU-Hypre-GCC-CUDA_11.8-ompi_hpcx-OMP-relwithdebinfo/adiak/lib/libadiak.a /usr/local/cuda-11.8/lib64/libcudart_static.a -lpthread -lcudadevrt -lcudart_static -lmpi
nvlink error : Size doesn't match for '_ZN4geos13finiteElement18ImplicitKernelBaseINS_20CellElementSubRegionENS_12constitutive11PorousSolidINS3_16ElasticIsotropicEEENS0_38H1_Hexahedron_Lagrange1_GaussLegendre2ELi3ELi3EE14StackVariablesC1Ev$582' in '../lib/libphysicsSolvers.a:PoromechanicsEFEMKernels_CellElementSubRegion_PorousSolid-ElasticIsotropic-_H1_Hexahedron_Lagrange1_GaussLegendre2.cpp.o', first specified in '../lib/libphysicsSolvers.a:SolidMechanicsFixedStressThermoPoroElasticKernels_CellElementSubRegion_PorousSolid-ElasticIsotropic-_H1_Hexahedron_Lagrange1_GaussLegendre2.cpp.o' (target: sm_80)
nvlink fatal : merge_elf failed (target: sm_80)
make[2]: *** [coreComponents/CMakeFiles/geosx_core.dir/cmake_device_link.o] Error 1
make[2]: Target `coreComponents/CMakeFiles/geosx_core.dir/build' not remade because of errors.
make[2]: Leaving directory `/dev/shm/mtml/src/GEOS/GEOS/build-GPU-Hypre-GCC-CUDA_11.8-ompi_hpcx-OMP-relwithdebinfo'
make[1]: *** [coreComponents/CMakeFiles/geosx_core.dir/all] Error 2

Any suggestions?

Thanks!
Michael

Hi Michael,

I’ve seen this before with OpenACC when one object file changes but another object that references it isn’t recompiled, so it still carries an older definition. I’d first try a clean rebuild to see if that fixes the issue.

Note that I support the NVHPC compilers (nvc, nvc++, nvfortran), so I’m not as well versed in nvcc issues. If the rebuild doesn’t solve the issue, we can move your post over to the nvcc forum to see if they have other suggestions.

-Mat


Hi, Mat,

An inconsistency between the definition(s) and the use was my suspicion too. I need to demangle these very long identifier names, as I cannot even find the source by searching on GitHub.

Any hints as to how to demangle or how to recover the source file name out of the mangled identifier?

Thanks!
Michael

Any hints as to how to demangle or how to recover the source file name out of the mangled identifier?

Try using “c++filt”, which demangles C++ symbol names. Sometimes extra characters get appended to the symbol, such as the “$582” here; remove that suffix before passing the name to c++filt:

% c++filt
_ZN4geos13finiteElement18ImplicitKernelBaseINS_20CellElementSubRegionENS_12constitutive11PorousSolidINS3_16ElasticIsotropicEEENS0_38H1_Hexahedron_Lagrange1_GaussLegendre2ELi3ELi3EE14StackVariablesC1Ev
geos::finiteElement::ImplicitKernelBase<geos::CellElementSubRegion, geos::constitutive::PorousSolid<geos::constitutive::ElasticIsotropic>, geos::finiteElement::H1_Hexahedron_Lagrange1_GaussLegendre2, 3, 3>::StackVariables::StackVariables()
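
For longer logs, the same steps can be scripted. The following is a minimal Python sketch, not an official tool: it assumes the nvlink message has the exact wording shown in the log above, that the extra suffix is always "$" followed by digits, and that c++filt is on the PATH. The function names (strip_nvlink_suffix, demangle, parse_size_mismatch) are made up for this illustration. The two archive members named in the message are the object files that disagree, and their names point to the source files to compare.

```python
import re
import shutil
import subprocess

# Assumed message format, copied from the nvlink error quoted in this thread.
NVLINK_SIZE_ERROR = re.compile(
    r"nvlink error\s*:\s*Size doesn't match for '(?P<symbol>[^']+)' "
    r"in '(?P<second>[^']+)', first specified in '(?P<first>[^']+)'"
)


def strip_nvlink_suffix(symbol):
    """Drop a trailing '$<digits>' (e.g. '$582'), which is not part of the mangled name."""
    return re.sub(r"\$\d+$", "", symbol)


def demangle(symbol):
    """Demangle via c++filt when available; otherwise return the cleaned symbol unchanged."""
    cleaned = strip_nvlink_suffix(symbol)
    if shutil.which("c++filt") is None:
        return cleaned
    result = subprocess.run(
        ["c++filt", cleaned], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


def parse_size_mismatch(log_text):
    """Return the mangled symbol and the two conflicting object files, or None if absent."""
    match = NVLINK_SIZE_ERROR.search(log_text)
    if match is None:
        return None
    return {
        "symbol": match.group("symbol"),
        # Order: the object where the symbol was first seen, then the conflicting one.
        "object_files": [match.group("first"), match.group("second")],
    }
```

With the build output saved to a file, calling parse_size_mismatch on its contents and then demangle on the returned "symbol" gives the readable name plus the two object files to rebuild or inspect.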

Thanks Mat!