dynamic parallelism: undefined reference to `__fatbinwrap_38_cuda_device_runtime_compute_70_cpp1_ii_8b1a5d37'

Hi all,

My CUDA code doesn’t compile.
The Makefile looks like this:

CC=mpic++

# C++ FLAG COMPILER
CFLAGS= -g -std=c++11 -O2 -DENABLE_CUDA

NVCC=nvcc
CUDAFLAGS= -arch=sm_61 -std=c++11 -rdc=true -c

# LIBS
LIBS= -lm -lc
CUDALIBS= -L/usr/local/cuda/lib64/ -lcudadevrt -lcudart
INCDIRS= -I/usr/local/cuda/include/

#... definition of SRC and CUDA_SRC

# LIST OF OBJECTS TO COMPILE
OBJ        =   $(patsubst %.cpp, %.o, $(SRC))
CUDA_OBJ        =   $(patsubst %.cu, %.o, $(CUDA_SRC))

# EXECUTABLE
prog_modeling.x : $(OBJ) $(CUDA_OBJ) dlink.o prog_modeling.o
	$(CC) $(CFLAGS) $(INC) $(LIBS) -o $@ $^

prog_modeling.o : prog_modeling.cpp
	$(CC) $(CFLAGS) $(INC) -c $< -o $@

dlink.o : $(CUDA_OBJ)
	nvcc -arch=sm_61 -dlink -o $@ $^ $(CUDALIBS)

# OBJECTS
%.o : %.cpp
	$(CC) $(CFLAGS) $(INC) -c $< -o $@

%.o : %.cu
	$(NVCC) $(CUDAFLAGS) $(INC) $< -o $@

default: prog_modeling.x

But at the end I receive:

dlink.o: In function `__cudaRegisterLinkedBinary_38_cuda_device_runtime_compute_70_cpp1_ii_8b1a5d37':
link.stub:(.text+0x280): undefined reference to `__fatbinwrap_38_cuda_device_runtime_compute_70_cpp1_ii_8b1a5d37'


I think that it means that I didn’t link against cudadevrt, but I did?!
Do you see the problem?

Thank you!

I don’t see CUDALIBS or LIBS defined anywhere in what you have shown. But my guess is you have not provided the proper libraries at the final link phase (which you are doing with CC).

Anyway, there’s not enough information here to sort this out. The full makefile, plus the actual commands issued, at a minimum, would be needed to make a definitive statement.

The makefile produces the following output:

mpic++ -g -std=c++11 -O2 -DENABLE_CUDA -I./src -I./src/io -c src/file_c1.cpp -o src/file_c1.o
mpic++ -g -std=c++11 -O2 -DENABLE_CUDA -I./src -I./src/io -c src/file_c2.cpp -o src/file_c2.o

nvcc -arch=sm_61 -std=c++11 -I./src -I./src/io  -rdc=true -c src/file1.cu -o src/file1.o
nvcc -arch=sm_61 -std=c++11 -I./src -I./src/io  -rdc=true -c src/file2.cu -o src/file2.o

nvcc -arch=sm_61 -std=c++11 -dlink -o dlink.o src/file1.o src/file2.o


mpic++ -g -std=c++11 -O2 -DENABLE_CUDA -I./src -I./src/io -c src/prog_modeling.cpp -o src/prog_modeling.o

mpic++ -g -std=c++11 -O2 -DENABLE_CUDA -I./src -I./src/io  -lm -lc -L/usr/local/cuda/lib64/ -lcudadevrt -lcudart -o prog_modeling.x src/file_c1.o src/file_c2.0 src/file1.o src/file2.o dlink.o prog_modeling.o

First of all, I don’t really believe that the makefile you have shown produces that output.

The final link phase command you show:

mpic++ -g -std=c++11 -O2 -DENABLE_CUDA -I./src -I./src/io  -lm -lc -L/usr/local/cuda/lib64/ -lcudadevrt -lcudart -o prog_modeling.x src/file_c1.o src/file_c2.0 src/file1.o src/file2.o dlink.o prog_modeling.o

is created from this makefile content:

prog_modeling.x : $(OBJ) $(CUDA_OBJ) dlink.o prog_modeling.o
	$(CC) $(CFLAGS) $(INC) $(LIBS) -o $@ $^

and that makefile content doesn’t include CUDALIBS, yet the CUDALIBS is showing up in your actual command.

Anyway, in your final link phase move these switches:

-L/usr/local/cuda/lib64/ -lcudadevrt -lcudart

to the very end of that command line.
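The order matters because GNU ld resolves symbols left to right: a static library such as libcudadevrt.a is only searched for symbols that objects earlier on the command line have left undefined. A minimal sketch of the link rule with the libraries moved after the objects, reusing the variable names from the makefile above (putting CUDALIBS in this rule is an assumption based on the command output, since the posted rule does not contain it):

```makefile
# Objects ($^) come first and libraries last, so the reference from dlink.o
# into libcudadevrt is already pending when the linker reaches -lcudadevrt.
prog_modeling.x : $(OBJ) $(CUDA_OBJ) dlink.o prog_modeling.o
	$(CC) $(CFLAGS) $(INC) -o $@ $^ $(LIBS) $(CUDALIBS)
```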

Yes, sorry. I was playing around with the Makefile, so I added $(CUDALIBS) and removed it again (because it doesn’t work in that position).

However, it works if I put it at the end of the line.
Thank you very much!

UPDATE: this problem was solved by changing the CUDA version from 11.1 to 10.2.
I ran into the same problem on a new machine. My Makefile works fine on both Ubuntu 16.04 and 18.04, but on a new machine with a fresh Ubuntu 18.04 install I get the same undefined reference error. Could you help me out?
Thanks
Ze
Below is my Makefile:

CC=g++
CFLAGS=-O3 -fopenmp -DVM61 #-VM70 and VM75 will have 96Kb and 64Kb shared memory
#CFLAGS= -g -Wall -pedantic # debugging C-Compiler flags
INCDIRS=-I/usr/local/cuda/include
LD=g++ # used linker
LFLAGS=-O3 -lstdc++ -std=c++11 -fopenmp -lpthread # linker flags
LIBS=-lm -lz -L/usr/local/cuda/lib64 # used libraries

gpumsubengap_dp: sampen.o nsampen.o math.o nifti_io.o znzlib.o fslio.o
	$(LD) $(LFLAGS) -L/usr/local/cuda/lib64 -lcuda sampen.o nsampen.o math.o nifti_io.o znzlib.o fslio.o -o gpumsubengap_dp $(LIBS) -lcudadevrt -lcudart

sampen.o: mssampen_dp.cu entropy.h nifti1_io.h znzlib.h fslio.h dbh.h nifti1.h mathfunc.h
#$(CC) -c $(CFLAGS) mssampen.cu -o mven.o
	nvcc -std=c++11 -arch=sm_61 -rdc=true -c mssampen_dp.cu -o sampen.o
nsampen.o: sampen.o math.o nifti_io.o znzlib.o fslio.o
	nvcc -arch=sm_70 -dlink -o nsampen.o sampen.o math.o nifti_io.o znzlib.o fslio.o

math.o: mathfunc.cpp mathfunc.h entropy.h
	$(CC) -c $(CFLAGS) mathfunc.cpp -o math.o

nifti_io.o: nifti1_io.c nifti1_io.h nifti1.h
	$(CC) -c $(CFLAGS) nifti1_io.c -o nifti_io.o

znzlib.o: znzlib.c znzlib.h
	$(CC) -c $(CFLAGS) znzlib.c -o znzlib.o

fslio.o: fslio.c fslio.h
	$(CC) -c $(CFLAGS) fslio.c -o fslio.o

In the future, please use the code formatting tools available in the toolbar at the top of your edit window, to make your makefile more readable.

That makefile looks broken to me. I’m surprised it works at all.

nvcc -std=c++11 -arch=sm_61 -rdc=true -c mssampen_dp.cu -o sampen.o
nsampen.o: sampen.o math.o nifti_io.o znzlib.o fslio.o
nvcc -arch=sm_70 -dlink -o nsampen.o sampen.o math.o nifti_io.o znzlib.o fslio.o

Why do you target two different architectures (sm_61 for the compile step, sm_70 for the device link)?
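
Whether that mismatch is what triggers the error here is not established, but keeping the compile and device-link steps on one architecture removes it as a variable. A sketch, assuming a hypothetical GPU_ARCH variable (not in the original makefile), set to sm_61 only because the compile step and -DVM61 already use it; the prerequisites are copied from the rules above:

```makefile
# GPU_ARCH is a hypothetical variable introduced for illustration.
GPU_ARCH = sm_61

# Compile with relocatable device code, as dynamic parallelism requires.
sampen.o: mssampen_dp.cu entropy.h nifti1_io.h znzlib.h fslio.h dbh.h nifti1.h mathfunc.h
	nvcc -std=c++11 -arch=$(GPU_ARCH) -rdc=true -c mssampen_dp.cu -o sampen.o

# Device-link for the same architecture the object was compiled for.
nsampen.o: sampen.o math.o nifti_io.o znzlib.o fslio.o
	nvcc -arch=$(GPU_ARCH) -dlink -o nsampen.o sampen.o math.o nifti_io.o znzlib.o fslio.o
```

With a single variable, changing the target GPU means editing one line instead of keeping two recipes in sync.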