Unresolved external when using cooperative groups.

I tried to add some functionality using cooperative groups, but ran into:

ptxas fatal   : Unresolved extern function 'cudaCGGetIntrinsicHandle'

The error happens even in a stripped-down kernel:

#include <cuda_runtime.h>
#include <cooperative_groups.h>

__global__ void
kernel() {
    cooperative_groups::grid_group g = cooperative_groups::this_grid();
    g.sync();
}

The compile command:

/usr/local/cuda/bin/nvcc  -ccbin g++  -m 64  -c   --verbose -x cu -gencode arch=compute_70,code=sm_70 -gencode arch=compute_70,code=compute_70  --ptxas-options=-v  -o  a.o a.cu

Removing -gencode arch=compute_70,code=sm_70 from the compile command makes the error go away, but (a) I’m not sure that’s the right thing to do, and (b) the threadFenceReduction sample has that gencode setting and compiles properly.

It’s unclear why this is happening and what the proper fix is.

(Environment: CUDA 9.1.85, gcc 4.8, Centos 7)

Thanks.

Add

-rdc=true

to the compile command. Grid-wide synchronization needs relocatable device code. See the manual:

http://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#grid-synchronization-cg

or study a CG sample code project that uses grid_group.
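
For anyone landing here later, a minimal sketch of the whole round trip, assuming CUDA 9.x and a device that reports cooperative launch support. The file name, kernel name, helper name and grid size are made up for illustration. A kernel that calls grid.sync() also has to be started with cudaLaunchCooperativeKernel rather than <<<...>>>. As far as I can tell, dropping code=sm_70 only hides the error because ptxas is then not run at build time; the PTX still has to be compiled when it is loaded.

```cuda
// Hypothetical file name: a.cu
// Single-step build, in the style of the programming guide's example:
//   nvcc -arch=sm_70 -rdc=true a.cu -o a
// With a separate -c step (as in the original command), pass -rdc=true when
// compiling and let nvcc do the final link so the device link happens too.
#include <cstdio>
#include <cuda_runtime.h>
#include <cooperative_groups.h>

namespace cg = cooperative_groups;

// data[0] is a counter every thread increments; data[1] receives the value
// that thread 0 of the grid observes after the grid-wide barrier.
__global__ void grid_sync_kernel(int *data) {
    cg::grid_group grid = cg::this_grid();
    atomicAdd(&data[0], 1);
    grid.sync();  // all threads in the grid have incremented by now
    if (grid.thread_rank() == 0) {
        data[1] = atomicAdd(&data[0], 0);  // atomic read of the final count
    }
}

// Returns the count seen after the barrier, or -1 on any failure.
int run_grid_sync_demo() {
    int device = 0, supported = 0;
    if (cudaDeviceGetAttribute(&supported, cudaDevAttrCooperativeLaunch,
                               device) != cudaSuccess || !supported) {
        return -1;  // no device, or no cooperative launch support
    }

    int *d_data = NULL;
    if (cudaMalloc(&d_data, 2 * sizeof(int)) != cudaSuccess) return -1;
    cudaMemset(d_data, 0, 2 * sizeof(int));

    // Small grid so that all blocks can be resident at the same time.
    dim3 grid_dim(2), block_dim(64);
    void *args[] = { &d_data };
    cudaError_t err = cudaLaunchCooperativeKernel(
        (void *)grid_sync_kernel, grid_dim, block_dim, args, 0, 0);
    if (err == cudaSuccess) err = cudaDeviceSynchronize();

    int h_data[2] = { 0, -1 };
    if (err == cudaSuccess) {
        err = cudaMemcpy(h_data, d_data, sizeof(h_data),
                         cudaMemcpyDeviceToHost);
    }
    cudaFree(d_data);
    return err == cudaSuccess ? h_data[1] : -1;
}

int main() {
    int count = run_grid_sync_demo();
    if (count < 0) {
        printf("cooperative launch unavailable or failed\n");
        return 1;
    }
    printf("threads counted after grid.sync(): %d\n", count);
    return 0;
}
```

The grid is deliberately tiny (2 blocks of 64 threads). A cooperative launch fails if the blocks cannot all be co-resident, so real code should size the grid from the occupancy API instead of hard-coding it.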

That works. Thanks.

The threadFenceReduction sample doesn’t use this option, yet compiles anyway. I’m curious how that can work.

The build log from gmake in threadFenceReduction/:

/usr/local/cuda/bin/nvcc -ccbin g++ -I../../common/inc  -m64    -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_70,code=compute_70 -o threadFenceReduction.o -c threadFenceReduction.cu
/usr/local/cuda/bin/nvcc -ccbin g++   -m64      -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_70,code=compute_70 -o threadFenceReduction threadFenceReduction.o
mkdir -p ../../bin/x86_64/linux/release
cp threadFenceReduction ../../bin/x86_64/linux/release

The threadFenceReduction sample code doesn’t use grid_group, so it never pulls in cudaCGGetIntrinsicHandle and doesn’t need -rdc=true.

Please read the manual.
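
To illustrate the difference, here is a rough sketch of a kernel that stays at block scope. It is not the sample's code, and the file name, kernel name and sizes are invented. It uses only thread_block, so a plain build without -rdc=true should go through, which matches the sample's build log above.

```cuda
// Hypothetical file name: block_sync.cu
// Builds without -rdc=true, e.g.:
//   nvcc -arch=sm_70 block_sync.cu -o block_sync
#include <cstdio>
#include <cuda_runtime.h>
#include <cooperative_groups.h>

namespace cg = cooperative_groups;

#define BLOCK_SIZE 64  // must be a power of two for the reduction loop

// Sums BLOCK_SIZE ints per block using only block-scope synchronization.
__global__ void block_sum_kernel(const int *in, int *out) {
    __shared__ int scratch[BLOCK_SIZE];
    cg::thread_block block = cg::this_thread_block();
    unsigned int t = block.thread_rank();

    scratch[t] = in[blockIdx.x * BLOCK_SIZE + t];
    block.sync();

    // Tree reduction in shared memory, halving the active range each step.
    for (unsigned int stride = BLOCK_SIZE / 2; stride > 0; stride >>= 1) {
        if (t < stride) scratch[t] += scratch[t + stride];
        block.sync();
    }
    if (t == 0) out[blockIdx.x] = scratch[0];
}

// Returns the sum of BLOCK_SIZE ones, or -1 on any failure.
int run_block_sum_demo() {
    int h_in[BLOCK_SIZE];
    for (int i = 0; i < BLOCK_SIZE; ++i) h_in[i] = 1;

    int *d_in = NULL, *d_out = NULL;
    if (cudaMalloc(&d_in, sizeof(h_in)) != cudaSuccess) return -1;
    if (cudaMalloc(&d_out, sizeof(int)) != cudaSuccess) {
        cudaFree(d_in);
        return -1;
    }
    cudaMemcpy(d_in, h_in, sizeof(h_in), cudaMemcpyHostToDevice);

    // An ordinary launch is fine here: no grid-wide barrier is involved.
    block_sum_kernel<<<1, BLOCK_SIZE>>>(d_in, d_out);

    int h_out = -1;
    cudaError_t err = cudaMemcpy(&h_out, d_out, sizeof(int),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    return err == cudaSuccess ? h_out : -1;
}

int main() {
    int sum = run_block_sum_demo();
    if (sum < 0) {
        printf("CUDA call failed\n");
        return 1;
    }
    printf("block sum: %d\n", sum);
    return 0;
}
```

Swapping this_thread_block() for this_grid() in a file like this is what brings back the unresolved extern under CUDA 9.1 unless -rdc=true is added.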