dlevi
March 23, 2018, 5:26pm
1
I tried to add some functionality using cooperative groups, but ran into:
ptxas fatal : Unresolved extern function 'cudaCGGetIntrinsicHandle'
The error happens even in a stripped down kernel.
#include <cuda_runtime.h>
#include <cooperative_groups.h>

__global__ void kernel() {
    cooperative_groups::grid_group g = cooperative_groups::this_grid();
    g.sync();
}
The compile command:
/usr/local/cuda/bin/nvcc -ccbin g++ -m 64 -c --verbose -x cu -gencode arch=compute_70,code=sm_70 -gencode arch=compute_70,code=compute_70 --ptxas-options=-v -o a.o a.cu
Removing -gencode arch=compute_70,code=sm_70 from the compile command makes the error go away but (a) I’m not sure that’s the right thing to do, and (b) the threadFenceReductionSample has that gencode setting and compiles properly.
It’s unclear why this is happening and what the proper fix is.
(Environment: CUDA 9.1.85, gcc 4.8, Centos 7)
Thanks.
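Add
-rdc=true
to the compile command. Grid-wide synchronization requires relocatable device code.
Read the manual (the cooperative groups section of the Programming Guide :: CUDA Toolkit Documentation),
or study a CG sample code project that uses grid_group.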
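For readers who want something to compile against, here is a minimal sketch of a kernel that uses grid_group, built with -rdc=true and launched through cudaLaunchCooperativeKernel. The file name, the counter argument, and the grid/block sizes are assumptions made for illustration; they are not taken from the original post or from any SDK sample.

```cuda
// grid_sync.cu (hypothetical file name): minimal sketch of a grid-wide sync.
// Assumed build line, compiling and linking in one nvcc step:
//   nvcc -rdc=true -gencode arch=compute_70,code=sm_70 -o grid_sync grid_sync.cu
#include <cstdio>
#include <cuda_runtime.h>
#include <cooperative_groups.h>

namespace cg = cooperative_groups;

// Each block bumps a counter, the whole grid synchronizes, then one thread
// reads the counter. The counter is illustrative only.
__global__ void kernel(int *counter) {
    cg::grid_group g = cg::this_grid();
    if (threadIdx.x == 0) {
        atomicAdd(counter, 1);
    }
    g.sync();  // grid-wide barrier; this is what needs -rdc=true
    if (g.thread_rank() == 0) {
        printf("blocks counted after grid sync: %d\n", *counter);
    }
}

int main() {
    int dev = 0;
    int supported = 0;
    // Grid sync is only valid on devices that support cooperative launch.
    cudaDeviceGetAttribute(&supported, cudaDevAttrCooperativeLaunch, dev);
    if (!supported) {
        printf("cooperative launch not supported on device %d\n", dev);
        return 0;
    }

    int *counter = nullptr;
    cudaMalloc(&counter, sizeof(int));
    cudaMemset(counter, 0, sizeof(int));

    // Small grid so that all blocks can be resident at once (assumed sizes).
    dim3 grid(2), block(64);
    void *args[] = { &counter };

    // A kernel that calls grid.sync() must be launched cooperatively,
    // not with the <<<...>>> syntax.
    cudaError_t err = cudaLaunchCooperativeKernel(
        (void *)kernel, grid, block, args, 0, 0);
    if (err != cudaSuccess) {
        printf("launch failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    cudaDeviceSynchronize();
    cudaFree(counter);
    return 0;
}
```

If the object file is compiled with -c and linked in a separate step, as in the original command, the link step also has to go through nvcc (or an explicit device-link step) so that the device runtime library gets resolved.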
dlevi
March 23, 2018, 5:53pm
3
That works. Thanks.
The threadFenceReduction sample doesn’t use this option, yet compiles anyway. I’m curious how that can work.
The build log from gmake in threadFenceReduction/:
/usr/local/cuda/bin/nvcc -ccbin g++ -I../../common/inc -m64 -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_70,code=compute_70 -o threadFenceReduction.o -c threadFenceReduction.cu
/usr/local/cuda/bin/nvcc -ccbin g++ -m64 -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_70,code=compute_70 -o threadFenceReduction threadFenceReduction.o
mkdir -p ../../bin/x86_64/linux/release
cp threadFenceReduction ../../bin/x86_64/linux/release
The threadFenceReduction sample code doesn’t use grid_group, so it doesn’t need -rdc=true.
Please read the manual.