I am trying to launch a kernel from inside another kernel. The two kernels are in the same .cu file and operate on different input data, like this:
// Forward declaration so that kernel1 can launch kernel2, which is defined below.
__global__ void kernel2(float *B, float C);

__global__
void kernel1(float *A, float *B, float *C, uint dimx, uint dimy, uint dimz)
{
    uint ix = blockIdx.x * blockDim.x + threadIdx.x;
    uint iy = blockIdx.y * blockDim.y + threadIdx.y;
    uint iz = blockIdx.z * blockDim.z + threadIdx.z;

    // Launch configuration for the child grid.
    const dim3 block_size(BLOCK_WIDTH, BLOCK_HEIGHT, BLOCK_DEPTH);
    const dim3 num_blocks(ceil(float(dimx) / float(block_size.x)),
                          ceil(float(dimy) / float(block_size.y)),
                          ceil(float(dimz) / float(block_size.z)));

    // Device-side launch (dynamic parallelism).
    kernel2<<<num_blocks, block_size>>>(B, C[iz]);

    // ... operations on A and B
}

__global__
void kernel2(float *B, float C)
{
    // …
}
The two kernels need to be nested because the second kernel manipulates vector B based on the z coordinate of C.
I am implementing this code with CUDA 12.1 and building it from a Qt project.
I read some documentation on Dynamic Parallelism and added -rdc=true to my nvcc options.
Up to here everything seems to work, and this project compiles on its own.
Unfortunately, this project is part of a bigger project, and the overall project doesn't link. It fails with the error message:
error LNK2019: unresolved external symbol ___cudaRegisterLinkedBinary_3d14d575_17_cuda_libraries_cu_77509383 referenced in function “void __cdecl __sti____cudaRegisterAll(void)” (?__sti____cudaRegisterAll@@YAXXZ)
Could you please help me to fix this issue?
In the main .pro file I link the following libraries: -lcudart -lcuda -lcufft -lcudadevrt
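
For context, as far as I understand the nvcc separate-compilation documentation, objects built with -rdc=true need an extra device-link step (nvcc -dlink) before the host linker runs, and an unresolved __cudaRegisterLinkedBinary symbol is the typical symptom when that step is missing. Below is a minimal sketch of what I think a working setup looks like. The file names (nested.cu, nested.obj, nested_dlink.obj) and the kernels parent and child are hypothetical stand-ins, not my real code, and the build commands in the header comment are my assumption of how the steps map onto my project.

```cuda
// Build sketch (hypothetical file names):
//   1) nvcc -rdc=true -c nested.cu -o nested.obj
//        compiles to relocatable device code
//   2) nvcc -dlink nested.obj -o nested_dlink.obj -lcudadevrt
//        device-link step; produces the object the host linker is missing
//   3) pass BOTH nested.obj and nested_dlink.obj to the host linker,
//        together with cudadevrt and cudart
// When nvcc performs the final link itself, one command is enough:
//   nvcc -rdc=true nested.cu -o nested -lcudadevrt
#include <cstdio>
#include <cuda_runtime.h>

// Child kernel: scales every element of B by one value taken from C.
__global__ void child(float *B, float c, unsigned int n)
{
    unsigned int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        B[i] *= c;
    }
}

// Parent kernel: a single thread launches the child grid from the device.
__global__ void parent(float *B, const float *C, unsigned int n)
{
    if (blockIdx.x == 0 && threadIdx.x == 0) {
        const unsigned int threads = 128;
        const unsigned int blocks = (n + threads - 1) / threads;
        child<<<blocks, threads>>>(B, C[0], n);
    }
}

int main()
{
    const unsigned int n = 256;
    float hostB[n];
    for (unsigned int i = 0; i < n; ++i) {
        hostB[i] = 1.0f;
    }
    const float hostC = 2.0f;

    float *B = nullptr;
    float *C = nullptr;
    cudaMalloc(&B, n * sizeof(float));
    cudaMalloc(&C, sizeof(float));
    cudaMemcpy(B, hostB, n * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(C, &hostC, sizeof(float), cudaMemcpyHostToDevice);

    parent<<<1, 32>>>(B, C, n);

    // The parent grid only completes after its child grid has completed,
    // so a host-side synchronize covers both.
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        std::printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    cudaMemcpy(hostB, B, n * sizeof(float), cudaMemcpyDeviceToHost);
    std::printf("B[0] = %f\n", hostB[0]);

    cudaFree(B);
    cudaFree(C);
    return 0;
}
```

If this is right, the standalone project probably works because nvcc does the final link there and runs the device link implicitly, while the bigger project is linked by the MSVC linker, which never sees a device-linked object. I have not yet confirmed this in my build.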