I want to set up two dynamically sized shared-memory arrays in my __global__ function,
like this:
extern __shared__ T hen[];
T* top_array = (T*)(hen + (k * blockDim.x));
But nvcc returns an error:
cudaMath.cu(142): error: declaration is incompatible with previous "hen" (declared at line 142)
extern __attribute__((shared)) T hen[];
^
detected during:
instantiation of "void topk_kernel(T *, T *, int, size_t) [with T=float]" at line 218
instantiation of "T cudaMath::topk(T *, int, size_t) [with T=float]" at line 234
I used a third-party library to work around this problem.
Also, I saw an explanation that CUDA doesn’t directly ‘support’ dynamically allocated shared memory arrays in templated functions, but that blog was posted ten years ago. Does CUDA support that now?
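For reference, here is a minimal sketch of the workaround that is usually suggested for this situation, with no third-party library involved. It rests on an assumption about the cause: if the kernel is instantiated for more than one T, each instantiation redeclares the same extern symbol with a different element type, which is one way to get "declaration is incompatible". Declaring the buffer as raw bytes and casting it avoids that. The names two_arrays_kernel, smem_raw, first, second and run_check are hypothetical and are not taken from cudaMath.cu.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel (not the original topk_kernel): each thread stages one
// element in a first shared array, writes the doubled value into a second
// shared array, and stores that to global memory.
template <typename T>
__global__ void two_arrays_kernel(const T* in, T* out, int n)
{
    // Raw byte buffer: the declaration is identical for every T, so different
    // instantiations cannot disagree about its type. 16-byte alignment is an
    // assumption that covers float and double.
    extern __shared__ __align__(16) unsigned char smem_raw[];

    T* first  = reinterpret_cast<T*>(smem_raw);  // blockDim.x elements
    T* second = first + blockDim.x;              // the next blockDim.x elements

    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        // Each thread touches only its own slot, so no __syncthreads() is needed here.
        first[threadIdx.x]  = in[i];
        second[threadIdx.x] = first[threadIdx.x] * T(2);
        out[i] = second[threadIdx.x];
    }
}

// Launches the kernel for type T and checks on the host that out[i] == 2 * in[i].
template <typename T>
bool run_check(int n)
{
    const int threads = 128;
    const int blocks  = (n + threads - 1) / threads;
    const size_t smem_bytes = 2 * threads * sizeof(T);  // two arrays per block

    T* in  = nullptr;
    T* out = nullptr;
    if (cudaMallocManaged(&in,  n * sizeof(T)) != cudaSuccess) return false;
    if (cudaMallocManaged(&out, n * sizeof(T)) != cudaSuccess) { cudaFree(in); return false; }
    for (int i = 0; i < n; ++i) { in[i] = T(i); out[i] = T(0); }

    two_arrays_kernel<T><<<blocks, threads, smem_bytes>>>(in, out, n);
    bool ok = (cudaGetLastError() == cudaSuccess) &&
              (cudaDeviceSynchronize() == cudaSuccess);

    for (int i = 0; ok && i < n; ++i) ok = (out[i] == T(2) * T(i));

    cudaFree(in);
    cudaFree(out);
    return ok;
}

int main()
{
    // Two instantiations (float and double) in the same translation unit.
    bool ok = run_check<float>(1000) && run_check<double>(1000);
    std::printf("%s\n", ok ? "ok" : "mismatch");
    return ok ? 0 : 1;
}
```

The third launch parameter is the shared-memory size in bytes for the whole block, so it has to cover both arrays. If the two arrays held different element types, the second pointer would also need to be aligned for its own type.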
The error there is due to the usage of kat::, which nvcc, CUDA (and godbolt) don’t know anything about. And you did not show any reference to kat:: in your original posting in this thread.
If you are using items from this library, you are using something that is not provided or supported by NVIDIA, and you are asking for help with it in the wrong place.
Yes, I use cuda-kat (GitHub: eyalroz/cuda-kat, "CUDA kernel author's tools"), but I think there are some misunderstandings.
kat:: is the third-party library I mentioned above on Jun 7; using it makes that error go away with my nvcc.
If I do not use the kat:: library and instead use lines 47-48, which you think are correct, ‘error: declaration is incompatible’ appears.
You can’t use kat without including some header files. Your godbolt example has no includes, so it’s not surprising that trying to use kat:: in that setting creates problems. And with respect to godbolt, I don’t think simply including kat will work, because godbolt doesn’t know where to get it.
I already gave you a godbolt example that shows that it is correct, i.e. it will compile. You have not given an example that doesn’t depend on kat and shows the error you suggest.
If you want help, show a complete example that doesn’t depend on kat, and that gives that error. You haven’t done that yet.