Hi all,
I wrote a device function to sort some input data, and I had to use __syncthreads() to make it work properly. It worked. Later I tried to turn the implementation into a template function, because I need to make it independent of the sorted type. Unfortunately, after turning it into a template I get the following compilation error:
error: there are no arguments to ‘__syncthreads’ that depend on a template parameter, so a declaration of ‘__syncthreads’ must be available
Thank you all for your fast responses. I ran some tests with your simplified use case and was able to reproduce the error. This led me to suspect that it is caused by my placing the template in a header file, as is usual in C++. That does not seem to be a problem by itself, but the compilation error may be related to the mixture of *.cu and *.cpp files in our project and a wrong include of “CUDA-related headers” in a *.cpp file. I’ll keep investigating and let you know my findings…
I can confirm the suspicion from my previous post: the trouble I faced was caused by wrongly including headers that contain CUDA extensions, such as the __device__ function specifier, in *.cpp files. The problem is solved.
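For anyone who lands here with the same message, here is a minimal sketch of one way a header shared by *.cu and *.cpp files could be arranged so that the host compiler never parses the device template. The names sortInBlock and isSorted are hypothetical, and the odd-even transposition sort is only a stand-in, not the sort from the original post. It assumes the __CUDACC__ macro, which nvcc defines when it compiles CUDA source files.

```cpp
// sort_utils.h: hypothetical header included from both *.cu and *.cpp files.
#ifndef SORT_UTILS_H
#define SORT_UTILS_H

#include <cassert>

#ifdef __CUDACC__
// Device-only part: visible only when nvcc compiles a CUDA source file.
// A plain host compiler skips it, so it never has to look up
// __syncthreads while parsing the template definition.
template <typename T>
__device__ void sortInBlock(T* data, int n)
{
    // Odd-even transposition sort (stand-in algorithm): n phases.
    // Every thread of the block must call this function, and 'data'
    // must point to memory shared by the block.
    for (int phase = 0; phase < n; ++phase) {
        // Each thread handles disjoint pairs (i, i + 1) whose first
        // index has the parity of the current phase.
        for (int i = 2 * threadIdx.x + (phase & 1); i + 1 < n;
             i += 2 * blockDim.x) {
            if (data[i + 1] < data[i]) {
                T tmp = data[i];
                data[i] = data[i + 1];
                data[i + 1] = tmp;
            }
        }
        // Reached by all threads the same number of times, because the
        // phase loop depends only on n.
        __syncthreads();
    }
}
#endif // __CUDACC__

// Host part: uses no CUDA extensions, so it is safe in *.cpp files too.
template <typename T>
bool isSorted(const T* data, int n)
{
    assert(n >= 0);
    for (int i = 1; i < n; ++i) {
        if (data[i] < data[i - 1]) {
            return false;
        }
    }
    return true;
}

#endif // SORT_UTILS_H
```

An alternative with the same effect is to keep the device templates in a separate header that only *.cu files include. The guard above is just convenient when one header has to serve both kinds of source file.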