I am using multiple host threads with multiple GPUs.
As shown in the manual, constant memory has to be declared at file scope (extern), so it is set up when the program is loaded. As a result, it seems I can only use constant memory on device #0. If I declare the constant memory for each thread, the declaration still takes effect before cudaSetDevice is called, and all the threads end up sharing it (which, of course, fails). That is not what I want.
Am I right?
Does anyone know how I can dynamically allocate constant memory in each thread, so that it is not shared among the threads?
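
To make the question concrete, here is a minimal sketch of the setup I mean. The names c_scale, read_scale, worker and run_all are made up for illustration. It assumes one host thread per GPU, with each thread calling cudaSetDevice before it touches the constant. If the runtime keeps a separate instance of the constant on each device, every thread should read back the value it wrote; that is the assumption this sketch is meant to test, not something I have confirmed.

```cuda
#include <cstdio>
#include <thread>
#include <vector>
#include <cuda_runtime.h>

// File-scope constant memory, declared once, before any cudaSetDevice call.
__constant__ float c_scale;

// Copies the constant into global memory so the host can inspect it.
__global__ void read_scale(float *out) { *out = c_scale; }

// Hypothetical per-thread worker: binds to one device, writes the constant
// on that device, then reads it back through a kernel.
void worker(int dev, float *result) {
    float value = static_cast<float>(dev + 1);  // a different value per device
    float *d_out = nullptr;

    if (cudaSetDevice(dev) != cudaSuccess) return;
    if (cudaMemcpyToSymbol(c_scale, &value, sizeof(value)) != cudaSuccess) return;
    if (cudaMalloc(&d_out, sizeof(float)) != cudaSuccess) return;

    read_scale<<<1, 1>>>(d_out);
    // cudaMemcpy waits for the kernel above before copying the result.
    cudaMemcpy(result, d_out, sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_out);
}

// Launches one host thread per visible GPU and collects what each one read.
std::vector<float> run_all() {
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess) count = 0;

    std::vector<float> results(count, 0.0f);
    std::vector<std::thread> threads;
    for (int d = 0; d < count; ++d)
        threads.emplace_back(worker, d, &results[d]);
    for (auto &t : threads) t.join();
    return results;
}

int main() {
    std::vector<float> results = run_all();
    for (size_t d = 0; d < results.size(); ++d)
        printf("device %zu read c_scale = %f\n", d, results[d]);
    return 0;
}
```

If every device prints a different value here, then the file-scope declaration is not really shared between the threads, and no dynamic allocation would be needed.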