Does anyone know what might cause the following error:
error: identifier "__eh_curr_region" is undefined
I know this sounds very general, but I have a lot of code and until now it compiled fine. Now I get this error every time I run the makefile.
Any suggestions are appreciated
-Andrey
I narrowed it down and mine is caused by the exact same thing:
__shared__ int arr[SIZE];
I think this is happening because SIZE is either (a) defined in another file and the kernel can't "see" it, or (b) the variable is not declared as constant. Any more ideas?
The error occurs at compile time, not run time, so I don't think it matters whether the shared memory is properly allocated; the problem is with the declaration of the array. I see you're suggesting declaring the array as extern. Could you explain why? One more thing I should mention (since I don't know whether it matters or not): my kernel is a device function.
I figured out what causes the error. When declaring an array in shared memory, the size must be a compile-time constant or an integer literal (note that a __constant__ memory variable will not work here, since its value is not known at compile time):
const int SIZE = 16;
__global__ void func( ... )
{
 __shared__ float arrayOfFloats[SIZE];
 //or like this:
 __shared__ float anotherArrayOfFloats[8];
}
The solution to this problem is discussed in Section 4.2.2.3 of the CUDA Programming Guide. It describes a method of dynamically allocating shared memory for an array (which is what johsimr was alluding to).
Lastly, don't forget to specify the appropriate number of bytes of shared memory in your execution configuration for this kernel (as described in Section 4.2.3).
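To make this concrete, here is a minimal sketch of the dynamic approach. The kernel name, array name, and launch parameters are illustrative, not from the guide:

```cuda
// Dynamic shared memory: the array is declared extern with no size;
// the size (in bytes) is supplied as the third launch parameter.
extern __shared__ float scratch[];   // size set at launch, not compile time

__global__ void scaleKernel(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    scratch[threadIdx.x] = (i < n) ? data[i] : 0.0f;  // stage into shared memory
    __syncthreads();
    if (i < n)
        data[i] = scratch[threadIdx.x] * 2.0f;
}

// Host side: Ns can be computed at runtime, e.g.
//   scaleKernel<<<numBlocks, threadsPerBlock,
//                 threadsPerBlock * sizeof(float)>>>(d_data, n);
```

Because Ns is an ordinary runtime value, this sidesteps the compile-time-constant requirement entirely.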
Hope my explanation helps anyone with this problem.
So, is there no way to dynamically specify the memory to be allocated? I won't know until runtime how much shared memory I will need. Should I just assume the worst case and allocate that much memory? Is such an allocation an expensive operation?
Is there a way to use a variable BLOCK_SIZE? I want to read one from a file in my host code and pass it to the kernel call. The compiler gives me the __eh_curr_region error when I try this, even if it is passed as a const int. The allocation of my shared variables depends on this BLOCK_SIZE.
Thank you, that is useful for me because now I know how shared memory limits the block size. Is there a maximum number of blocks you can use when you call your kernel? I assume that if there aren't enough free multiprocessors, the blocks will run one after another. But maybe I'm wrong.
The extern keyword allows you to declare your array without a size. If you do not use extern, you must specify a size, and that size has to be a compile-time constant to avoid the __eh_curr_region error.
The reason you do not specify a size in the kernel is that you want the allocation to be dynamic. The size of the dynamic array is specified at launch time: Func<<< Dg, Db, Ns >>>(…); where Ns is the size in bytes of the dynamic array.
Read the rather terse sections 4.2.2.3 and 4.2.3 of the CUDA Programming Guide for details (including how to allocate multiple dynamic arrays). I had to read it a few times before it suddenly made sense. :wacko:
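For the multiple-arrays case, the guide's trick is to declare a single extern buffer and carve it up with pointer offsets. A sketch, with illustrative names and sizes:

```cuda
// One extern declaration backs all the dynamic arrays in the kernel.
extern __shared__ char smem[];

__global__ void twoArrays(int nFloats, int nInts)
{
    // Carve the single buffer into two arrays. Placing the more strictly
    // aligned type first avoids misaligned accesses (both are 4-byte here).
    float *fArr = (float *)smem;            // first nFloats floats
    int   *iArr = (int *)&fArr[nFloats];    // then nInts ints

    // ... use fArr[0..nFloats-1] and iArr[0..nInts-1] as usual ...
}

// Ns must cover both arrays:
//   twoArrays<<<grid, block,
//               nFloats * sizeof(float) + nInts * sizeof(int)>>>(nFloats, nInts);
```

The key point is that there is only ever one dynamic shared-memory region per block; "multiple arrays" are just offsets into it.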