I am trying to put an entire CFD simulation onto a C870. All the arrays are declared inside the functions that use them, and every function is declared as a device function except for the kernel, which is called from the host. The kernel itself declares a number of arrays, which are passed down to and processed by a sequence of device functions. When I compile the code I get the following error message:
“entry function uses too much local memory”
I assume that when a function, whether kernel or device, declares an array without specifying a memory space, the array is placed by default in the small local memory rather than in global memory. If the same arrays are instead allocated from the host, I assume they are placed in global memory by default, and it is then up to me to move frequently used data into shared memory.
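To make the pattern concrete, here is a minimal sketch of what I mean. It is not my actual code: the names `step` and `smooth`, the array length `N`, and the launch configuration are all made up for illustration. Each thread declares a 32 KB array inside the kernel and passes it down to a device function.

```cuda
#include <cuda_runtime.h>
#include <stdio.h>

#define N 8192  // hypothetical per-thread array length (8192 floats = 32 KB)

// Hypothetical device function that processes an array passed down from the kernel.
__device__ void smooth(float *u, int n)
{
    for (int i = 1; i < n - 1; ++i)
        u[i] = 0.5f * (u[i - 1] + u[i + 1]);
}

// Hypothetical kernel: the array below has no memory space qualifier,
// so every thread gets its own private copy.
__global__ void step(float *out)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;

    float u[N];                       // per-thread array declared in the kernel
    for (int i = 0; i < N; ++i)
        u[i] = (float)(i + tid);

    smooth(u, N);
    out[tid] = u[N / 2];              // write one value back so the work is not optimized away
}

int main(void)
{
    const int threads = 32;           // assumed launch configuration
    float *d_out = NULL;

    if (cudaMalloc((void **)&d_out, threads * sizeof(float)) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }

    step<<<1, threads>>>(d_out);
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess)
        fprintf(stderr, "kernel failed: %s\n", cudaGetErrorString(err));

    cudaFree(d_out);
    return err == cudaSuccess ? 0 : 1;
}
```

My real kernel declares several arrays like `u`, so the per-thread total is larger than in this sketch.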
- Why do I get that error message?
- How can I create the arrays that are currently declared in the kernel and device functions so that I don't get the error message?
- Should I simply allocate and deallocate all device memory from the host?
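For the last question, this is roughly the alternative I have in mind, again only as a sketch with hypothetical names (`step`, `smooth`, `d_work`) and an assumed launch configuration. The host allocates one block of global memory with `cudaMalloc`, and each thread works on its own slice of it instead of declaring the array inside the kernel.

```cuda
#include <cuda_runtime.h>
#include <stdio.h>

#define N 8192  // hypothetical per-thread array length

// Hypothetical device function, unchanged: it only sees a pointer and a length.
__device__ void smooth(float *u, int n)
{
    for (int i = 1; i < n - 1; ++i)
        u[i] = 0.5f * (u[i - 1] + u[i + 1]);
}

// Hypothetical kernel: no arrays are declared here; each thread uses
// a slice of the scratch buffer that the host allocated in global memory.
__global__ void step(float *work, int n)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    float *u = work + (size_t)tid * n;   // this thread's slice

    for (int i = 0; i < n; ++i)
        u[i] = (float)(i + tid);

    smooth(u, n);
}

int main(void)
{
    const int threads = 64, blocks = 4;  // assumed launch configuration
    const size_t bytes = (size_t)threads * blocks * N * sizeof(float);
    float *d_work = NULL;

    // Allocate the scratch space once, from the host.
    if (cudaMalloc((void **)&d_work, bytes) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }

    step<<<blocks, threads>>>(d_work, N);
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess)
        fprintf(stderr, "kernel failed: %s\n", cudaGetErrorString(err));

    // Deallocate from the host as well.
    cudaFree(d_work);
    return err == cudaSuccess ? 0 : 1;
}
```

With this layout each thread's slice is contiguous, which is simple but probably not the best access pattern for global memory; I have not tuned it.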