I’ve found it fairly unclear exactly how much of the shared memory one can allocate. Some of my programs seem to get away with up to 3970 floats, while others fail with significantly less. (Note that the failure is silent in release mode, and the emulation modes work completely fine.) Only in debug mode does the very helpful error “unknown error” occur. I’m nearly 100% certain it is due to trying to allocate too much shared memory.
Is this because the registers for each thread also come out of the shared memory? If so, would a kernel that needs more registers (and/or more threads) have less shared memory available?
Would it be possible to give us some way of knowing how much shared memory we can allocate?
Or give us a more useful error message than “unknown error” when a kernel fails for this reason?
Or make the emulation modes perform some sort of check and warn you when this failure would likely occur on the device?
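For reference, here is the back-of-the-envelope budget I have been using to sanity-check my allocations. It is only a sketch in plain C: the 16 KB per-block limit is my assumption about the current hardware, and the function names and the `reserved_bytes` parameter (a stand-in for whatever else might be taking shared memory) are hypothetical, not part of any API.

```c
#include <stddef.h>

/* ASSUMPTION: 16 KB of shared memory per block. This figure is my
   understanding of the current hardware, not something queried at run time. */
#define SHARED_MEM_PER_BLOCK ((size_t)16 * 1024)

/* Bytes of shared memory needed for an array of n_floats floats. */
size_t shared_bytes_for_floats(size_t n_floats)
{
    return n_floats * sizeof(float);
}

/* Bytes left under the assumed limit after a request; 0 if the
   request already meets or exceeds the limit. */
size_t shared_headroom(size_t requested_bytes)
{
    if (requested_bytes >= SHARED_MEM_PER_BLOCK)
        return 0;
    return SHARED_MEM_PER_BLOCK - requested_bytes;
}

/* Returns 1 if the request plus some reserved overhead fits under the
   assumed limit, 0 otherwise. reserved_bytes is a hypothetical
   allowance for anything else that consumes shared memory. */
int fits_in_shared(size_t requested_bytes, size_t reserved_bytes)
{
    return requested_bytes <= SHARED_MEM_PER_BLOCK &&
           reserved_bytes <= SHARED_MEM_PER_BLOCK - requested_bytes;
}
```

With 4-byte floats, my 3970-float case comes to 15,880 bytes, which leaves 504 bytes under a 16,384-byte limit. What I can't tell is what takes up that remainder, or why it seems to vary from kernel to kernel.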