Memory overflow?

Hi All,

I have a Quadro FX 3800 card and am working with a very large data set.
In my application, I use a block size of (4,4,32), i.e. 512 threads, which is the maximum number of threads per block on this card. As a result, the grid size is (16,16,1).
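For reference, the launch shape described above can be checked against the limits the device reports. This is only a minimal sketch: the helper names `threadsPerBlock` and `blockFits` are made up for illustration, and device 0 is assumed to be the Quadro card.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Number of threads in one block: product of the three block dimensions.
static unsigned int threadsPerBlock(dim3 b) { return b.x * b.y * b.z; }

// Hypothetical helper: true if the block shape fits the limits of `device`.
static bool blockFits(dim3 b, int device) {
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, device) != cudaSuccess) return false;
    return threadsPerBlock(b) <= (unsigned int)prop.maxThreadsPerBlock &&
           b.x <= (unsigned int)prop.maxThreadsDim[0] &&
           b.y <= (unsigned int)prop.maxThreadsDim[1] &&
           b.z <= (unsigned int)prop.maxThreadsDim[2];
}

int main() {
    dim3 block(4, 4, 32);  // from the post: 4 * 4 * 32 threads per block
    dim3 grid(16, 16, 1);  // from the post: 16 * 16 blocks
    unsigned int perBlock = threadsPerBlock(block);
    printf("threads per block: %u\n", perBlock);
    printf("total threads:     %u\n", perBlock * grid.x * grid.y * grid.z);
    // Assumption: device 0 is the card in question.
    printf("block fits device 0: %s\n", blockFits(block, 0) ? "yes" : "no");
    return 0;
}
```

If the block shape does not fit, the launch fails instead of running with fewer threads, so it is worth checking this before suspecting memory.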
I allocated two arrays on the device, which occupy 3 MB of global memory.
In the caller function, I pass 7 structures to the kernel, each 12 bytes in size, along with the two big arrays as parameters.
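The size of that argument list can be estimated by hand. On devices of compute capability 1.x, kernel arguments are limited to 256 bytes in total; the sketch below assumes the FX 3800 falls in that class. The `Params` structure and the `process` kernel are hypothetical stand-ins, since the real definitions are not shown in the post.

```cuda
#include <cstdio>
#include <cstddef>

// Hypothetical 12-byte structure; the real fields are not shown in the post.
struct Params { float a, b, c; };
static_assert(sizeof(Params) == 12, "expected a 12-byte structure");

// Hypothetical kernel signature: seven structures plus two device pointers.
__global__ void process(Params p0, Params p1, Params p2, Params p3,
                        Params p4, Params p5, Params p6,
                        float *in, float *out) {
    // Body omitted; only the argument list matters for this estimate.
}

// Sum of the argument sizes, ignoring any alignment padding between them.
static size_t unpaddedArgBytes() {
    return 7 * sizeof(Params) + 2 * sizeof(float *);
}

int main() {
    const size_t limit = 256;  // argument limit on compute capability 1.x
    printf("argument bytes (unpadded): %zu of %zu\n", unpaddedArgBytes(), limit);
    return unpaddedArgBytes() <= limit ? 0 : 1;
}
```

Seven 12-byte structures come to 84 bytes, so even with two pointers and some padding the total stays well under 256 bytes.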
When I run the application, I get a small number of correct results and a lot of garbage values.
I suspect I may be using too much memory on the device, but I can't figure out where.
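A way to narrow this kind of symptom down is to check every CUDA call for errors and to guard the array index inside the kernel. The following is a sketch under assumptions, not the code from the application: `CUDA_CHECK`, `flatIndex`, and `doubleAll` are hypothetical names, and the array size is simply chosen to match the launch shape from the post.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Abort with file and line if a CUDA runtime call fails.
#define CUDA_CHECK(call)                                        \
    do {                                                        \
        cudaError_t err_ = (call);                              \
        if (err_ != cudaSuccess) {                              \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,  \
                    cudaGetErrorString(err_));                  \
            exit(EXIT_FAILURE);                                 \
        }                                                       \
    } while (0)

// Flatten a 3D thread index inside a 3D grid into one array index.
__host__ __device__ inline int flatIndex(uint3 t, dim3 bd, uint3 b, dim3 gd) {
    int threadInBlock = t.x + bd.x * (t.y + bd.y * t.z);
    int blockInGrid = b.x + gd.x * (b.y + gd.y * b.z);
    return blockInGrid * (bd.x * bd.y * bd.z) + threadInBlock;
}

// Hypothetical kernel: doubles each element, skipping out-of-range threads.
__global__ void doubleAll(float *data, int n) {
    int i = flatIndex(threadIdx, blockDim, blockIdx, gridDim);
    if (i < n) data[i] *= 2.0f;
}

int main() {
    dim3 block(4, 4, 32), grid(16, 16, 1);  // launch shape from the post
    const int n = 16 * 16 * 4 * 4 * 32;     // one element per thread (assumption)
    std::vector<float> host(n, 1.0f);
    float *dev = NULL;
    CUDA_CHECK(cudaMalloc(&dev, n * sizeof(float)));
    CUDA_CHECK(cudaMemcpy(dev, host.data(), n * sizeof(float),
                          cudaMemcpyHostToDevice));
    doubleAll<<<grid, block>>>(dev, n);
    CUDA_CHECK(cudaGetLastError());       // reports launch failures
    CUDA_CHECK(cudaDeviceSynchronize());  // reports errors during execution
    CUDA_CHECK(cudaMemcpy(host.data(), dev, n * sizeof(float),
                          cudaMemcpyDeviceToHost));
    CUDA_CHECK(cudaFree(dev));
    int bad = 0;
    for (int i = 0; i < n; ++i) if (host[i] != 2.0f) ++bad;
    printf("%d of %d elements are wrong\n", bad, n);
    return bad == 0 ? 0 : 1;
}
```

If an allocation had failed for lack of memory, `cudaMalloc` would return an error here rather than silently producing wrong numbers.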

Does anyone have any idea?


Sorry, it was my fault. There is a small mistake in my code…

But I still have a question: can a kernel use all of the global memory if necessary, or should I leave a portion free for storing instructions?
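One rough way to see how much global memory is actually available for allocation, rather than assuming the full capacity of the card, is to ask the runtime with `cudaMemGetInfo`. This is a minimal sketch; `bytesToMiB` is a hypothetical helper added only for printing.

```cuda
#include <cstdio>
#include <cstddef>
#include <cuda_runtime.h>

// Hypothetical helper: convert a byte count to mebibytes for display.
static double bytesToMiB(size_t bytes) { return bytes / (1024.0 * 1024.0); }

int main() {
    size_t freeBytes = 0, totalBytes = 0;
    // Query free and total device memory for the current device.
    cudaError_t err = cudaMemGetInfo(&freeBytes, &totalBytes);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMemGetInfo failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("free:  %.1f MiB\n", bytesToMiB(freeBytes));
    printf("total: %.1f MiB\n", bytesToMiB(totalBytes));
    return 0;
}
```

The free amount reported is usually lower than the total, because the driver and the CUDA context reserve some memory for themselves.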