Error in clEnqueueNDRangeKernel()

I got error -40 when I executed clEnqueueNDRangeKernel() with a vector size of 20,000,000. How can I find the maximum number of kernels that can be executed simultaneously on my Nvidia GeForce 9200M GS?

Note: This same program runs fine when the vector size is 10,000,000.

Does anyone have the answer?
Thank you,

Sorry, everyone. I am working through Nvidia’s sample code. The question above came up when I increased the vector size to 20,000,000 in the “oclVectorAdd” program. It looks like a limitation of OpenCL. Can anyone confirm?

Thank you,
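
One thing worth ruling out is device memory rather than a limit in OpenCL itself. In the standard cl.h header, -40 is CL_INVALID_IMAGE_SIZE, which does not obviously match a buffer-only kernel, so this is only a rough check and not a confirmed diagnosis. The sketch below assumes the program allocates three equally sized float buffers (two inputs, one output); the function names are made up for illustration. The two limits it takes as arguments are the values that clGetDeviceInfo() reports for CL_DEVICE_MAX_MEM_ALLOC_SIZE and CL_DEVICE_GLOBAL_MEM_SIZE.

```c
#include <stdint.h>

/* Size of one cl_float in bytes (cl_float is a 32-bit float). */
#define FLOAT_BYTES 4u

/* Bytes needed for a single buffer holding n_elements floats. */
static uint64_t buffer_bytes(uint64_t n_elements)
{
    return n_elements * (uint64_t)FLOAT_BYTES;
}

/* Total bytes for a vector add.
 * ASSUMPTION: three equally sized buffers (two inputs, one output). */
static uint64_t vector_add_total_bytes(uint64_t n_elements)
{
    return 3u * buffer_bytes(n_elements);
}

/* Returns 1 if each buffer fits within max_alloc_bytes and all three
 * together fit within global_mem_bytes, otherwise 0.
 * Both limits are expected to come from clGetDeviceInfo(). */
static int fits_on_device(uint64_t n_elements,
                          uint64_t max_alloc_bytes,
                          uint64_t global_mem_bytes)
{
    if (buffer_bytes(n_elements) > max_alloc_bytes)
        return 0;
    if (vector_add_total_bytes(n_elements) > global_mem_bytes)
        return 0;
    return 1;
}
```

For 20,000,000 elements this works out to 80,000,000 bytes per buffer and 240,000,000 bytes in total, against 40,000,000 and 120,000,000 bytes for the 10,000,000 case that runs fine. Comparing those figures with the two values the device actually reports would show whether memory is the limit being hit.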