I have run into a problem when using a GeForce GTX 580 3.0GB card.
CPU: Intel Core i7 870
OS: Windows 7 32-bit
GPU: ELSA GeForce GTX580 (1.5GB & 3.0GB)
DevTool: Microsoft Visual Studio 2008 SP2, NVIDIA Parallel Nsight 1.51
CUDA SDK: CUDA SDK 3.2.16, CUDA Toolkit 3.2.16
CUDA runtime: cudart32_32_16.dll(Version 3.2.16)
When the kernel function is launched on the 3.0GB card, it does not execute and ‘cudaErrorMemoryAllocation’ is returned. Changing the data size or the number of threads executed at the same time does not solve the problem. However, the same operation has no such problem on the 1.5GB card.
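For reference, this is roughly how the error can be observed around the launch. It is only a minimal sketch: `myKernel`, `blocksFor`, and `launchAndCheck` are hypothetical names standing in for the real code, which is not shown here, and the kernel body is a placeholder. It uses `cudaDeviceSynchronize`; on Toolkit 3.2 the equivalent call is `cudaThreadSynchronize`.

```cuda
#include <cuda_runtime.h>

// Hypothetical stand-in for the real kernel (the actual kernel is not shown in this post).
__global__ void myKernel(float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = 2.0f * i;
}

// Number of blocks needed to cover n elements with the given block size.
int blocksFor(int n, int threads)
{
    return (n + threads - 1) / threads;
}

// Launches the kernel and returns the first error reported,
// either at launch time or after the kernel has finished.
cudaError_t launchAndCheck(float *d_out, int n)
{
    const int threads = 128;
    myKernel<<<blocksFor(n, threads), threads>>>(d_out, n);

    // Errors detected when the launch is issued (for example a failed allocation).
    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess) return err;

    // Errors that only surface once the kernel has actually run.
    // On Toolkit 3.2, cudaThreadSynchronize() plays this role.
    return cudaDeviceSynchronize();
}
```

Passing the returned value to `cudaGetErrorString` prints a readable message, which should make it easier to tell at which step the error is reported.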
In the code where the problem occurs, around 7000KB of local memory is used in the kernel function.
If the same work area is instead allocated in device memory beforehand and then referenced from the kernel function, the kernel executes normally on the 3.0GB card.
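The workaround described above can be sketched as follows. This is an illustration under assumptions, not the actual code: `WORK_FLOATS`, `workAreaBytes`, `kernelWithGlobalWork`, and `runWithPreallocatedWork` are hypothetical names, and the per-thread size is a made-up value. The host allocates one buffer with `cudaMalloc`, and each thread uses its own slice of it instead of a large array declared inside the kernel.

```cuda
#include <cstddef>
#include <cuda_runtime.h>

// Hypothetical per-thread work-area size in floats; the real size is not given here.
#define WORK_FLOATS 1024

// Total bytes of device memory needed for the work areas of all threads.
size_t workAreaBytes(int totalThreads)
{
    return (size_t)totalThreads * WORK_FLOATS * sizeof(float);
}

// Each thread works in its own slice of a buffer allocated by the host,
// rather than in a large array declared inside the kernel (local memory).
__global__ void kernelWithGlobalWork(float *work, float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;

    float *myWork = work + (size_t)i * WORK_FLOATS;  // this thread's slice
    for (int k = 0; k < WORK_FLOATS; ++k) myWork[k] = (float)k;

    float sum = 0.0f;
    for (int k = 0; k < WORK_FLOATS; ++k) sum += myWork[k];
    out[i] = sum;
}

// Allocates the work area up front, runs the kernel, and copies the results back.
cudaError_t runWithPreallocatedWork(float *h_out, int n)
{
    float *d_work = NULL;
    float *d_out = NULL;

    cudaError_t err = cudaMalloc((void **)&d_work, workAreaBytes(n));
    if (err != cudaSuccess) return err;
    err = cudaMalloc((void **)&d_out, n * sizeof(float));
    if (err != cudaSuccess) { cudaFree(d_work); return err; }

    const int threads = 128;
    kernelWithGlobalWork<<<(n + threads - 1) / threads, threads>>>(d_work, d_out, n);
    err = cudaGetLastError();
    if (err == cudaSuccess)  // the blocking copy also waits for the kernel to finish
        err = cudaMemcpy(h_out, d_out, n * sizeof(float), cudaMemcpyDeviceToHost);

    cudaFree(d_work);
    cudaFree(d_out);
    return err;
}
```

With this layout the size of the work area is explicit on the host side, so a failed allocation shows up at the `cudaMalloc` call rather than at the kernel launch.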
1) The only difference between the 1.5GB and 3.0GB cards seems to be the amount of memory. Why does the operation behave differently?
2) When local memory is allocated in the kernel function, is there any way to avoid ‘cudaErrorMemoryAllocation’ on the 3.0GB card?
Thank you for your help,