Using global cache with OpenCL


I have an NVIDIA GeForce 9400 GT.
I ran the code:
sizeof(global_cache_size), &global_cache_size, NULL);

The result was global_cache_size = 0.
Does that make sense?

If the card does have a cache (e.g. 1 MB), how can I use it to get better performance?

Is the cache always part of global memory?

If a core reads data from global memory, I assume that data ends up in the global cache.

But many other cores are running at the same time.

How do they all share the same global cache?

On a regular CPU, when a core reads data, the data is copied into L2 and then into L1, which is usually 32 KB.
L1 is on the CPU chip, and sometimes L2 is on the chip as well.

Where is the global cache located?