__constant__ data and multi-GPU

When declaring a global constant array, where does that array reside? I need the same constant data to be present on two GPUs, but I am not sure how to accomplish this (using the runtime API).

I would think the ideal situation would be that declaring the global constant array allocates space on both cards, and, with the device set correctly in each of multiple host threads, a call to cudaMemcpyToSymbol within each thread loads the data for that thread's device. Is this how the system works?

Constant data resides in device memory. When using multiple GPUs, you need to explicitly copy the data to both devices (using cudaMemcpyToSymbol).

I have exactly the same question as you did. Have you got the answer yet? Is it the “ideal situation” as you said or something else?

The answer was already given by Simon Green.

A single constant declaration will instantiate (i.e. allocate) space on each GPU in your system.

Data must be individually loaded to each device, however:

cudaSetDevice(0);
cudaMemcpyToSymbol(...);
cudaSetDevice(1);
cudaMemcpyToSymbol(...);

etc.
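For anyone who wants to try this out, here is a minimal sketch of the same pattern as a complete program, looping over every device from a single host thread. The names coeffs, readConstant, check and loadOnAllDevices are made up for illustration, as are the array size and the values; the read-back kernel is only there to confirm that each device really holds its own copy.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

#define N 4  // hypothetical array size

// One declaration in the source; each GPU holds its own instance of it.
__constant__ float coeffs[N];

// Copies the constant array to global memory so the host can inspect it.
__global__ void readConstant(float *out) {
    int i = threadIdx.x;
    if (i < N) out[i] = coeffs[i];
}

// Aborts with a message if a runtime API call failed.
static void check(cudaError_t err, const char *what) {
    if (err != cudaSuccess) {
        fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
        exit(EXIT_FAILURE);
    }
}

// Loads host[0..N) into coeffs on every device and reads it back.
// Returns the number of devices whose copy matched the host data.
int loadOnAllDevices(const float *host) {
    int deviceCount = 0;
    check(cudaGetDeviceCount(&deviceCount), "cudaGetDeviceCount");

    int matched = 0;
    for (int dev = 0; dev < deviceCount; ++dev) {
        // Select the device first; the copy below targets the current device only.
        check(cudaSetDevice(dev), "cudaSetDevice");
        check(cudaMemcpyToSymbol(coeffs, host, N * sizeof(float)),
              "cudaMemcpyToSymbol");

        float *dOut = nullptr;
        float back[N] = {0};
        check(cudaMalloc(&dOut, sizeof(back)), "cudaMalloc");
        readConstant<<<1, N>>>(dOut);
        check(cudaGetLastError(), "kernel launch");
        check(cudaMemcpy(back, dOut, sizeof(back), cudaMemcpyDeviceToHost),
              "cudaMemcpy");
        check(cudaFree(dOut), "cudaFree");

        bool same = true;
        for (int i = 0; i < N; ++i) same = same && (back[i] == host[i]);
        if (same) ++matched;
    }
    return matched;
}

int main() {
    const float host[N] = {1.0f, 2.0f, 3.0f, 4.0f};
    int matched = loadOnAllDevices(host);
    printf("constant data verified on %d device(s)\n", matched);
    return 0;
}
```

If the cudaMemcpyToSymbol call is skipped for one of the devices, that device's copy of coeffs should not be expected to contain the data loaded on the other one, since each copy is filled separately. The one-host-thread-per-GPU arrangement from the original question follows the same rule: each thread calls cudaSetDevice for its own GPU and then does its own cudaMemcpyToSymbol.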

Got it. Thanks a lot.