How does clCreateBuffer actually work? We don't supply a cl_device_id

clCreateBuffer takes a context as a parameter, not a device. A context can be created for a group of devices.

So, on which device is the buffer allocated? For example, if my context contains two GPUs?

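On every device in the context. The buffer belongs to the context, so every device in it can use it; when and where the memory is physically allocated is left to the implementation.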

That’s odd.

Is there a speed difference between creating one buffer in a context shared by two GPUs and creating two buffers, each in its own context (one context per GPU)?