clGetDeviceInfo() returns an unexpected reply for parameter CL_DEVICE_MAX_COMPUTE_UNITS

Hi,

When I queried my device for CL_DEVICE_MAX_COMPUTE_UNITS, I got a reply that I find strange. The returned value counts SMs rather than individual CUDA cores. My device (GT8600) has 4 SMs, each built from 8 CUDA cores. I expected to get 32, but the reply was 4.

Is this the expected behavior? Can someone confirm?

Thanks,
– Liad Weinberger.

That is the expected behavior. By definition, each work-group executes on a single compute unit, and that corresponds to an SM in NVIDIA's architecture. __local memory is shared among all work-items of a work-group, which also matches an NVIDIA SM.
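To make the arithmetic from the question concrete, here is a minimal sketch in C. The helper name `estimate_scalar_cores` is hypothetical, and the cores-per-SM figure (8 in the question) is something you have to supply yourself from the hardware documentation; as far as I know, core OpenCL has no query that reports it. The comment shows where the compute-unit count would come from in a real host program.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not part of OpenCL): estimates the number of
 * scalar cores from the compute-unit count and a cores-per-unit figure.
 *
 * In a real host program, compute_units would be obtained with:
 *
 *   cl_uint units;
 *   clGetDeviceInfo(device, CL_DEVICE_MAX_COMPUTE_UNITS,
 *                   sizeof(units), &units, NULL);
 *
 * cores_per_unit is an assumption supplied by the caller (for example,
 * 8 for the device described in the question); OpenCL does not report it.
 */
uint32_t estimate_scalar_cores(uint32_t compute_units, uint32_t cores_per_unit)
{
    return compute_units * cores_per_unit;
}
```

With the numbers from the question, `estimate_scalar_cores(4, 8)` gives 32, while the OpenCL query itself reports only the 4 compute units.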


Got it! Thank you for the confirmation.
