Hi! I’m looking to write something of the form:
//Nvidia specific code here
//Code for other devices
I know that at the very least Apple and AMD have vendor #defines that would let me do this, but I need to be able to do this for Nvidia devices.
The code segment in question is an atomic vector store and the corresponding atomic vector load. On most devices, a simple vstore or vload is atomic for float4, but Nvidia is an exception and requires coalescing across blocks of threads, hence the special-case code.
So, what’s the magical define to detect Nvidia?
I’m also interested in this. However, if you know that AMD and Apple have one, then in the meantime you could do:
That way it will choose the appropriate vendor code for you. But if someone knows a more elegant way of doing this (one that does not assume the existence of exactly 3 OCL implementations), then I would also like to know.
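One vendor-neutral option, offered only as a sketch, is to decide on the host and hand the kernel its own define. The host would read the vendor string with clGetDeviceInfo and CL_DEVICE_VENDOR, then pass a -D option to clBuildProgram. The helper below covers just the string-to-option step, so it compiles without the OpenCL headers. The function name, the VENDOR_* macro names, and the exact vendor substrings it matches are assumptions for illustration, not anything guaranteed by a driver.

```c
#include <string.h>

/* Map a vendor string (as a host might obtain it from clGetDeviceInfo with
 * CL_DEVICE_VENDOR) to a compiler option that defines a vendor macro.
 * The VENDOR_* names are hypothetical; the substrings matched here are
 * assumptions about what drivers report, so anything unrecognised falls
 * through to VENDOR_OTHER. */
const char *vendor_build_option(const char *vendor)
{
    if (vendor == NULL)
        return "-DVENDOR_OTHER";
    if (strstr(vendor, "NVIDIA") != NULL)
        return "-DVENDOR_NVIDIA";
    if (strstr(vendor, "Advanced Micro Devices") != NULL ||
        strstr(vendor, "AMD") != NULL)
        return "-DVENDOR_AMD";
    if (strstr(vendor, "Apple") != NULL)
        return "-DVENDOR_APPLE";
    return "-DVENDOR_OTHER";
}
```

The returned string would go in the options argument of clBuildProgram, and the kernel source would then test #ifdef VENDOR_NVIDIA around the special-case load and store. If one program is built for devices from several vendors, it would need a separate build per device for this to work.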
On Windows, searching nvcompiler.dll for plain text strings reveals some interesting defines that can be used to write conditional code. In my case, I’m using “#ifdef __NV_CL_C_VERSION” to check for NVIDIA’s OpenCL implementation (and “#ifdef _AMD_OPENCL” to check for AMD’s).
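For illustration, here is roughly how those two macros could be chained with a generic fallback. This is a minimal sketch written as plain C so the preprocessor logic can be checked on any compiler; the VENDOR_CODE_PATH name and its numeric values are made up for the example. Since the macros were found by searching a DLL rather than in documentation, it seems safest to assume they could change between driver versions and to always keep the #else branch.

```c
/* Select a code path from the vendor macros mentioned above.
 * VENDOR_CODE_PATH and its values are hypothetical names for this sketch. */
#if defined(__NV_CL_C_VERSION)
#define VENDOR_CODE_PATH 1   /* NVIDIA's OpenCL compiler */
#elif defined(_AMD_OPENCL)
#define VENDOR_CODE_PATH 2   /* AMD's OpenCL compiler */
#else
#define VENDOR_CODE_PATH 0   /* anything else: generic path */
#endif

/* Returns the path chosen at compile time. On an ordinary host C compiler
 * neither vendor macro is defined, so this yields the generic value 0. */
int vendor_code_path(void)
{
    return VENDOR_CODE_PATH;
}
```

In a real kernel the same #if chain would wrap the Nvidia-specific load and store on one side and the plain vload and vstore on the other.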