Are there any differences in memory coalescing between the C runtime for CUDA and the OpenCL API? I have been reading the documentation, but I can’t see any.
Thanks in advance.
No, there aren’t. Despite the different front ends (the OpenCL compiler is based on Clang, C for CUDA on Open64), on NVIDIA hardware both generate PTX intermediate code that is processed by the same driver back end, so the same coalescing rules apply to both.
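To make that concrete: coalescing depends only on which addresses the threads of a warp touch, not on the API that launched the kernel. In a 1D launch, CUDA's `blockIdx.x * blockDim.x + threadIdx.x` and OpenCL's `get_global_id(0)` (with no global offset) give the same index, so the same access pattern produces the same memory traffic. Here is a rough host-side sketch of that idea in plain C. The function name `segments_touched`, the 32-thread warp, and the 128-byte segment size are assumptions for illustration only; the real transaction rules depend on the GPU generation.

```c
/* Simplified model, not the exact hardware rule: count how many aligned
   memory segments one warp touches for a given access stride. */

#define WARP_SIZE 32       /* assumed number of threads issuing a request together */
#define SEGMENT_BYTES 128  /* assumed transaction size; varies by GPU generation */

/* Thread t of the warp reads a 4-byte float at element index t * stride.
   Addresses increase with t (stride >= 1), so counting segment changes
   gives the number of distinct segments touched. */
int segments_touched(int stride)
{
    int count = 0;
    long last_segment = -1;

    for (int t = 0; t < WARP_SIZE; ++t) {
        long byte_addr = (long)t * stride * 4;    /* 4 bytes per float */
        long segment = byte_addr / SEGMENT_BYTES; /* aligned segment index */
        if (segment != last_segment) {
            ++count;
            last_segment = segment;
        }
    }
    return count;
}
```

Under this model, `segments_touched(1)` returns 1 (fully coalesced), `segments_touched(2)` returns 2, and `segments_touched(32)` returns 32 (every thread hits its own segment). The result is the same whether the index came from a CUDA kernel or an OpenCL kernel.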