CUDA vs. OpenGL

I have a question.

If I implement matrix multiplication once in CUDA and once in OpenGL, which implementation will perform better?

If the OpenGL implementation performs better, could you please explain why?

What matters is which features CUDA and OpenGL each expose, nothing else. A program that doesn’t depend on any feature specific to one of them will run at about the same speed in both. It might even be faster in OpenGL, because the OpenGL compiler and driver are much more mature.

There was a thread just now where someone wanted to sample a rotated 3D cube. OpenGL will be much faster in that case because it exposes a feature of the graphics card, 3D textures, that is not exposed in CUDA. A lot of other graphics-centric things, like complex filtering modes, antialiasing, and lossily compressed memory, are also missing in CUDA. But for most other tasks, CUDA’s functionality will be much more relevant.

However, none of the above is relevant in most cases, because the result most often comes down to chance, especially for beginner programmers. Only if the programmer is skilled enough to consistently pick the best algorithm will the inherent differences between CUDA and OpenGL actually be the dominant factors. Then again, CUDA’s reduced complexity probably leaves more time (and more room in the head) for optimizing. That is its main point, after all. That, and ordinary, arbitrary access to memory.
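To make the “arbitrary access to memory” point concrete, here is a minimal sketch of a scatter write in CUDA: each thread stores its value at an address computed at run time. As far as I know, a classic fragment shader can only write to the pixel it was invoked for, so this pattern needs workarounds there. The names `scatter` and `scatter_gpu` are made up for illustration, error checking is left out, and the sketch assumes `dst` is a permutation of 0..n-1 so that no two threads write the same slot.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Each thread writes its input element to a destination chosen at run time.
// Assumption: dst holds a permutation of 0..n-1, so writes never collide.
__global__ void scatter(const float* in, const int* dst, float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[dst[i]] = in[i];
}

// Hypothetical host helper: upload, launch, download. No error checking.
std::vector<float> scatter_gpu(const std::vector<float>& in,
                               const std::vector<int>& dst) {
    int n = static_cast<int>(in.size());
    float *d_in, *d_out;
    int *d_dst;
    cudaMalloc(&d_in, n * sizeof(float));
    cudaMalloc(&d_out, n * sizeof(float));
    cudaMalloc(&d_dst, n * sizeof(int));
    cudaMemcpy(d_in, in.data(), n * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(d_dst, dst.data(), n * sizeof(int), cudaMemcpyHostToDevice);

    int threads = 256;                          // assumed block size
    int blocks = (n + threads - 1) / threads;   // enough blocks to cover n
    scatter<<<blocks, threads>>>(d_in, d_dst, d_out, n);

    std::vector<float> out(n);
    cudaMemcpy(out.data(), d_out, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_in); cudaFree(d_out); cudaFree(d_dst);
    return out;
}
```

With input `{10, 20, 30, 40}` and destinations `{2, 0, 3, 1}`, element 0 lands in slot 2, element 1 in slot 0, and so on.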

As for matrix multiplication… I’d say it’s probably better in CUDA, although DirectX 10 (which I’ll assume is equivalent to OpenGL here) is also a contender, because it too has a form of shared memory plus many extra memory-reuse tricks that didn’t make it into CUDA.
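For reference, this is roughly what the shared-memory approach looks like on the CUDA side. It is only a sketch of the usual tiled scheme, not a tuned implementation: it assumes square, row-major n x n matrices, the tile width `TILE = 16` is an arbitrary choice, the names `matmul_tiled` and `matmul_gpu` are made up for illustration, and error checking is left out. Each thread block stages one tile of A and one tile of B in shared memory, so every value loaded from global memory is reused `TILE` times.

```cuda
#include <cuda_runtime.h>
#include <vector>

constexpr int TILE = 16;  // assumed tile width; one block computes a TILE x TILE patch of C

// C = A * B for square, row-major n x n matrices.
__global__ void matmul_tiled(const float* A, const float* B, float* C, int n) {
    __shared__ float As[TILE][TILE];
    __shared__ float Bs[TILE][TILE];

    int row = blockIdx.y * TILE + threadIdx.y;
    int col = blockIdx.x * TILE + threadIdx.x;
    float acc = 0.0f;

    for (int t = 0; t < (n + TILE - 1) / TILE; ++t) {
        int a_col = t * TILE + threadIdx.x;
        int b_row = t * TILE + threadIdx.y;
        // Stage one tile of A and one of B, padding with zeros past the matrix edge.
        As[threadIdx.y][threadIdx.x] = (row < n && a_col < n) ? A[row * n + a_col] : 0.0f;
        Bs[threadIdx.y][threadIdx.x] = (b_row < n && col < n) ? B[b_row * n + col] : 0.0f;
        __syncthreads();  // tiles must be fully loaded before anyone reads them

        for (int k = 0; k < TILE; ++k)
            acc += As[threadIdx.y][k] * Bs[k][threadIdx.x];
        __syncthreads();  // finish reading before the next iteration overwrites the tiles
    }
    if (row < n && col < n) C[row * n + col] = acc;
}

// Hypothetical host helper: upload, launch, download. No error checking.
std::vector<float> matmul_gpu(const std::vector<float>& A,
                              const std::vector<float>& B, int n) {
    size_t bytes = static_cast<size_t>(n) * n * sizeof(float);
    float *dA, *dB, *dC;
    cudaMalloc(&dA, bytes);
    cudaMalloc(&dB, bytes);
    cudaMalloc(&dC, bytes);
    cudaMemcpy(dA, A.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dB, B.data(), bytes, cudaMemcpyHostToDevice);

    dim3 block(TILE, TILE);
    dim3 grid((n + TILE - 1) / TILE, (n + TILE - 1) / TILE);
    matmul_tiled<<<grid, block>>>(dA, dB, dC, n);

    std::vector<float> C(static_cast<size_t>(n) * n);
    cudaMemcpy(C.data(), dC, bytes, cudaMemcpyDeviceToHost);
    cudaFree(dA); cudaFree(dB); cudaFree(dC);
    return C;
}
```

Whether this beats a shader version on a given card is exactly the kind of thing you have to measure; the sketch only shows where the memory reuse comes from.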