OpenCL - hmm... not so interesting. What is your take on it?

Sarnath · February 16, 2009, 1:14pm

All,

I have been reading through the OpenCL spec. The spec is ‘C’ oriented, as it says itself.

It would have been far better if they had made the spec OOP oriented.
After all, they just want to abstract the heterogeneity of compute devices by exposing them as homogeneous compute components.
I feel OOP would have done a better job…

Also, the spec suddenly starts talking about “image_t”. This deviates completely from its original purpose.
They should have moved it to an appendix.

I really think the spec is clumsy and overall irritating. What do you guys think?

I think Microsoft is going to release a .NET based compute language and totally disregard this OpenCL.
Microsoft has promised GPU acceleration for Windows 7. It would be interesting to see what they do.
Do you have any Microsoft links on this?

Best Regards,
Sarnath

kristleifur · February 16, 2009, 1:35pm

Well, you have to get from machine code to OOP somehow. I predict a nice Apple-style layer of OOP niceties on top. This way you could take any different OOP route, be it ObjC or Java or .NET or C++ or whatever.

Not that I’ve read the spec yet … but yeah, you know what I mean.

Excellent topic though!

_Big_Mac · February 16, 2009, 3:23pm

http://news.cnet.com/8301-13924_3-10119098-64.html

On the diagram there you can see that there’s a PTX layer beneath NVIDIA’s implementation of OpenCL. This has sparked a thought - could it be possible to write code basically in CUDA and have it compile to something like an “OpenCL PTX” that would be portable to ATI/Cell/Larrabee etc.? I understand this would need to be limited to using only a subset of “real” CUDA (targeted for NVIDIA GPUs).

That or some other high-level wrapping, I’m not a big fan of low-level coding and I imagine a lot of non-CS scientists aren’t either.

kristleifur · February 16, 2009, 3:50pm

I’m a CS scientist I guess - and even I don’t like to be slogging through a lot of low-level code if I don’t need to - I’d like the code to get out of the way of my brilliant CS ideas :)

SPWorley · February 16, 2009, 4:01pm

I’m mostly bummed about the lack of templates in OpenCL. I’ve become extremely fond of them as an easy way to create kernels tuned to specific tasks or profiles, especially in libraries. Mark Harris also uses them this way a lot in his code (look at his SDK examples, or CUDPP).

OpenCL is back to the days of #define macros and cut-and-pasted code splatted everywhere.
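To make the templates-versus-macros point concrete, here is a minimal host-side C++ sketch (not OpenCL or CUDA kernel code). The names `DEFINE_SUM_UNROLLED`, `sum_unrolled_4`, and `sum_unrolled` are made up for illustration; the idea is only to show how one tuned variant per unroll factor looks with each approach.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Macro route: the function body is pasted in textually for each variant.
// Every new unroll factor (or element type) needs another macro expansion,
// and mistakes only show up once the expanded text is compiled.
#define DEFINE_SUM_UNROLLED(NAME, UNROLL)                        \
    float NAME(const std::vector<float>& v) {                    \
        float acc = 0.0f;                                        \
        std::size_t i = 0;                                       \
        for (; i + (UNROLL) <= v.size(); i += (UNROLL))          \
            for (std::size_t k = 0; k < (UNROLL); ++k)           \
                acc += v[i + k];                                 \
        for (; i < v.size(); ++i) acc += v[i];                   \
        return acc;                                              \
    }

DEFINE_SUM_UNROLLED(sum_unrolled_4, 4)

// Template route: one definition, specialized by the compiler on demand.
// The unroll factor is a compile-time constant, so the inner loop can be
// fully unrolled, and the element type is a parameter as well.
template <typename T, std::size_t Unroll>
T sum_unrolled(const std::vector<T>& v) {
    static_assert(Unroll > 0, "Unroll must be positive");
    T acc = T(0);
    std::size_t i = 0;
    for (; i + Unroll <= v.size(); i += Unroll)
        for (std::size_t k = 0; k < Unroll; ++k)
            acc += v[i + k];
    for (; i < v.size(); ++i) acc += v[i];  // leftover elements
    return acc;
}

// Self-check: both routes add in the same order, so they must agree exactly.
inline void self_check(const std::vector<float>& v) {
    assert(sum_unrolled_4(v) == (sum_unrolled<float, 4>(v)));
}
```

With the template, `sum_unrolled<float, 8>` or `sum_unrolled<int, 2>` costs nothing extra to write, whereas the macro version needs a new `DEFINE_SUM_UNROLLED(...)` line (and a new macro for each element type).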

kristleifur · February 16, 2009, 11:55pm

?! No templates ?!

They better have good reasons …

Sarnath · February 17, 2009, 5:11am

Thanks for all your replies.

There is no C++ and hence no templates.

I learnt C++ a few months back and then learnt C#, and I can see how programming is evolving…

I became a huge MS fan in a span of 3 months.

Now, OpenCL looks to be taking us a step backwards…

OpenCL claims to be the fastest-evolved spec… Probably that’s the problem. People were in a hurry to get something out that satisfies everyone. Microsoft did NOT participate in OpenCL. I guess they will do something on their own. It would be good to see what they come up with.

Best Regards,

Sarnath

E.D_Riedijk · February 17, 2009, 6:02am

Whatever they come up with, it will not be portable, so it will not have the benefit of OpenCL. The (only) benefit of OpenCL is that it will run on more devices, not just GPUs. That is where the real strength of OpenCL might be. Also, as far as I understood, you do not specify grid & block sizes. Those are determined by OpenCL. That might also be a reason for not supporting templates.

Sarnath · February 17, 2009, 6:06am

OpenGL and Direct3D survive together. http://en.wikipedia.org/wiki/Comparison_of…GL_and_Direct3D

OpenGL is an open, ‘C’ based API.

Direct3D is Microsoft’s API.

I think the implicit way is the AMD way (where the number of threads spawned is determined by the AMD drivers).

The explicit one is the CUDA way!

Just my inference from the statement above and my peripheral understanding of AMD Stream.
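A rough sketch of the explicit-versus-implicit distinction discussed above, written as plain C++ arithmetic rather than real CUDA, OpenCL, or AMD API calls. `LaunchConfig`, `explicit_config`, `implicit_config`, and the block size of 64 picked by the pretend runtime are all assumptions for illustration only.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical description of a launch: how many blocks, how many threads each.
struct LaunchConfig {
    std::size_t blocks;             // number of thread blocks (the "grid" size)
    std::size_t threads_per_block;  // threads in each block
};

// Explicit style: the caller chooses the block size, and the grid size is
// the problem size divided by it, rounded up so every element is covered.
inline LaunchConfig explicit_config(std::size_t n, std::size_t threads_per_block) {
    assert(threads_per_block > 0);
    return {(n + threads_per_block - 1) / threads_per_block, threads_per_block};
}

// Implicit style: the caller passes only the problem size and a stand-in
// "runtime" picks the block size. The value 64 is an arbitrary placeholder,
// not what any real driver does.
inline LaunchConfig implicit_config(std::size_t n) {
    const std::size_t runtime_choice = 64;
    return explicit_config(n, runtime_choice);
}
```

For example, `explicit_config(1000, 256)` gives 4 blocks of 256 threads, so the last block has some idle threads and a kernel written in this style needs a bounds check on the element index.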

erdooom · February 17, 2009, 7:58am

OpenCL is a bit behind CUDA, but I’m sure it will get there. I have been using the CUDA driver API for a while and must say that I now prefer it to the runtime API. For production products that need to be integrated into larger systems, it is more convenient (at least for me). The only thing I’m missing is the emulator.

Usually, when you are trying to squeeze every last drop of performance from your hardware, you want as low-level access as you can get, which is always counterbalanced by the fact that you want something more efficient than programming in byte code. I think both CUDA and OpenCL hit the sweet spot. ATI had Brook+, which was too far away from the hardware, so you couldn’t get the performance you needed, or CTM, which is basically programming in assembler (or PTX…), which isn’t much fun.

For compute languages to work there has to be one language that works on all hardware, or else very few will use it. NVIDIA is very aware of this and I guess that’s why they are fully supporting OpenCL. For my company, using CUDA isn’t a problem since we have control over the hardware that our customers use. But for most products portability will be critical, and thus most will use OpenCL in a year or 2. And of course you will have abstraction layers and all kinds of neat tools, but, like OpenGL today, if you are going to do heavy optimization then you have to get down to the lowest level.

Sarnath · February 17, 2009, 8:54am

Thanks for your note.

By the way,

I read in some computer book (“Zen of Graphics Programming”?): “The best optimizer is between your ears”.

Unless you are using the most optimal parallel algorithm, there is no point in diving into low-level details for performance.

We had this experience. We were getting around 70x to 80x speedup with one algorithm. Then we worked on designing an entirely new one, and that one gave us 120x to 220x at peak… We did not even have to get our hands on decuda.

Just FYI.

Sarnath · February 17, 2009, 12:35pm

http://research.microsoft.com/en-us/projects/Accelerator/

Has anyone tried “Accelerator” from Microsoft?

It is a .NET based data-parallel library that can use the GPU transparently (via DirectX) to accelerate your application.
The application does NOT need to know anything about the GPU.

This was released way back in 2007 though.

And it achieves less than 50% of what native code achieves.

Not sure if Microsoft will pursue this standard though…

kristleifur · February 17, 2009, 1:21pm

If that is their reasoning, then IMO it is reaching into backwardness.

Templates are not really a fixture of C++, not in spirit, but rather a way of expressing ideas much more cleanly than macros, and it’s not that much more complicated for a compiler, nor as a binary runtime if carefully executed.

erdooom · February 19, 2009, 1:19pm

@sarnath: I agree, we initially had a very low speedup, and only after rewriting parts of our algorithm did we get a significant speedup. But that approach would have been impossible with higher-level access (trust me, I checked it out), for example with Brook+. But again, in that sense there is very little difference between CUDA and OpenCL; it’s just that it will take a while for OpenCL to catch up. Hopefully the fact that it’s an open standard and has a committee won’t hinder its progress (like what happened to OpenGL).

Sarnath · February 20, 2009, 4:18am

I see.

Were you using AMD Stream (since you mention Brook+) or CUDA?

I skimmed through AMD Stream and read through the Brook+ spec (a very informal spec). Somehow, I am not convinced about their design. It is all the more confusing and comes nowhere near CUDA. CUDA is much more elegant by design. Maybe, since I did not read deeply, my understanding is very peripheral. Do you have any experience with AMD Stream? How did you find it?

erdooom · February 20, 2009, 7:20am

I started with both (and actually also looked into the Cell), and in the end we decided to go with CUDA. I agree with what you said about Brook+. I guess AMD pretty much thinks so as well, since it stopped developing it and is now going with OpenCL.

Sarnath · February 20, 2009, 7:27am

Similar case here too. We looked into Cell and CUDA and zeroed in on CUDA. Cell is too costly (IBM Cell blade) for the performance it offers, and it runs only Linux. Though there are PCI-E based Cell accelerators from Mercury Systems, they are priced somewhere around 7000 USD, as I vaguely remember.

I don’t think AMD gave up on Stream. It was lying unattended for some time… but then they woke up, released CAL (Compute Abstraction Layer), did a press release (AMD does this one well) and promoted it, as I understand. But as I read through the AMD spec, I did not find the design appealing. It does not abstract the graphics hardware as well as CUDA does.
