Will PGI CUDA for x86 ship as part of a future toolkit?

With NVIDIA acquiring PGI, it’s not a wild guess that they’ve been planning to make both OpenACC and their CUDA for x86 compiler available via the toolkit.

What do you think?

While that would be nice, it’s debatable whether they’d kill the revenue stream coming from their acquisition, unless I’m missing something. Maybe they’ll ship a ‘crippled’ version with the toolkit, one that lacks certain functionality?

I don’t see CUDA for x86 going free, but we really could use at least one free implementation of OpenACC.

I see your point.

But maybe OpenACC and CUDA for x86 aren’t their largest revenue streams?

And maybe acquiring PGI was more about their compiler know-how than their current products?

Btw, this is a pure speculation thread, if anyone hasn’t noticed by now :)

Hah, no kidding. WARNING: Armchair corporate strategizing here! :)

I’m actually not sure what the long-term strategy for OpenACC is. It follows the same style as OpenMP, relying on #pragma directives to give the compiler the extra information it needs to generate efficient code. I don’t know if there is any plan (or if it even makes sense) to try to fuse the standards, given the different programming models of multicore CPUs and data-parallel accelerators.
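For anyone who hasn’t seen the two side by side, here is a rough sketch of what “same style” means. This is my own illustration, not taken from PGI or NVIDIA material: the function names saxpy_omp and saxpy_acc are made up, and the data clauses on the OpenACC line are just one plausible way to describe the transfers.

```c
/* Illustrative sketch only: the same SAXPY loop annotated two ways.
 * Function names are hypothetical; a compiler without OpenMP or OpenACC
 * support ignores the unknown pragmas and runs both loops serially. */

/* OpenMP style: ask the compiler to split the iterations across CPU threads. */
void saxpy_omp(int n, float a, const float *x, float *y)
{
    #pragma omp parallel for
    for (int i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}

/* OpenACC style: ask the compiler to offload the loop to an accelerator,
 * and state which arrays are copied to the device and which are copied back. */
void saxpy_acc(int n, float a, const float *x, float *y)
{
    #pragma acc parallel loop copyin(x[0:n]) copy(y[0:n])
    for (int i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}
```

The loop bodies are identical; only the directive changes. The visible difference is the data clauses, which is roughly where the programming models diverge: the accelerator version has to say what moves between host and device memory. Building with something like gcc -fopenmp or pgcc -acc should activate the respective directive; without those flags the code still compiles and gives the same result serially.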

Certainly, PGI is a win for NVIDIA’s compiler group, regardless of what they decide to do with PGI’s products.

From what I understand, OpenMP 4.0 will incorporate a lot of the directives from OpenACC:

“OpenMP, the popular parallel programming standard for high performance computing, is about to come out with a new version incorporating a number of enhancements, the most significant one being support for HPC accelerators. Version 4.0 will include the functionality that was implemented in OpenACC”

“OpenMP evangelist Tim Mattson says the emerging OpenMP accelerator standard is more or less a superset of the OpenACC API”

I’ll be really curious to see future performance comparisons between K20X and Xeon Phi for the same OpenMP codes. I’m guessing both will require tweaking to get the best performance out of each architecture.

Ah, that’s nice to see. Hopefully the free OpenMP implementations will start using these directives soon…