Most important from GTC: CUDA on x86, hello emulation mode

Lev · September 22, 2010, 9:17pm

Breaking News: Jen-Hsun Announces CUDA for x86 Architecture - BSN*: http://www.brightsideofnews.com/news/2010/...chitecture.aspx

cbuchner1 · September 22, 2010, 11:47pm

According to Heise News, there will be a commercial compiler for CUDA code by Portland Group (PGI) available (or officially introduced) starting November 13th. Do you really think x86 CUDA will become part of the toolkit? I doubt that.

http://www.heise.de/newsticker/meldung/GTC…ig-1083447.html

E.D_Riedijk · September 23, 2010, 7:35am

This is also not an emulation mode like the one we had before. It is a conversion from CUDA C to x86 machine code.

eyalhir74 · September 23, 2010, 8:54am

I’ve recently moved to 3.2… I just don’t have enough words to say how much I miss emulation mode…

it was just perfect :)

eyal

Lev · September 23, 2010, 11:56am

So your CUDA program runs the same way on x86, and it seems you can debug it there too. You compile your code to x86, which sounds better than emulation mode because it is faster and more precise.

Lev · September 23, 2010, 12:00pm

Maybe it will have a free licence for debugging purposes, i.e. if you do not distribute your program with it and just use it for debugging. I think that is a good solution. Those who want their CUDA program to run on x86 can buy a licence.

Ken_Domino · September 23, 2010, 12:42pm

Are there any details known yet? For example, are they going to “loosely” integrate an x86 processor into the “GPU” (not sure what it will be called then), having direct access to the Interconnection Network along with the TPCs? Or are they going to tightly integrate x86 by somehow replacing the SPs in the TPCs with x86-like stream processors, replacing or adding to PTX with x86?

E.D_Riedijk · September 23, 2010, 12:49pm

No, they will just compile CUDA code to x86 binaries. It will not run on a GPU, but on the CPU.

seibert · September 23, 2010, 1:50pm

I agree with E.D. Riedijk. This is almost certainly a commercially supported compiler that does what many other academic projects have been dabbling in for years: Take CUDA source code and generate multithreaded SSE x86 code. If done well, I bet a lot of people would find that CUDA on x86 is faster than even their normal CPU implementations. (Because most compilers are terrible at generating SSE instructions from all but the simplest C code and most of us are terrible at writing SSE by hand.)
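
To make the "take CUDA source and generate x86 code" idea concrete, here is a minimal hand-written sketch in C++ of how a trivial kernel could be lowered to CPU loops. Nothing in this thread says how the PGI compiler actually works, so treat this purely as an illustration. The `Dim` struct, `saxpy_body`, and `saxpy_on_cpu` are made-up names, and the sketch ignores `__syncthreads()`, shared memory, and multithreading.

```cpp
#include <cstddef>

// Hypothetical stand-in for CUDA's built-in index types (1-D only).
struct Dim { unsigned x; };

// Body of a CUDA kernel such as
//   __global__ void saxpy(int n, float a, const float* x, float* y)
// rewritten as a plain function that takes its indices as arguments.
inline void saxpy_body(Dim blockIdx, Dim blockDim, Dim threadIdx,
                       std::size_t n, float a, const float* x, float* y) {
    std::size_t i = static_cast<std::size_t>(blockIdx.x) * blockDim.x + threadIdx.x;
    if (i < n) y[i] = a * x[i] + y[i];
}

// CPU "launch": the grid becomes an outer loop and the block an inner loop.
// The inner loop touches consecutive elements, so it is the natural
// candidate for SSE vectorization.
void saxpy_on_cpu(unsigned grid_size, unsigned block_size,
                  std::size_t n, float a, const float* x, float* y) {
    for (unsigned b = 0; b < grid_size; ++b)
        for (unsigned t = 0; t < block_size; ++t)
            saxpy_body(Dim{b}, Dim{block_size}, Dim{t}, n, a, x, y);
}
```

Calling `saxpy_on_cpu(3, 4, 10, a, x, y)` plays the role of `saxpy<<<3, 4>>>(10, a, x, y)`. A kernel that uses barriers would need the thread loop split at each barrier, which is where a real translator has to do the hard work.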

E.D_Riedijk · September 23, 2010, 3:11pm

And it might make OpenCL a lot less attractive from a hybrid computing perspective. It will be very interesting to see the performance difference between OpenCL on multicore processors and CUDA code compiled with this compiler, and also the amount of tweaking required for both.

SPWorley · September 23, 2010, 3:54pm

I wonder if it uses a warp size of 4 or of 32.

I also hope it’s very CPU locality aware, trying to keep threads from the same block running on the same physical CPU to improve cache coherence. That gets tricky when you’re creating and destroying new blocks all the time.
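
As a rough sketch of both points, assuming a "CPU warp" of 4 floats (one 128-bit SSE register) and a simple static split of blocks across workers: the C++ below gives each worker a contiguous range of block indices and walks each block in groups of 4 threads. The names `kLanes`, `blocks_for_worker`, and `run_block` are hypothetical, no real threads are spawned or pinned here, and how the actual compiler schedules blocks is not known from this thread.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>

// Assumption: one "CPU warp" is 4 floats wide, matching a 128-bit SSE register.
constexpr std::size_t kLanes = 4;

// Contiguous range [first, last) of block indices owned by one worker.
// A static split like this keeps neighbouring blocks on the same core,
// at the cost of doing no load balancing.
std::pair<std::size_t, std::size_t> blocks_for_worker(std::size_t worker,
                                                      std::size_t num_workers,
                                                      std::size_t num_blocks) {
    std::size_t per = num_blocks / num_workers;
    std::size_t extra = num_blocks % num_workers;  // first `extra` workers get one more
    std::size_t first = worker * per + std::min(worker, extra);
    std::size_t last = first + per + (worker < extra ? 1 : 0);
    return {first, last};
}

// Run one block of a scale kernel (y[i] = a * x[i]) in groups of kLanes threads.
void run_block(std::size_t block, std::size_t block_size,
               std::size_t n, float a, const float* x, float* y) {
    std::size_t base = block * block_size;
    for (std::size_t warp = 0; warp < block_size; warp += kLanes) {
        // Fixed-width inner loop: a candidate for a single SSE multiply.
        for (std::size_t lane = 0; lane < kLanes; ++lane) {
            std::size_t t = warp + lane;
            std::size_t i = base + t;
            if (t < block_size && i < n) y[i] = a * x[i];
        }
    }
}
```

Each worker would loop over its own `[first, last)` range and call `run_block` for every block in it. With 10 blocks and 3 workers the ranges come out as 0 to 4, 4 to 7, and 7 to 10. A wider warp of 32 would just mean a larger `kLanes`, with the compiler issuing several SSE instructions per group.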