CPU Support?

Are there any plans to support running CUDA programs on multi-core CPUs? I heard about it a year ago or so. Or does anybody have experience with MCUDA (http://impact.crhc.illinois.edu/publications.php)?

Background:
I’ll start my diploma thesis next month. The aim is to extend a low-level streaming image processing library/framework to support GPUs and multi-core CPUs. So it would be very cool to write the filters only once for CUDA and get the CPU version for free.
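To make the "write the filter once" idea concrete, here is a minimal sketch of one common pattern: put the per-pixel math in a function marked `__host__ __device__` and call it from both a CUDA kernel and a plain CPU loop. The names `HOST_DEVICE`, `invert_pixel`, `invert_kernel` and `invert_cpu` are made up for illustration, and this only shares the per-pixel code; it is not automatic CUDA-to-CPU translation.

```cpp
#include <cstddef>
#include <vector>

// Under nvcc, mark the function for both host and device.
// With a plain C++ compiler the qualifiers expand to nothing.
#ifdef __CUDACC__
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

// Hypothetical per-pixel filter: invert an 8-bit grayscale value.
HOST_DEVICE inline unsigned char invert_pixel(unsigned char v) {
    return static_cast<unsigned char>(255 - v);
}

#ifdef __CUDACC__
// GPU path: one thread per pixel, calling the shared filter function.
__global__ void invert_kernel(const unsigned char* in, unsigned char* out,
                              std::size_t n) {
    std::size_t i = blockIdx.x * static_cast<std::size_t>(blockDim.x) + threadIdx.x;
    if (i < n) out[i] = invert_pixel(in[i]);
}
#endif

// CPU path: the same filter function driven by an ordinary loop.
// (The loop could be split across cores, e.g. with OpenMP; not shown here.)
std::vector<unsigned char> invert_cpu(const std::vector<unsigned char>& in) {
    std::vector<unsigned char> out(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        out[i] = invert_pixel(in[i]);
    }
    return out;
}
```

This works best for filters that are independent per pixel. Kernels that rely on shared memory or `__syncthreads()` need more than a shared function.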

Hi!

Perhaps OpenCL is what you’re looking for.

Regards

Navier

Yes, OpenCL looks very interesting. The problem I see is that there is no implementation for GPUs or even CPUs out there right now. Or does anyone have experience with the NVIDIA implementation?

Maybe the best way is to use CUDA first with OpenCL in mind, and port the library to OpenCL once implementations are available?

Indeed there’s an Early Access Program for OpenCL.

Hi

I think CUDA emulation mode works well on multi-core CPUs, and you can use it as the CPU version.

Actually the device emulation is quite fantastically slow. Seriously.

Nobody knows, and NVIDIA is not saying anything about it. I recently talked to a student who worked on this at NVIDIA as an intern, and even he doesn’t have any up-to-date news. We know it’s not in CUDA 2.2; that is about the most concrete thing we do know.

Probably: with OpenCL around, it would be wasted effort to do this in CUDA. When the CPU vendors are doing it themselves, why should they even bother?

Device emulation will work about as fast as any other CPU implementation (provided the thread-spawning overhead is NOT comparable to the thread execution time). That is not the main problem.

Device emulation can produce INCORRECT results. It is not a full emulation of hardware. So, be aware. It should be used as a debugging tool and nothing more.

Try MCUDA :)

Have you ever tried device emulation? It is an order of magnitude or two slower than the simplest unoptimized C code loop to do the same algorithm. Device emulation is just for debugging.
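For comparison, this is roughly what "a plain loop doing the same algorithm" looks like when you keep the CUDA-style indexing: the grid and block become two nested loops around the kernel body. It is a hand-written sketch, not what device emulation or MCUDA actually generate, and the names (`Dim`, `scale_body`, `launch_serial`, `scale_cpu`) are hypothetical. It assumes a kernel with no `__syncthreads()` and no shared memory.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for CUDA's built-in 1D index variables.
struct Dim { unsigned x; };

// Kernel body written against explicit indices instead of the built-ins.
inline void scale_body(Dim block_idx, Dim block_dim, Dim thread_idx,
                       const float* in, float* out, std::size_t n, float factor) {
    std::size_t i = static_cast<std::size_t>(block_idx.x) * block_dim.x + thread_idx.x;
    if (i < n) out[i] = in[i] * factor;  // same bounds check as on the GPU
}

// Serial "launch": two nested loops replace the grid and the block.
inline void launch_serial(unsigned grid_dim, unsigned block_dim,
                          const float* in, float* out, std::size_t n, float factor) {
    for (unsigned b = 0; b < grid_dim; ++b) {
        for (unsigned t = 0; t < block_dim; ++t) {
            scale_body(Dim{b}, Dim{block_dim}, Dim{t}, in, out, n, factor);
        }
    }
}

// Convenience wrapper that picks a block size and rounds the grid up.
std::vector<float> scale_cpu(const std::vector<float>& in, float factor) {
    const unsigned block = 256;
    const unsigned grid = static_cast<unsigned>((in.size() + block - 1) / block);
    std::vector<float> out(in.size());
    launch_serial(grid, block, in.data(), out.data(), in.size(), factor);
    return out;
}
```

No host threads are created per CUDA thread here, so there is no per-thread spawning or scheduling cost. The outer loop over blocks is also the natural place to split work across cores.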

I have used it, but never profiled it. Technically, I did not see this as a possibility. Intriguing… Sorry for giving wrong info.

Thanks for the answers. I think I’ll take a closer look at OpenCL; it looks much like CUDA.

Yeah, on your single-core.

I have already apologized for this wrong info. Check above.

Sorry, I did not see your last sentence.