PhysX and Concurrent kernels in CUDA

(OK, this is another “wishful thinking” post which may not get an answer, but would be interesting if it did.)

I saw an article this morning saying that PhysX is being ported to CUDA: http://techreport.com/discussions.x/14147
This is great news, especially since I expect it will really exercise the CUDA platform and probably push more development resources into CUDA improvements.

Can anyone comment on how PhysX on CUDA will coexist with 3D rendering while an application is running? Options include:

  • Interleaved kernels which get 100% of card resources: This is what happens now with CUDA on a graphics device. The CUDA kernel owns the card during execution, and the graphics driver does updates between calls.

  • Inter-card partitioning: One card is assigned to 3D rendering, and one to PhysX/CUDA. This is also what you can do now if you turn off SLI and have multiple cards.

  • Intra-card partitioning: Dividing the multiprocessors of a single card between 3D and CUDA, so that kernels run on a subset of the available multiprocessors and can be issued independently of the kernels or 3D rendering happening on the rest of the card.

I don’t even know if the hardware is capable of the last scenario, but I ask because it has interesting implications for CUDA. It would be a natural extension to the stream API in CUDA to be able to set the multiprocessor count desired for different streams. Then streams could actually run concurrently, rather than interleaved, on the same card.
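To make the stream idea concrete, here is a minimal sketch of how two independent kernels are issued into separate streams with the existing API. The kernels physicsStep and otherWork and the helper blocksFor are made-up placeholders, not anything from PhysX. Note that cudaStreamCreate takes no multiprocessor-count argument; whether the two launches overlap or run back to back is up to the hardware and driver.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in for one physics integration step.
__global__ void physicsStep(float *pos, const float *vel, float dt, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) pos[i] += vel[i] * dt;
}

// Hypothetical stand-in for unrelated work in a second stream.
__global__ void otherWork(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;
}

// Number of blocks needed to cover n elements (ceiling division).
int blocksFor(int n, int threadsPerBlock)
{
    return (n + threadsPerBlock - 1) / threadsPerBlock;
}

int main()
{
    const int n = 1 << 20;
    const int threads = 256;
    const size_t bytes = n * sizeof(float);

    float *pos = 0, *vel = 0, *data = 0;
    cudaMalloc((void **)&pos, bytes);
    cudaMalloc((void **)&vel, bytes);
    cudaMalloc((void **)&data, bytes);
    cudaMemset(pos, 0, bytes);
    cudaMemset(vel, 0, bytes);
    cudaMemset(data, 0, bytes);

    // One stream per independent piece of work.
    cudaStream_t physics, other;
    cudaStreamCreate(&physics);
    cudaStreamCreate(&other);

    // The fourth launch parameter selects the stream. There is no
    // parameter here for "how many multiprocessors this stream may use".
    physicsStep<<<blocksFor(n, threads), threads, 0, physics>>>(pos, vel, 0.016f, n);
    otherWork<<<blocksFor(n, threads), threads, 0, other>>>(data, n);

    // Wait for both streams and report any execution error.
    cudaError_t e1 = cudaStreamSynchronize(physics);
    cudaError_t e2 = cudaStreamSynchronize(other);
    printf("physics stream: %s\n", cudaGetErrorString(e1));
    printf("other stream:   %s\n", cudaGetErrorString(e2));

    cudaStreamDestroy(physics);
    cudaStreamDestroy(other);
    cudaFree(pos);
    cudaFree(vel);
    cudaFree(data);
    return (e1 == cudaSuccess && e2 == cudaSuccess) ? 0 : 1;
}
```

The extension I am wishing for would amount to one extra knob at stream creation or launch time; everything else in the sketch would stay the same.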

I hope graphics and physics can run at the same time on one card, since that would be necessary to make the Ageia acquisition revolutionary. Then anyone with just a GeForce 8 could instantly get a performance boost in physics-heavy games that use PhysX code.

I hope this means that CUDA support for the Vista operating system (32- and 64-bit) is just around the corner. I can’t imagine PhysX getting released without support for Vista. I am, of course, presuming that PhysX will be based on CUDA.

NVIDIA commented that its port of PhysX to CUDA would encourage buyers to get two or more graphics cards, which suggests they are aiming at multi-card setups.

Things are not so bad, actually.
ATI recently released the 3870 X2, which has two GPUs on a single board, probably linked together CrossFire-style (ATI's equivalent of SLI). I’m pretty sure NVIDIA will follow and release a dual-GPU card soon. So one such board would be able to handle both 3D and physics.

Intra-card partitioning sounds interesting, but I really doubt it is possible :( And even if it is, balancing the load between 3D and physics (i.e. deciding how many multiprocessors to dedicate to 3D and how many to physics) would not be a trivial task.

I still doubt that I will be able to run 2x 8800 GTS in SLI alongside an 8600 GT, with the SLI pair used for graphics and the 8600 GT for CUDA-based physics, on a 680i board with three PCI Express slots.
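On the CUDA side, dedicating a card to physics would come down to device selection. Here is a minimal sketch, assuming the driver exposes the extra card to CUDA at all (which is exactly the open question) and assuming device 0 is the one driving the display. chooseComputeDevice is a hypothetical helper, not a runtime call.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: return the first device index that is not the
// display device, or the display device itself if it is the only one.
int chooseComputeDevice(int deviceCount, int displayDevice)
{
    for (int d = 0; d < deviceCount; ++d)
        if (d != displayDevice) return d;
    return displayDevice;  // single card: share it with rendering
}

int main()
{
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess || count == 0) {
        printf("no CUDA device found\n");
        return 1;
    }

    // List what the runtime can see, including multiprocessor counts.
    for (int d = 0; d < count; ++d) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, d);
        printf("device %d: %s, %d multiprocessors\n",
               d, prop.name, prop.multiProcessorCount);
    }

    // Assumption: device 0 drives the display. Check this on your system.
    const int displayDevice = 0;
    int compute = chooseComputeDevice(count, displayDevice);

    // All later CUDA calls from this host thread go to the chosen device.
    cudaSetDevice(compute);
    printf("using device %d for CUDA work\n", compute);
    return 0;
}
```

If only the SLI pair shows up in the listing, or only one device, that would answer the question in the negative for this setup.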