(OK, this is another “wishful thinking” post which may not get an answer, but would be interesting if it did.)
I saw this article this morning that says that PhysX is getting ported to CUDA: http://techreport.com/discussions.x/14147
This is really great news, especially since I expect it will exercise the CUDA platform heavily and probably push more development resources into CUDA improvements.
Can anyone comment on how PhysX on CUDA will coexist with the 3D rendering while an application is running? Options include:
Interleaved kernels which get 100% of card resources: This is what happens now with CUDA on a graphics device. The CUDA kernel owns the card during execution, and the graphics driver does updates between calls.
Inter-card partitioning: One card is assigned to 3D rendering, and one to PhysX/CUDA. This is also what you can do now if you turn off SLI and have multiple cards.
Intra-card partitioning: Dividing the multiprocessors of a single card between 3D and CUDA, so that kernels run on a subset of the available multiprocessors and can be issued independently of the kernels or 3D rendering happening on the rest of the card.
I don’t even know if the hardware is capable of the last scenario, but I ask because it has interesting implications for CUDA. It would be a natural extension to the stream API in CUDA to be able to set the multiprocessor count desired for different streams. Then streams could actually run concurrently, rather than interleaved, on the same card.
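To make the idea concrete, here is a sketch of what such an extension might look like. The `cudaStreamSetMultiprocessorCount` call below is entirely hypothetical (no such API exists in CUDA today); everything else uses the real stream API as it stands:

```cuda
// Hypothetical sketch: partitioning one card's multiprocessors
// between two concurrent streams. cudaStreamSetMultiprocessorCount
// is an imagined API, NOT part of CUDA -- shown only to illustrate
// the proposed extension.

cudaStream_t physicsStream, computeStream;
cudaStreamCreate(&physicsStream);
cudaStreamCreate(&computeStream);

// Imagined: reserve 4 multiprocessors for physics kernels,
// leaving the rest for other work (and, in the intra-card
// scenario, for 3D rendering).
// cudaStreamSetMultiprocessorCount(physicsStream, 4);
// cudaStreamSetMultiprocessorCount(computeStream, 8);

// With such a partition, these launches could genuinely overlap
// on one card rather than being interleaved:
physicsKernel<<<gridA, blockA, 0, physicsStream>>>(...);
otherKernel<<<gridB, blockB, 0, computeStream>>>(...);
```

Today, of course, kernels issued to different streams on the same device are still serialized against each other at execution time; the streams only let transfers overlap with computation.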