I have some questions about the Tesla compute cluster driver:
- “Reducing kernel launch overhead” - how much does this help? Does the overhead have anything to do with the ~10us I found here?
- The notes say you have to use a non-NVIDIA display driver if you want a display, but why? I know Windows Display Driver Model 1.0 (pre Windows 7) supports only one driver, but WDDM 1.1 supports more than one. Needing a non-NVIDIA card for display would be a major inconvenience, because I currently use a Quadro 290 for the display.
Good, that’s what I need for my median / SelectNth code with lots of global synchronization (kernel launches)
Yeah if you’re doing a loop of kernel → memcpy → check to see if a convergence condition is met → repeat, TCC is going to kill WDDM in terms of performance here.
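For anyone following along, here is a rough sketch of the kind of loop being described: launch a kernel, copy one value back, check it on the host, repeat. It is not the median / SelectNth code from this thread. The kernel `relaxStep`, the tolerance `tol`, and the iteration cap are all made up for illustration. The point is only that each pass pays for one launch plus one blocking copy, so any fixed per-launch cost in the driver is paid on every iteration.

```cuda
#include <cstdio>
#include <cstring>
#include <cuda_runtime.h>

// Hypothetical fill kernel: sets every element to the same starting value.
__global__ void fill(float *data, int n, float value)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = value;
}

// Hypothetical iteration step: halves each element and tracks the largest
// result. All values are non-negative, so comparing the float bit patterns
// as ints gives the same ordering and atomicMax on int is safe here.
__global__ void relaxStep(float *data, int n, int *maxBits)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] *= 0.5f;
        atomicMax(maxBits, __float_as_int(data[i]));
    }
}

int main()
{
    const int n = 1 << 16;
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    const float tol = 1e-3f;      // assumed convergence threshold
    const int maxIters = 1000;    // safety cap so the loop always ends

    float *d_data = nullptr;
    int *d_maxBits = nullptr;
    cudaMalloc(&d_data, n * sizeof(float));
    cudaMalloc(&d_maxBits, sizeof(int));
    fill<<<blocks, threads>>>(d_data, n, 1.0f);

    int iters = 0;
    float maxVal = 0.0f;
    do {
        // Reset the device-side maximum, then run one step.
        cudaMemset(d_maxBits, 0, sizeof(int));
        relaxStep<<<blocks, threads>>>(d_data, n, d_maxBits);

        // Blocking copy: the host waits here for the kernel to finish,
        // so launch and synchronization costs are paid on every pass.
        int bits = 0;
        cudaMemcpy(&bits, d_maxBits, sizeof(int), cudaMemcpyDeviceToHost);
        std::memcpy(&maxVal, &bits, sizeof(float));
        ++iters;
    } while (maxVal > tol && iters < maxIters);

    printf("stopped after %d iterations, max = %g\n", iters, maxVal);

    cudaFree(d_data);
    cudaFree(d_maxBits);
    return 0;
}
```

With a kernel this small, almost all of the time per pass is launch and copy latency rather than compute, which is the case where the driver model should matter most. Timing the loop under both driver modes would be the way to see how big the difference is on a given machine.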