Kernel launch overhead due to WDDM - some questions

Hi, I’m curious whether the significant CUDA kernel launch overhead on Windows Vista / Windows 7 (see ) also affects Quadro cards, or only GeForce cards. It would likely be a downside for Quadros, as NVIDIA also markets them as ‘computational’ devices (many programs in DCC and post production, where Quadro is the de facto standard, now use CUDA acceleration). And a lot of basic GPU functions (e.g. primitives from CUDPP and Thrust such as reduction, compaction etc.) issue a lot of kernel calls. Furthermore, I’m interested in whether this WDDM kernel launch overhead is going to decrease in Windows 8. Do the NVIDIA guys maybe have some preliminary information?
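In case someone with a Quadro wants to check this on their own machine, here is a minimal sketch of how the per-launch cost could be measured. It is not a definitive benchmark. The names `emptyKernel`, `ok` and `launches` and the sample size of 10000 are my own assumptions, and it assumes a toolkit that provides `cudaDeviceSynchronize` (CUDA 4.0 or later). It times an empty kernel in two ways: many launches queued with a single synchronization at the end, and one synchronization after every launch.

```cuda
#include <chrono>
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical empty kernel: it does no work, so the measured time is
// dominated by launch and synchronization cost rather than computation.
__global__ void emptyKernel() {}

// Hypothetical helper: reports a CUDA error and returns false on failure.
static bool ok(cudaError_t err, const char* what) {
    if (err != cudaSuccess) {
        std::fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
        return false;
    }
    return true;
}

int main() {
    const int launches = 10000;  // assumption: arbitrary sample size
    using Clock = std::chrono::steady_clock;

    // Warm-up: the first launch also pays for context creation.
    emptyKernel<<<1, 1>>>();
    if (!ok(cudaGetLastError(), "warm-up launch")) return 1;
    if (!ok(cudaDeviceSynchronize(), "warm-up sync")) return 1;

    // Case 1: queue all launches, synchronize once at the end.
    auto t0 = Clock::now();
    for (int i = 0; i < launches; ++i) {
        emptyKernel<<<1, 1>>>();
    }
    if (!ok(cudaDeviceSynchronize(), "batched sync")) return 1;
    auto t1 = Clock::now();

    // Case 2: synchronize after every launch, as code does when it needs
    // a result on the host between kernel calls.
    for (int i = 0; i < launches; ++i) {
        emptyKernel<<<1, 1>>>();
        if (!ok(cudaDeviceSynchronize(), "per-launch sync")) return 1;
    }
    auto t2 = Clock::now();

    const double batchedUs =
        std::chrono::duration<double, std::micro>(t1 - t0).count() / launches;
    const double syncedUs =
        std::chrono::duration<double, std::micro>(t2 - t1).count() / launches;

    std::printf("queued launches:        %.2f us per launch\n", batchedUs);
    std::printf("launch + sync each time: %.2f us per launch\n", syncedUs);
    return 0;
}
```

It should build with something like `nvcc launch_overhead.cu -o launch_overhead` (the file name is just a placeholder). Running the same binary on a GeForce and a Quadro under the same Windows version would show whether the two differ. The second number is probably the more relevant one for libraries that read back a result between kernel calls.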