Hi.
I adapted the CUDA "multithreading.cpp" example from the SDK to drive two CUDA devices for a test calculation. After that I added a second loop to check how fast the CPU is in comparison to the CUDA devices, and wondered why only one core is used even though I told it to create eight threads. After some searching, it seems that something in CUDA (2.1 beta) reduces the process affinity mask to 1! If I set it back to a value greater than 1, some things seem to go wrong.
Are there any limitations regarding multithreading that I didn't read about? Or has anybody seen similar behaviour and knows which part is responsible for reducing the process to one core?
(Vista 64 Bit, VS2008)
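
In case it helps to reproduce this, here is a minimal sketch of how the process affinity mask can be checked on Windows. The helper names (countAllowedCores, queryProcessAffinityMask, reportAffinity) are made up for illustration and are not part of the SDK example; the only Windows calls used are GetCurrentProcess and GetProcessAffinityMask. On non-Windows builds the query simply returns 0.

```cpp
#include <cstdio>

#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: count how many logical cores an affinity mask allows
// (one set bit per allowed core).
inline int countAllowedCores(unsigned long long mask) {
    int n = 0;
    while (mask != 0) {
        mask &= mask - 1;  // clear the lowest set bit
        ++n;
    }
    return n;
}

// Hypothetical helper: return the current process affinity mask,
// or 0 if it cannot be queried (including on non-Windows builds).
inline unsigned long long queryProcessAffinityMask() {
#ifdef _WIN32
    DWORD_PTR processMask = 0;
    DWORD_PTR systemMask = 0;
    if (GetProcessAffinityMask(GetCurrentProcess(), &processMask, &systemMask)) {
        return static_cast<unsigned long long>(processMask);
    }
#endif
    return 0;
}

// Hypothetical helper: print the mask with a label so that values taken
// at different points in the program can be compared.
inline void reportAffinity(const char* label) {
    unsigned long long mask = queryProcessAffinityMask();
    std::printf("%s: mask=0x%llx, cores=%d\n",
                label, mask, countAllowedCores(mask));
}
```

Calling reportAffinity("before CUDA") at the start of main and reportAffinity("after CUDA") right after the first CUDA runtime call should show at which point the mask changes, assuming it is the CUDA initialization that changes it.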