Hi All,
I’m doing a very primitive test:
int nNumDevices = 0;
::cudaGetDeviceCount(&nNumDevices);
for (int nDevice = 0; nDevice < nNumDevices; nDevice++)
{
    // Query and print a few properties of each device.
    cudaDeviceProp prop;
    ::cudaGetDeviceProperties(&prop, nDevice);
    printf(" Device index: %d\n", nDevice);
    printf(" name: %s\n", prop.name);
    printf(" tccDriver: %d\n", prop.tccDriver);
    printf(" asyncEngineCount: %d\n", prop.asyncEngineCount);
    printf(" unifiedAddressing: %d\n", prop.unifiedAddressing);
}
This code runs on a dual-CPU server with 8 Tesla M2090 cards installed, TCC driver 369.73, and Windows Server 2016 Datacenter as the OS.
Every time I run this code, I get different values of asyncEngineCount (either 0 or 1) for different cards. Sometimes all cards report asyncEngineCount == 0; sometimes some cards report asyncEngineCount == 1 while the others report 0, and so on. Each run of the test produces different results.
Why can this happen?
Thank you.