Problem porting C870 code onto S1070

We’ve just obtained an S1070, and to get started I’ve ported some code onto it that was developed on a C870, but I can’t get past initialization of the data.

I am getting an error message from the very first kernel
“setting the device when a process is active is not allowed”.

This is a new one to me, so

  1. how can I tell if there is a process on a device?
  2. when could I have started that process?
  3. how can I stop it?

I’ve tracked the error to allocating memory on the device.

The following code

[codebox]
allocateArray((void**)&d_x, memSizefloat2);
printf("\nInitialize 1: %s\n", cudaGetErrorString(cudaGetLastError()));

allocateArray((void**)&d_vx, memSizefloat2);
printf("\nInitialize 2: %s\n", cudaGetErrorString(cudaGetLastError()));
[/codebox]

gives the output

Initialize 1: no error

Initialize 2: setting the device while a process is active is not allowed

The problem was that the allocateArray function called cudaSetDevice before cudaMalloc. I’ve read in a post on a similar topic that calling cudaSetDevice with later versions of CUDA can cause a failure.

So be careful. Porting from one device to another is not straightforward.
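For anyone hitting the same message, here is a minimal sketch of the fix as I understand it: select the device once at startup, before any other runtime call, and keep cudaSetDevice out of the allocation helper. The device index, the array size, the float2 element type and the check helper are assumptions made for illustration; they are not taken from the original code.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper (not in the original code): stop on any CUDA error.
static void check(cudaError_t err, const char *what)
{
    if (err != cudaSuccess) {
        fprintf(stderr, "%s: %s\n", what, cudaGetErrorString(err));
        exit(EXIT_FAILURE);
    }
}

// Allocation helper with the cudaSetDevice call removed.
static void allocateArray(void **devPtr, size_t size)
{
    check(cudaMalloc(devPtr, size), "cudaMalloc");
}

int main()
{
    const int device = 0;                 // assumed device index
    const size_t numElements = 1024;      // assumed array length
    const size_t memSizefloat2 = numElements * sizeof(float2);

    // Select the device once, before any allocation or kernel launch.
    check(cudaSetDevice(device), "cudaSetDevice");

    float2 *d_x = NULL;
    float2 *d_vx = NULL;

    allocateArray((void **)&d_x, memSizefloat2);
    printf("\nInitialize 1: %s\n", cudaGetErrorString(cudaGetLastError()));

    allocateArray((void **)&d_vx, memSizefloat2);
    printf("\nInitialize 2: %s\n", cudaGetErrorString(cudaGetLastError()));

    check(cudaFree(d_x), "cudaFree d_x");
    check(cudaFree(d_vx), "cudaFree d_vx");
    return 0;
}
```

Built with nvcc, both allocations should go through the same device selection made at the top of main, so the second call no longer tries to set the device after work has already started on it.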

Check out the “cudaSetDeviceFlags” API – there is a way to say that a device can be used ONLY by ONE process…

And there was an NVIDIA utility to set this correctly (the smi utility or whatever…)…

Maybe Tim will be able to give you more details on that (assuming that’s the problem).

I’ve sorted the problem now and it was those calls to cudaSetDevice.

Code timings:

on the C870 it took 110 s (which was blindingly quick anyway)

on the C1060 it took 42 s

WOW!

I’m looking forward to using all devices in the S1070 in parallel.