Combine CPU and GPU

Hello,

As we know, it is possible to parallelize calculations across the CPU cores with the Parallel class from .NET. We use a for-loop in which, e.g., 4 cells in an image are analysed in parallel by the 4 CPU cores. The question is: is it possible to run calculations on the GPU inside the loop body while the loop itself is still parallelized across the CPU cores by .NET?

Simple example:

Parallel.For(0, 4, i =>
{
  // call the analysis function for cell i
  analyze();
});

void analyze()
{
  // do some calculations
  value = calcSome();
}

Is it possible, in principle, to run the calcSome function on the GPU?

Thanks in advance!!

Yes, but you would want many more than 4 iterations in that loop to keep a GPU busy. It would also be a very complicated project. This paper discusses a similar problem: http://portal.acm.org/citation.cfm?id=1504176.1504194

The short answer is yes: some devices with compute capability 2.0 can execute multiple kernels concurrently. See section 3.2.6.3, “Concurrent Kernel Execution”, of the CUDA Programming Guide and the concurrentKernels example in the SDK.

Depending on the calculations you are doing in the kernel, though, launching one kernel per CPU thread could quite possibly be less efficient, and definitely so if your device doesn’t support compute capability 2.0. In that case I would use the parallel CPU code to prepare the kernel input data, join the threads and launch a single kernel, then use parallel CPU code again to process the output data.
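To make the prepare / single launch / post-process idea concrete, here is a minimal C# sketch of the three phases. It is only an illustration under assumptions: RunBatchOnGpu, AnalyzeAll, the flat array layout and the placeholder data are all hypothetical names and choices, and RunBatchOnGpu is a CPU stand-in that squares each value. In a real project that one function would copy the batch to the device, launch one kernel over all elements, and copy the results back through whatever CUDA binding you use.

```csharp
using System;
using System.Threading.Tasks;

static class BatchedGpuSketch
{
    // Hypothetical stand-in for the GPU step (NOT a real GPU call).
    // It squares each value on the CPU so the sketch runs anywhere.
    static float[] RunBatchOnGpu(float[] batch)
    {
        var result = new float[batch.Length];
        for (int i = 0; i < batch.Length; i++)
            result[i] = batch[i] * batch[i];
        return result;
    }

    public static float[] AnalyzeAll(int cellCount, int valuesPerCell)
    {
        var input = new float[cellCount * valuesPerCell];

        // Phase 1: CPU threads prepare the input, one cell per iteration.
        // Each iteration writes only to its own slice of the array.
        Parallel.For(0, cellCount, cell =>
        {
            for (int j = 0; j < valuesPerCell; j++)
                input[cell * valuesPerCell + j] = cell + j; // placeholder data
        });

        // Phase 2: Parallel.For has returned, so all threads have joined.
        // Process the data for every cell in one batch.
        float[] output = RunBatchOnGpu(input);

        // Phase 3: CPU threads post-process the results, one cell per iteration.
        var perCell = new float[cellCount];
        Parallel.For(0, cellCount, cell =>
        {
            float sum = 0f;
            for (int j = 0; j < valuesPerCell; j++)
                sum += output[cell * valuesPerCell + j];
            perCell[cell] = sum;
        });
        return perCell;
    }

    static void Main()
    {
        float[] sums = AnalyzeAll(cellCount: 4, valuesPerCell: 3);
        Console.WriteLine(string.Join(", ", sums));
    }
}
```

The point of this layout is that the GPU sees one large batch instead of 4 small, competing launches, so it does not depend on concurrent kernel support.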
