I am investigating GPU-accelerated JPEG processing with NPP, specifically the IDCT, DCT, and resize functions. We’re looking at deploying this as a service, so concurrency is of utmost importance to us for driving throughput. We’re facing some issues.
As I understand it, there are two ways of driving concurrency in the CUDA world:
1. Use multiple threads, each with its own CUDA stream.
The problem here is that the IDCT and DCT functions are marked as “Not thread safe” in the NPP manual. Is there any news on whether these will be made thread safe in newer versions of NPP?
2. Use the MPS server and go multi-process.
This is the only practical option for us if multithreading is out.
The problem we are facing here is that the service is to be deployed on Amazon EC2, and the GPU instances there have an NVIDIA GRID K520, which seems to support only SM 3.0, whereas the MPS server requires SM 3.5 or higher.
Has anyone faced this issue on Amazon EC2 instances?
I would greatly appreciate any info here.
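To make the compatibility check above concrete, here is a minimal sketch of the comparison, assuming the SM 3.5 minimum for MPS stated above. The helper name `meets_mps_minimum` is hypothetical; in a real program the major/minor values would come from `cudaGetDeviceProperties` (the `major` and `minor` fields of `cudaDeviceProp`), but they are passed in as plain integers here so the sketch compiles without a GPU.

```cpp
#include <string>

// Hypothetical helper (not part of CUDA or NPP): returns true when the
// compute capability major.minor is at least SM 3.5, the MPS minimum
// assumed from the description above.
bool meets_mps_minimum(int major, int minor) {
    return major > 3 || (major == 3 && minor >= 5);
}

// Hypothetical helper: builds a short human-readable verdict for logging.
std::string describe_mps_support(int major, int minor) {
    std::string sm = "SM " + std::to_string(major) + "." + std::to_string(minor);
    return sm + (meets_mps_minimum(major, minor)
                     ? " meets the assumed MPS minimum"
                     : " is below the assumed MPS minimum");
}
```

With the values reported for the GRID K520 (major 3, minor 0), `meets_mps_minimum(3, 0)` evaluates to false, which matches the problem described.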