Unexpected CPU usage when using cudaGraphicsMapResources

I have some working code that needs optimizing, and I am not making much progress. A modified image sits in GPU memory and is updated constantly. A pointer to the image is obtained with cudaIpcOpenMemHandle. The image is then mapped/unmapped, bound, and displayed. The problem is that the map/unmap calls seem to push my CPU usage up to 60%. Since the image I want to display is already in GPU memory, I would like to bring this CPU usage down. Any ideas on how to reduce the load on the CPU?

//init: register this buffer object with CUDA
checkCudaErrors(cudaGraphicsGLRegisterBuffer(&cuda_pbo_dest_resource, *pbo, cudaGraphicsMapFlagsNone));
//get pointer to image in GPU memory
  cudaError_t err =  cudaIpcOpenMemHandle((void **)&m_deviceBuffer, m_cudaHandle,
void updateImage() {
  //map & unmap 
  //this seems to be where the CPU usage is increased.
  cudaGraphicsMapResources(1, &cuda_pbo_dest_resource, 0);
  cudaGraphicsResourceGetMappedPointer((void **)&out_data, &num_bytes,
  cudaMemcpyAsync(out_data, m_deviceBuffer, m_imageBufferSize, cudaMemcpyDeviceToDevice);
  cudaGraphicsUnmapResources(1, &cuda_pbo_dest_resource, 0);

  //bind and display image next

One straightforward optimization would be to minimize the number of map/unmap and bind/unbind cycles by reusing existing mappings. Try a double-buffered or triple-buffered approach. Additionally, try a faster CPU. High-end quad-core Haswell CPUs are quite affordable these days, e.g. the Xeon E3-1271 v3 or the i7-4790K.
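As a rough illustration of the double-buffered idea, here is a minimal sketch of just the bookkeeping: which buffer the next copy should target and which one holds the last completed frame. The BufferRing class and its method names are hypothetical and not part of CUDA or OpenGL. The sketch assumes you have registered one PBO/graphics resource per slot, and it deliberately leaves out the actual CUDA and GL calls.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not a CUDA or OpenGL API): tracks which of N
// pixel buffers is the copy target and which one is ready for display.
// N = 2 gives double buffering, N = 3 gives triple buffering.
template <std::size_t N>
class BufferRing {
    static_assert(N >= 2, "need at least two buffers to overlap copy and display");

public:
    // Index of the buffer the next device-to-device copy should write to.
    std::size_t writeSlot() const { return write_; }

    // True once at least one frame has been completed.
    bool hasFrame() const { return frames_ > 0; }

    // Index of the buffer holding the most recently completed frame.
    // Only meaningful after the first call to advance().
    std::size_t displaySlot() const {
        assert(hasFrame());
        return (write_ + N - 1) % N;
    }

    // Call after the copy into writeSlot() has been issued:
    // that buffer becomes the display buffer and the next one
    // in the ring becomes the new copy target.
    void advance() {
        write_ = (write_ + 1) % N;
        ++frames_;
    }

private:
    std::size_t write_ = 0;   // current copy target
    std::size_t frames_ = 0;  // number of completed frames
};
```

In updateImage() you would issue the copy into the resource at writeSlot(), call advance(), and then bind and draw the buffer at displaySlot(). Whether this actually lowers the CPU load depends on where the driver spends its time in the map/unmap calls, so it is worth profiling before and after.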