OpenCL: blocking occurs when clEnqueueWriteBuffer and clEnqueueNDRangeKernel are used together

lyyaj599 · March 14, 2023, 2:22am

    while(1){
      event_index = 0;
      // Run kernel A, write the buffer, then run kernel B
      clEnqueueNDRangeKernel(Queue ,kernels[0], 2, NULL, worksize_prox, localsize, 0, NULL, &eventpic[event_index]);
      event_index++;
      err = clEnqueueWriteBuffer(Queue,ori, CL_TRUE , offsetpos, sizeof(char) * roiheight * roiwidth * 2 , picture, 0, NULL, &eventpic[event_index]);
      event_index++;
      clEnqueueNDRangeKernel(Queue , kernels[1], 2, NULL, worksize_prox, localsize, event_index, eventpic, &eventpic[event_index]); 
      clFinish(Queue);
      event_index++;
      for(i = 0; i < event_index; i++) {
        clReleaseEvent(eventpic[i]);
      }
    }

The second kernel launch (kernels[1]) takes longer to execute than the clEnqueueWriteBuffer call.
Question: when this code is executed repeatedly, it blocks.

Robert_Crovella · March 14, 2023, 3:42am

If you repeat the calls enough, it is expected behavior that it will block. Work going into the GPU is submitted into a queue system. The queue system does not have infinite depth.
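As a rough illustration of the finite-depth point, here is a toy model in plain C. It is not how any real driver works: the depth `QUEUE_DEPTH`, the `SubmitQueue` type, and the helper names are all hypothetical, and real queue depths are generally not documented. The sketch only shows that when the host submits faster than the device retires work, some submission eventually finds the queue full, and that is the point where a real enqueue call would stall the host.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical depth, chosen for illustration only. */
#define QUEUE_DEPTH 8

typedef struct {
    size_t in_flight;   /* commands submitted but not yet retired */
} SubmitQueue;

/* Try to submit one command. Returns false when the queue is full,
   which models the point where a real enqueue call would block. */
static bool try_submit(SubmitQueue *q) {
    assert(q != NULL);
    if (q->in_flight == QUEUE_DEPTH) {
        return false;
    }
    q->in_flight++;
    return true;
}

/* The device finishes one command and frees a slot. */
static void retire_one(SubmitQueue *q) {
    assert(q != NULL);
    if (q->in_flight > 0) {
        q->in_flight--;
    }
}

/* Attempt `attempts` submissions while the device retires one command
   after every `retire_every` successful submissions (0 means it never
   retires). Returns the index of the first submission that would block,
   or `attempts` if none would. */
static size_t first_blocked_submission(size_t attempts, size_t retire_every) {
    SubmitQueue q = {0};
    for (size_t i = 0; i < attempts; i++) {
        if (!try_submit(&q)) {
            return i;
        }
        if (retire_every != 0 && (i + 1) % retire_every == 0) {
            retire_one(&q);
        }
    }
    return attempts;
}
```

In this model, `first_blocked_submission(100, 0)` hits the limit as soon as `QUEUE_DEPTH` commands are outstanding, while `first_blocked_submission(100, 1)` never does because the device keeps up with the host.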

lyyaj599 · March 14, 2023, 7:36am

I changed the code so that the events are released on every iteration of the loop.