I’m using an in-order command queue with non-blocking buffer writes and reads. According to the OpenCL documentation, this should be fine: as long as I enqueue the commands in the right order, the queue executes them one after another in the order they were enqueued. However, although the output is correct most of the time, I sometimes get wrong output. When I switch to a blocking write and keep the non-blocking read, the output is always correct. Does anyone know why that is?
I’m not sure whether the following is related. In my code, I never release the cl_program, cl_kernel, cl_command_queue, or cl_context, but I do release the cl_mem objects. The reason is that I’m writing a compiler that generates OpenCL code, so it’s not straightforward to figure out the right place to release things. If releasing them is crucial, I will make that my first priority.
The following code is equivalent to what I’m implementing. I can’t post the real version because it’s too complicated.
float input[N] = …;
float output[N];
cl_context context = clCreateContext(0, 1, &device_id, NULL, NULL, &err);
cl_command_queue queue = clCreateCommandQueue(context, device_id, 0, &err); //in-order-execution command queue
cl_mem _clmem1 = clCreateBuffer(context, CL_MEM_READ_WRITE, N * sizeof(float), NULL, &err);
cl_mem _clmem2 = clCreateBuffer(context, CL_MEM_READ_WRITE, N * sizeof(float), NULL, &err);
clEnqueueWriteBuffer(queue, _clmem1, CL_FALSE, 0, N * sizeof(float), input, 0, NULL, NULL); //non-blocking write
clSetKernelArg(clkern, 0, sizeof(cl_mem), &_clmem1);
clSetKernelArg(clkern, 1, sizeof(cl_mem), &_clmem2);
size_t workdim[] = {N};
clEnqueueNDRangeKernel(queue, clkern, 1, NULL, workdim, NULL, 0, NULL, NULL);
clEnqueueReadBuffer(queue, _clmem2, CL_FALSE, 0, N * sizeof(float), output, 0, NULL, &eventout); //non-blocking read of the kernel's output buffer
do { //poll until the read has completed
clGetEventInfo(eventout, CL_EVENT_COMMAND_EXECUTION_STATUS, sizeof(cl_int), &ret, NULL);
} while (ret != CL_COMPLETE);
for (int i = 0; i < N; i++) printf("%f\n", output[i]); //print output
clReleaseMemObject(_clmem1);
clReleaseMemObject(_clmem2);
//Not releasing anything else
When I change the non-blocking write to a blocking write, it works fine. Please help!
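One possibility that may be worth ruling out: the OpenCL specification says that after a non-blocking clEnqueueWriteBuffer returns, the host memory passed as ptr must not be reused until the write command has completed. If the real (generated) code overwrites or frees the input array before that point, the device could see the wrong data, and a blocking write would hide the problem. The sketch below is a toy simulation in plain C, not OpenCL; the deferred_write struct and every function in it are hypothetical names made up for illustration, and it assumes the transfer happens some time after the enqueue call returns.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one queued write command: it records the host
   pointer at enqueue time and copies the data only when it "completes". */
typedef struct {
    float       *dst;     /* stands in for the device buffer */
    const float *src;     /* host pointer captured at enqueue time */
    size_t       count;   /* number of floats to copy */
    int          pending; /* 1 while the copy has not happened yet */
} deferred_write;

/* Mimics a non-blocking write: returns at once, nothing is copied yet. */
static void enqueue_write_nonblocking(deferred_write *w, float *dst,
                                      const float *src, size_t count) {
    w->dst = dst;
    w->src = src;
    w->count = count;
    w->pending = 1;
}

/* Mimics the moment the runtime actually performs the transfer. */
static void complete_write(deferred_write *w) {
    memcpy(w->dst, w->src, w->count * sizeof(float));
    w->pending = 0;
}

/* Mimics a blocking write: the copy is done before the call returns. */
static void enqueue_write_blocking(deferred_write *w, float *dst,
                                   const float *src, size_t count) {
    enqueue_write_nonblocking(w, dst, src, count);
    complete_write(w);
}

/* Returns the value the "device" ends up with when the host overwrites
   its input buffer right after the enqueue call returns. */
static float device_value_after_host_reuse(int blocking) {
    float host[1] = {1.0f};
    float device[1] = {0.0f};
    deferred_write w = {0};

    if (blocking)
        enqueue_write_blocking(&w, device, host, 1);
    else
        enqueue_write_nonblocking(&w, device, host, 1);

    host[0] = 2.0f;          /* host reuses the buffer too early */

    if (w.pending)
        complete_write(&w);  /* deferred copy now sees the new value */
    return device[0];
}
```

In this simulation the blocking variant delivers the original value, while the non-blocking variant delivers whatever the host wrote afterwards. In real OpenCL code, the analogous precaution would be to keep the input array alive and unmodified until the write's event reports CL_COMPLETE (for example via clWaitForEvents), or until a later blocking call on the same in-order queue has returned.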