OpenCL YUV Motion Estimation ... help! How to use OpenCL to process video data

Hello everyone,

I am an OpenCL learner, and I want to process YUV data (4:1:1 format, planar) with the full-search motion estimation algorithm. The test video is CITY.yuv (size: 352 x 288).
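For reference, full search for a single block can be sketched in plain C as below. This is a minimal CPU sketch, not the kernel from this post: the sum-of-absolute-differences (SAD) cost, the `MotionVector` struct, and the `sad_block` and `full_search` names are assumptions made for illustration, and the block size and search range are left as parameters because the post does not state them.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical motion vector layout; the MV type used in the post is not shown. */
typedef struct { int dx, dy; } MotionVector;

/* SAD between the bs x bs block of cur at (cx, cy) and the block of ref at (rx, ry). */
static unsigned sad_block(const unsigned char *cur, const unsigned char *ref,
                          int stride, int cx, int cy, int rx, int ry, int bs)
{
    unsigned sad = 0;
    for (int y = 0; y < bs; y++)
        for (int x = 0; x < bs; x++)
            sad += (unsigned)abs((int)cur[(cy + y) * stride + cx + x] -
                                 (int)ref[(ry + y) * stride + rx + x]);
    return sad;
}

/* Full search for block (bx, by): try every displacement in
   [-range, range] x [-range, range] that keeps the candidate inside the frame,
   and keep the first displacement with the smallest SAD. */
static MotionVector full_search(const unsigned char *cur, const unsigned char *ref,
                                int width, int height,
                                int bx, int by, int bs, int range)
{
    assert(bs > 0 && range >= 0);
    MotionVector best = {0, 0};
    unsigned best_sad = UINT_MAX;
    int cx = bx * bs, cy = by * bs;

    for (int dy = -range; dy <= range; dy++) {
        for (int dx = -range; dx <= range; dx++) {
            int rx = cx + dx, ry = cy + dy;
            if (rx < 0 || ry < 0 || rx + bs > width || ry + bs > height)
                continue; /* candidate block would leave the frame */
            unsigned sad = sad_block(cur, ref, width, cx, cy, rx, ry, bs);
            if (sad < best_sad) {
                best_sad = sad;
                best.dx = dx;
                best.dy = dy;
            }
        }
    }
    return best;
}
```

In an OpenCL version, one natural mapping is one work-item per block, with `bx` and `by` taken from `get_global_id(0)` and `get_global_id(1)`; that mapping is a design choice, not something the post confirms.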

I know how to process one frame or image in one command queue, but I don't know how to process a frame sequence with OpenCL.

My current approach is to use a single for loop to handle all the frames in the video. The code is as follows:

// --------------------------------------------------------
// Core sequence... copy input data to GPU, compute, copy results back

// Create the kernel once, before the frame loop, so that a new kernel
// object is not allocated for every frame
ckKernel = clCreateKernel(cpProgram, "motion_estimation", &ciErrNum);
shrCheckErrorEX (ciErrNum, CL_SUCCESS, pCleanup);

// Loop over the frames. The bound is only an upper limit: the loop
// breaks as soon as fread() reaches the end of the file.
for(num=0;num<1000;num++)
{
    int i,j;

    // Read the first XX*YY bytes (the luma plane) of frame num+1 and copy them into frame_original
    fseek(fp_cur,XX*YY*3/2*(num+1),SEEK_SET);
    if(fread(current_frame[0],XX*YY,1,fp_cur)==0)break;
    memcpy(frame_original, current_frame[0], width * height * sizeof(cl_uchar));

    // Read the luma plane of frame num and copy it into frame_ref
    fseek(fp_ref,XX*YY*3/2*(num+0),SEEK_SET);
    if(fread(ref_frame[0],XX*YY,1,fp_ref)==0)break;
    memcpy(frame_ref, ref_frame[0], width * height * sizeof(cl_uchar));

    // Write both frames to the GPU device: the first write is blocking (CL_TRUE),
    // the second is non-blocking (CL_FALSE)
    ciErrNum = clEnqueueWriteBuffer(cqCommandQue, cm_original , CL_TRUE, 0, width*height*sizeof(cl_uchar), frame_original, 0, NULL, &H2D);
    shrCheckErrorEX (ciErrNum, CL_SUCCESS, pCleanup);
    ciErrNum = clEnqueueWriteBuffer(cqCommandQue, cm_ref , CL_FALSE, 0, width*height*sizeof(cl_uchar), frame_ref, 0, NULL, NULL);
    shrCheckErrorEX (ciErrNum, CL_SUCCESS, pCleanup);

    // Set the global size: one work-item per block in the frame
    global[0]= width/block_size;
    global[1]= height/block_size;

    // Set the kernel arguments
    int n = 0;
    printf("clSetKernelArg...\n\n");
    ciErrNum  = clSetKernelArg(ckKernel, n++, sizeof(cl_mem), (void*)&cm_original);
    ciErrNum |= clSetKernelArg(ckKernel, n++, sizeof(cl_mem), (void*)&cm_ref);
    // cm_mv_output is a buffer object, so its argument size is sizeof(cl_mem), not sizeof(cl_int)
    ciErrNum |= clSetKernelArg(ckKernel, n++, sizeof(cl_mem), (void*)&cm_mv_output);
    // Check the accumulated result before it is overwritten by the launch call
    shrCheckErrorEX (ciErrNum, CL_SUCCESS, pCleanup);

    // Launch kernel
    ciErrNum = clEnqueueNDRangeKernel(cqCommandQue, ckKernel, 2, NULL,global,NULL,0, NULL, &ceEvent);
    shrCheckErrorEX (ciErrNum, CL_SUCCESS, pCleanup);

    // Wait for the commands to finish
    clFinish(cqCommandQue);

    // Read back the motion vectors (blocking read)
    ciErrNum = clEnqueueReadBuffer(cqCommandQue, cm_mv_output, CL_TRUE, 0, X*Y*sizeof(MV), MV_output, 0, NULL, &GPUDone);
    shrCheckErrorEX (ciErrNum, CL_SUCCESS, pCleanup);
}
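As a cross-check on the file offsets and launch dimensions used in the loop, the arithmetic can be written out in plain C. This is a small sketch under stated assumptions: the helper names `frame_bytes`, `luma_offset`, and `blocks_along` are hypothetical, the luma plane is assumed to be stored first in each frame, and the block size of 16 used in the note after the code is only an example because the post does not give `block_size`.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes in one planar frame: a full-resolution Y plane plus two chroma
   planes that together take half as much again (the XX*YY*3/2 in the loop). */
static size_t frame_bytes(size_t width, size_t height)
{
    return width * height * 3 / 2;
}

/* Byte offset of the Y plane of frame `index`, assuming the Y plane
   comes first within each frame. */
static size_t luma_offset(size_t index, size_t width, size_t height)
{
    return index * frame_bytes(width, height);
}

/* Number of blocks along one dimension; requires the dimension to be an
   exact multiple of the block size so that no partial block is dropped. */
static size_t blocks_along(size_t dimension, size_t block_size)
{
    assert(block_size > 0 && dimension % block_size == 0);
    return dimension / block_size;
}
```

For a 352 x 288 frame, `frame_bytes` gives 152064 bytes, so frame 2 starts at byte 304128. With an assumed block size of 16, the global work size would be 22 x 18, that is 396 work-items per frame pair.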

However, every time I debug my code, I get the following exception:

Unhandled exception at 0x1002f9ba in xxx.exe: 0xC0000005: Access violation reading location 0x00000004

It is raised at this line:

ciErrNum = clEnqueueNDRangeKernel(cqCommandQue, ckKernel, 2, NULL,global,NULL,0, NULL, &ceEvent);

I know that there must be a better way to process the YUV data. If you have any ideas or opinions, please let me know.

Thanks for your help.