Hi All,
I have six cudaMemcpy2DFromArray calls:
cudaMemcpy2DFromArray( d, dpitch, ( cudaArray * ) eglFrameLeft0.frame.pArray[ 0 ], rectx, recty, rectwidth, rectheight, d2d );
cudaMemcpy2DFromArray( d, dpitch, ( cudaArray * ) eglFrameLeft0.frame.pArray[ 1 ], rectx, recty, rectwidth, rectheight, d2d );
cudaMemcpy2DFromArray( d, dpitch, ( cudaArray * ) eglFrameLeft0.frame.pArray[ 2 ], rectx, recty, rectwidth, rectheight, d2d );
cudaMemcpy2DFromArray( d, dpitch, ( cudaArray * ) eglFrameLeft1.frame.pArray[ 0 ], rectx, recty, rectwidth, rectheight, d2d );
cudaMemcpy2DFromArray( d, dpitch, ( cudaArray * ) eglFrameLeft1.frame.pArray[ 1 ], rectx, recty, rectwidth, rectheight, d2d );
cudaMemcpy2DFromArray( d, dpitch, ( cudaArray * ) eglFrameLeft1.frame.pArray[ 2 ], rectx, recty, rectwidth, rectheight, d2d );
Can I speed up these six operations? Maybe by running them in parallel with OpenMP?
Best regards, Viktor.