Speed up cudaMemcpy

Hi All,

I have six cudaMemcpy2DFromArray calls:

cudaMemcpy2DFromArray( d, dpitch, ( cudaArray * ) eglFrameLeft0.frame.pArray[ 0 ], rectx, recty, rectwidth, rectheight, d2d );
cudaMemcpy2DFromArray( d, dpitch, ( cudaArray * ) eglFrameLeft0.frame.pArray[ 1 ], rectx, recty, rectwidth, rectheight, d2d );
cudaMemcpy2DFromArray( d, dpitch, ( cudaArray * ) eglFrameLeft0.frame.pArray[ 2 ], rectx, recty, rectwidth, rectheight, d2d );
cudaMemcpy2DFromArray( d, dpitch, ( cudaArray * ) eglFrameLeft1.frame.pArray[ 0 ], rectx, recty, rectwidth, rectheight, d2d );
cudaMemcpy2DFromArray( d, dpitch, ( cudaArray * ) eglFrameLeft1.frame.pArray[ 1 ], rectx, recty, rectwidth, rectheight, d2d );
cudaMemcpy2DFromArray( d, dpitch, ( cudaArray * ) eglFrameLeft1.frame.pArray[ 2 ], rectx, recty, rectwidth, rectheight, d2d );

Can I speed up these six operations? Maybe by running them in parallel with OpenMP?

Best regards, Viktor.

Hi,

Async memory copy is not supported on TX1 due to the GPU locking mechanism.
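
For reference, on platforms where asynchronous copies are supported, overlapping the plane copies would usually mean issuing each copy on its own stream with cudaMemcpy2DFromArrayAsync. The following is only a minimal sketch under assumptions: the helper name copyPlanesAsync and the planes, dst, and streams arrays are hypothetical, each plane is assumed to have its own destination buffer, and the copy kind is assumed to be cudaMemcpyDeviceToDevice (the d2d in the question). Given the limitation above, it is not expected to give a speedup on TX1.

```cuda
#include <cuda_runtime.h>
#include <cstddef>

// Hypothetical helper (not from the original post): copies `count` planes
// from CUDA arrays into pitched linear buffers, one stream per plane.
// x and widthBytes are in bytes, y and height are in rows.
cudaError_t copyPlanesAsync(void* const dst[], size_t dpitch,
                            const cudaArray_const_t planes[], int count,
                            size_t x, size_t y,
                            size_t widthBytes, size_t height,
                            cudaStream_t streams[])
{
    // Queue every copy without waiting for the previous one to finish.
    for (int i = 0; i < count; ++i) {
        cudaError_t err = cudaMemcpy2DFromArrayAsync(
            dst[i], dpitch, planes[i], x, y, widthBytes, height,
            cudaMemcpyDeviceToDevice,  // assumption: what `d2d` stands for
            streams[i]);
        if (err != cudaSuccess) return err;
    }

    // Block until all queued copies have completed.
    for (int i = 0; i < count; ++i) {
        cudaError_t err = cudaStreamSynchronize(streams[i]);
        if (err != cudaSuccess) return err;
    }
    return cudaSuccess;
}
```

The streams would be created once with cudaStreamCreate and reused for every frame, so that stream setup cost is not paid on each call.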
Thanks.

Thank you.