Bad performance using MallocPitch and Memcpy2D

Robert_Crovella · May 23, 2017, 2:39pm

Your pitch calculations for indexing are not correct.

Please refer to the documentation:

CUDA Runtime API :: CUDA Toolkit Documentation

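The pitch value returned by cudaMallocPitch is a quantity in bytes, not in elements, so the row offset has to be applied to a byte (char) pointer before casting back to the element type.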
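Since the original kernel isn't reproduced here, the following is only a minimal sketch of byte-based pitched indexing, not a drop-in fix for your code. The names rowPtr, scaleKernel and runPitchedExample are hypothetical, as are the float element type and the 16x16 block size; only cudaMallocPitch, cudaMemcpy2D, cudaGetLastError and cudaFree are actual runtime API calls.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper: pointer to the start of `row` in a pitched allocation.
// `pitch` is in BYTES, so the offset is added to a char pointer first.
__host__ __device__ inline float* rowPtr(float* base, size_t pitch, int row)
{
    char* bytes = reinterpret_cast<char*>(base) + static_cast<size_t>(row) * pitch;
    return reinterpret_cast<float*>(bytes);
}

// Hypothetical kernel: doubles every element of a width x height matrix.
__global__ void scaleKernel(float* data, size_t pitch, int width, int height)
{
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    if (col < width && row < height) {
        float* r = rowPtr(data, pitch, row);  // NOT data + row * pitch
        r[col] *= 2.0f;
    }
}

// Round trip through pitched memory; returns true if every element came back doubled.
bool runPitchedExample(int width, int height)
{
    std::vector<float> host(static_cast<size_t>(width) * height, 1.0f);
    const size_t rowBytes = width * sizeof(float);  // width argument is in bytes too

    float* dev = nullptr;
    size_t pitch = 0;
    if (cudaMallocPitch(reinterpret_cast<void**>(&dev), &pitch, rowBytes, height) != cudaSuccess)
        return false;

    // Host rows are tightly packed (pitch == rowBytes); device rows use `pitch`.
    cudaMemcpy2D(dev, pitch, host.data(), rowBytes, rowBytes, height, cudaMemcpyHostToDevice);

    dim3 block(16, 16);
    dim3 grid((width + block.x - 1) / block.x, (height + block.y - 1) / block.y);
    scaleKernel<<<grid, block>>>(dev, pitch, width, height);

    cudaMemcpy2D(host.data(), rowBytes, dev, pitch, rowBytes, height, cudaMemcpyDeviceToHost);
    cudaError_t err = cudaGetLastError();
    cudaFree(dev);
    if (err != cudaSuccess) {
        std::printf("CUDA error: %s\n", cudaGetErrorString(err));
        return false;
    }

    for (float v : host)
        if (v != 2.0f) return false;
    return true;
}
```

Calling runPitchedExample(1000, 1000) from main and checking the return value should be enough to try it. The thing to compare against your code is the indexing: writing data[row * pitch + col] with a float pointer treats the byte pitch as an element count, so it steps sizeof(float) times too far per row.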

Even when you fix that, the pitched method may not give any better performance than the unpitched method. Pitched allocations were especially useful on early GPUs, but are of less significance on modern GPUs. Depending on your GPU, it's possible that the overhead associated with pitch calculations in the kernel (especially for such a simple kernel) may outweigh any benefit from pitched access (although it should not cause a ~50x performance reduction).
