cudaMallocPitch is giving inconsistent result

punit · June 27, 2008, 5:09am

// reference code
char *I_frame_ptr, *P_frame_ptr;
size_t I_frame_pitch = 0, P_frame_pitch = 0;
cudaError_t a, b;
// width is given in bytes, height in rows
a = cudaMallocPitch((void**) &I_frame_ptr, &I_frame_pitch, ROW_PIXEL * sizeof(char), COLUMN_PIXEL);
b = cudaMallocPitch((void**) &P_frame_ptr, &P_frame_pitch, ROW_PIXEL * sizeof(char), COLUMN_PIXEL);
printf("%zu\t%zu\n", I_frame_pitch, P_frame_pitch);
printf("%s\n", cudaGetErrorString(a));
printf("%s\n", cudaGetErrorString(b));
// code ends

When I define ROW_PIXEL as 640, it prints both I_frame_pitch and P_frame_pitch as 640. The moment I increase ROW_PIXEL to 656 it prints 704, and when I further change ROW_PIXEL to 720 it prints 768. Then when I set ROW_PIXEL to 1024 it prints 1024… I cannot understand what is happening here. I want it to work correctly for 720!

Sibi_A · June 27, 2008, 6:00am

There is no problem having a different pitch anyway. You can safely use the pitch returned by cudaMallocPitch to access elements in your array.

The different pitch you are seeing is probably due to the implicit padding done by cudaMallocPitch. The programming guide describes this in section 4.5.2.3.

Note: Also check the memPitch field of the cudaDeviceProp structure, which cudaGetDeviceProperties fills in during CUDA initialization.
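
As a rough illustration of using the returned pitch, the sketch below copies a tightly packed image into a padded buffer row by row. It is a host-side analogy in plain C only: the helper name copy_rows and its parameters are hypothetical, and in real CUDA code the runtime function cudaMemcpy2D performs this kind of pitched copy between host and device.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical host-side analogue of a pitched 2D copy.
   dst_pitch and src_pitch are the distances in bytes between the starts
   of consecutive rows; width_bytes is the number of useful bytes per row. */
void copy_rows(unsigned char *dst, size_t dst_pitch,
               const unsigned char *src, size_t src_pitch,
               size_t width_bytes, size_t height)
{
    /* A row can never be wider than the pitch of either buffer. */
    assert(width_bytes <= dst_pitch && width_bytes <= src_pitch);

    for (size_t row = 0; row < height; ++row) {
        /* Copy only the image bytes; padding in dst is left untouched. */
        memcpy(dst + row * dst_pitch, src + row * src_pitch, width_bytes);
    }
}
```

For a tightly packed source the source pitch simply equals the row width, while the destination pitch would be whatever value cudaMallocPitch returned.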

mandrak · June 27, 2008, 6:45am

Everything is OK except 656 -> 704… Check again, maybe you made a typo.

Pitch is the new row width chosen to satisfy alignment requirements. Your image of 720 bytes in width (I say bytes because you use sizeof(char)) must be extended to 768 bytes because the width must be a multiple of 128. In other words:

width % 128 == 0

Imagine that your 2D image is stored in a 1D memory buffer as a sequence of rows: the first row is followed by the second row, and so on.

To achieve maximum performance, the beginning of each row must be at an aligned address in that buffer. That is possible only if width % 128 == 0, which is not the case for 720 bytes, so 48 extra bytes are inserted after each row. The new padded width is returned in the Pitch variable.

Accessing the byte at coordinate (x,y) is easy:

if(x<720) YourByte = buffer[y * Pitch + x]

The condition ensures that only bytes belonging to the image are processed, not the inserted padding.

PixelSize is 1 in your case; otherwise the line would be

if(x<ImageWidth) YourByte = buffer[y * Pitch + x * PixelSize]
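
The indexing rule above can be sketched as a small C helper. The name pitched_offset and its parameters are made up for illustration; pitch is assumed to be in bytes, as returned by cudaMallocPitch, and the caller is still expected to check x against the image width to skip the padding.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: byte offset of pixel (x, y) in a pitched buffer.
   pitch is the row stride in bytes, pixel_size the bytes per pixel. */
size_t pitched_offset(size_t x, size_t y, size_t pitch, size_t pixel_size)
{
    /* The whole pixel must fit inside one row of the buffer. */
    assert((x + 1) * pixel_size <= pitch);

    /* Skip y full rows (padding included), then x pixels within the row. */
    return y * pitch + x * pixel_size;
}
```

For a 720-byte-wide image with a pitch of 768, pitched_offset(719, 2, 768, 1) gives 2 * 768 + 719 = 2255.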

kristleifur · June 27, 2008, 9:53am

I think even 704 is an OK pitch:

704 / 16 = 44

704 / (16 * 4) = 11

so it’s aligned

mandrak · June 27, 2008, 5:35pm

That is true if the CUDA alignment requirement is 64 and not 128 (I’m not sure about that; can someone confirm?).

Then it means every width where

width % 64 != 0

must be enlarged to the next larger value that is divisible by 64, and that value is returned in the Pitch variable.

In any case, Punit should now have a picture of why the pitch cannot be 720 as he expected.
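
The rounding rule described above can be written as a short C helper. This is only a sketch: round_up_pitch is a hypothetical name, and the alignment is an assumed parameter, since the actual pitch is chosen by cudaMallocPitch and may differ between devices.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: round width_bytes up to the next multiple of
   alignment. The alignment value is an assumption, not a CUDA guarantee. */
size_t round_up_pitch(size_t width_bytes, size_t alignment)
{
    assert(alignment != 0);

    size_t remainder = width_bytes % alignment;
    /* Already aligned widths are returned unchanged. */
    return remainder == 0 ? width_bytes
                          : width_bytes + (alignment - remainder);
}
```

With an assumed alignment of 64, the helper gives 640, 704, 768 and 1024 for widths of 640, 656, 720 and 1024, matching the values reported at the top of the thread. With 128 it would give 768 for a width of 656, which does not match the reported 704.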

kristleifur · June 28, 2008, 12:23pm

I’m not sure, but I think it can depend on the compute capability. The cards I’ve worked with have coalesced access to patterns of 16 * 4-byte words = 64 bytes, AFAIK. 128 may be safer for future cards or something like that.
