Dereferencing pitched memory

warpstar22 · December 9, 2020, 9:30pm

I have a CUDA array allocated with cudaMallocPitch. The array is 2D, with the following struct as its element type:

typedef struct min_max_t {
    unsigned char min;
    unsigned char max;
} min_max_t;

The array is allocated like so:

min_max_t *d_min_max_matrix;
size_t pitch;
cudaMallocPitch(&d_min_max_matrix, &pitch, 8 * sizeof(min_max_t), 8);
for (int y = 0; y < 8; ++y) {
    void *d_row = (char *) d_min_max_matrix + y * pitch;
    // min_max_matrix is a min_max_t** that is allocated on the host
    min_max_t *h_row = min_max_matrix[y];

    cudaMemcpy(d_row, h_row, 8 * sizeof(min_max_t), cudaMemcpyHostToDevice);
}

Now I understand that getting the desired row is only possible by using the pitch value:
min_max_t *d_min_max_matrix_row = (min_max_t *) ( (char *) d_min_max_matrix + y * pitch);

But I don’t understand how to get the element I need in the x direction other than by array indexing.

This works:
d_min_max_matrix_row[x];

But I can’t get this to work:
min_max_t *d_min_max = d_min_max_matrix_row + x * sizeof(min_max_t);
I’m not sure how you are supposed to dereference it.

Robert_Crovella · December 9, 2020, 9:41pm

When you do pointer arithmetic, the offset is automatically scaled by the size of the thing the pointer points to. This is a C/C++ programming concept, and not unique or specific to CUDA.

Therefore the correct construct to mimic this:

d_min_max_matrix_row[x];

is this:

min_max_t *d_min_max = d_min_max_matrix_row + x;

With those constructs, you should observe that:

*d_min_max ==  d_min_max_matrix_row[x]
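
To make the scaling concrete, here is a minimal host-side sketch in plain C with no CUDA calls. The helper name byte_offset_of_element is made up for illustration. It measures how many bytes `row + x` actually moves, which should come out as x * sizeof(min_max_t). By the same rule, `row + x * sizeof(min_max_t)` would move x * sizeof(min_max_t) elements, which overshoots whenever the struct is larger than one byte.

```c
#include <stddef.h>

typedef struct min_max_t {
    unsigned char min;
    unsigned char max;
} min_max_t;

/* Hypothetical helper: byte distance from the start of a row to
   element x, where the element address is computed with typed
   pointer arithmetic (row + x). */
size_t byte_offset_of_element(const min_max_t *row, size_t x)
{
    const min_max_t *elem = row + x;  /* same address as &row[x] */

    /* Casting to char * lets us measure the distance in bytes. */
    return (size_t)((const char *)elem - (const char *)row);
}
```

The compiler already multiplies by the element size, so multiplying by sizeof(min_max_t) again applies the scaling twice.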

warpstar22 · December 10, 2020, 11:10pm

Could you explain why the (char *) is necessary in finding the row then? It seems odd to cast to another pointer and then back to my original pointer, but that’s what I saw everyone else doing in tutorials and forums. I think that threw me off and made me think I had to do more than just d_min_max_matrix_row[x].

Robert_Crovella · December 10, 2020, 11:31pm

Because the pitch is specified in bytes, not in number of elements. To apply a byte offset when going from one row to the next, the base pointer must first be cast to a byte-type pointer such as char *. Otherwise the pointer arithmetic would scale the pitch by the element size and land on the wrong address.
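
As a rough illustration, the two steps (bytes for the row, elements for the column) can be combined in one helper. This is a host-side sketch in plain C; the name pitched_element is made up, and in real code the pitch value would be the one reported by cudaMallocPitch rather than a number you pick.

```c
#include <stddef.h>

typedef struct min_max_t {
    unsigned char min;
    unsigned char max;
} min_max_t;

/* Hypothetical helper: address of element (x, y) in a pitched 2D buffer.
   base  : start of the allocation
   pitch : row stride in BYTES */
min_max_t *pitched_element(void *base, size_t pitch, size_t y, size_t x)
{
    /* Step 1: move y rows. The pitch is in bytes, so step through a
       char * so that the offset is not scaled by sizeof(min_max_t). */
    char *row_bytes = (char *)base + y * pitch;

    /* Step 2: go back to the element type and move x elements.
       Here the scaling by sizeof(min_max_t) is exactly what we want. */
    min_max_t *row = (min_max_t *)row_bytes;
    return row + x;
}
```

The same expression is ordinary pointer arithmetic, so it should behave the same way inside a kernel on a pointer obtained from cudaMallocPitch.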
