[newbie] Copying a 2D array of chars from host to device

Hi all!

I am new to CUDA development.

I am trying to copy a 2D array of chars from the host memory to the device memory in order to perform some computation.

Unfortunately I did not manage to make it work.

Here is the code I am using:

char* table[3][3] = { {"col1", "col2", "colxxx"},
                      {"test1", "test2", "test3"},
                      {"test4", "test5", "test6"} };

// Host code
int width = 3, height = 3;
float* devPtr;
size_t pitch;

// [...]

cudaMallocPitch(&devPtr, &pitch, width * sizeof(char*), height);
cudaMemcpy2D(devPtr, pitch, table, width * sizeof(char*), width * sizeof(char*), height, cudaMemcpyHostToDevice);

MyKernel<<<1, 1>>>(devPtr, pitch, width, height);

And here is the kernel code, which should simply print the contents of the matrix:

__global__ void MyKernel(float* devPtr, size_t pitch, int width, int height)
{
    cuPrintf("Height: %d\n", height);

    for (int r = 0; r < height; ++r) {
        char** row = (char**)((char*)devPtr + r * pitch);
        for (int c = 0; c < width; ++c) {
            char* element = row[c];
            cuPrintf("%s\n", &element);
        }
    }
}

The output is the string “Height: 4” followed by 12 empty lines.

I think the problem is that I am not actually copying the strings to the device, only the pointers to them.

How can I solve this problem? Should I use a 3D array instead of a 2D one?
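
To make the 3D-array idea concrete, here is a rough sketch of one possible direction, not a definitive solution. It assumes every string fits in a fixed-width slot, so the characters themselves can be packed into one contiguous host buffer and copied with cudaMemcpy2D. The names MAX_LEN, packTable, and printTable are hypothetical and not part of the code above. The sketch also assumes a device of compute capability 2.0 or later, so that the built-in device-side printf can stand in for cuPrintf.

```cuda
#include <cstdio>
#include <cstring>
#include <vector>
#include <cuda_runtime.h>

// Assumed upper bound on string length, including the terminating '\0'.
#define MAX_LEN 16

// Pack rows * cols C strings into one contiguous host buffer, giving each
// cell a zero-padded slot of MAX_LEN bytes. Longer strings are truncated.
std::vector<char> packTable(const char* const* cells, int rows, int cols) {
    std::vector<char> packed(static_cast<size_t>(rows) * cols * MAX_LEN, '\0');
    for (int i = 0; i < rows * cols; ++i) {
        // Copy at most MAX_LEN - 1 chars so every slot stays NUL-terminated.
        std::strncpy(&packed[static_cast<size_t>(i) * MAX_LEN], cells[i], MAX_LEN - 1);
    }
    return packed;
}

// Each row of the pitched allocation holds cols slots of MAX_LEN chars.
__global__ void printTable(const char* devPtr, size_t pitch, int rows, int cols) {
    for (int r = 0; r < rows; ++r) {
        const char* row = devPtr + r * pitch;   // pitch is in bytes
        for (int c = 0; c < cols; ++c) {
            printf("%s\n", row + c * MAX_LEN);  // device-side printf
        }
    }
}

int main() {
    const char* table[3][3] = { {"col1", "col2", "colxxx"},
                                {"test1", "test2", "test3"},
                                {"test4", "test5", "test6"} };
    const int rows = 3, cols = 3;
    const size_t rowBytes = static_cast<size_t>(cols) * MAX_LEN;  // char data per row

    // Copy the characters, not the host pointers, into a flat buffer.
    std::vector<char> packed = packTable(&table[0][0], rows, cols);

    char* devPtr = nullptr;
    size_t pitch = 0;
    if (cudaMallocPitch(reinterpret_cast<void**>(&devPtr), &pitch,
                        rowBytes, rows) != cudaSuccess) return 1;
    if (cudaMemcpy2D(devPtr, pitch, packed.data(), rowBytes,
                     rowBytes, rows, cudaMemcpyHostToDevice) != cudaSuccess) {
        cudaFree(devPtr);
        return 1;
    }

    printTable<<<1, 1>>>(devPtr, pitch, rows, cols);
    cudaError_t err = cudaDeviceSynchronize();  // also flushes device printf output
    cudaFree(devPtr);
    return err == cudaSuccess ? 0 : 1;
}
```

The design choice is to trade some wasted padding for a layout with no pointers in it: the width passed to cudaMallocPitch and cudaMemcpy2D becomes cols * MAX_LEN bytes of character data rather than width * sizeof(char*) bytes of host addresses. If the strings vary a lot in length, a single packed buffer plus an array of offsets might be a better fit.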