Dynamically allocate array in device memory with varying row width.

Hi. I’m new to CUDA and I’m having a small problem.

Let’s say I have a dynamically allocated array of strings of unsigned chars like this.

unsigned char** querySeqs;
querySeqs = malloc(seqNum * sizeof(unsigned char*));
for (i = 0; i < seqNum; i++) {
  querySeqs[i] = malloc(seqLength[i] * sizeof(unsigned char));
  // ... code to get the content of querySeqs[i]
}

The seqNum strings (or sequences) in this array all have varying lengths, and are dynamically allocated accordingly so that no space is wasted. The lengths of the sequences are stored in seqLength.

Now I want to copy this array to device memory, but I haven’t found a neat way to do it. I tried something like this.

unsigned char** dvQuerySeqs;
cudaMalloc(&dvQuerySeqs, seqNum * sizeof(unsigned char*));
for (i = 0; i < seqNum; i++) {
  cudaMalloc(&(dvQuerySeqs[i]), seqLength[i] * sizeof(unsigned char));
  // ... cudaMemcpy
}

This didn’t work: dvQuerySeqs points to device memory, so host code cannot access dvQuerySeqs[i] in order to pass its address to cudaMalloc. I also tried cudaMallocPitch, cudaArray and textures, but all of them assume rectangular matrices, and the work needed to copy querySeqs into those formats correctly discouraged me. Besides, I want to keep the varying row widths for other reasons.

Can anyone give me a clue? Any hint is appreciated.

P.S. The code snippets are for illustration only. My actual code is a bit different.

Count the total length of all strings on the host, allocate one contiguous block of memory on the device, and set up the pointers to it manually.
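
A minimal sketch of this suggestion, not a definitive implementation. The helper names packSequences, uploadPacked and rowChecksum are hypothetical, seqLength is assumed to be an array of int, and seqNum and the total length are assumed to be greater than zero. The row pointers are computed on the host by address arithmetic on the device base pointer; the host never dereferences them.

```cuda
#include <cuda_runtime.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical helper: packs all rows into one malloc'd host buffer.
// offsets must have seqNum + 1 entries; offsets[i] is where row i starts
// and offsets[seqNum] is the total length. Caller frees the result.
unsigned char* packSequences(unsigned char* const* querySeqs, const int* seqLength,
                             int seqNum, size_t* offsets)
{
    offsets[0] = 0;
    for (int i = 0; i < seqNum; i++)
        offsets[i + 1] = offsets[i] + (size_t)seqLength[i];

    unsigned char* flat = (unsigned char*)malloc(offsets[seqNum]);
    if (flat == NULL) return NULL;
    for (int i = 0; i < seqNum; i++)
        memcpy(flat + offsets[i], querySeqs[i], (size_t)seqLength[i]);
    return flat;
}

// Hypothetical helper: one device block for the data, plus a device-side
// table of pointers into that block. Caller cudaFree's both on success.
cudaError_t uploadPacked(const unsigned char* flat, const size_t* offsets, int seqNum,
                         unsigned char** dvFlatOut, unsigned char*** dvRowsOut)
{
    size_t total = offsets[seqNum];
    unsigned char* dvFlat = NULL;
    unsigned char** dvRows = NULL;
    unsigned char** hostRows = (unsigned char**)malloc(seqNum * sizeof(unsigned char*));
    if (hostRows == NULL) return cudaErrorMemoryAllocation;

    cudaError_t err = cudaMalloc((void**)&dvFlat, total);
    if (err == cudaSuccess)
        err = cudaMemcpy(dvFlat, flat, total, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        // Address arithmetic only: these are device addresses held on the host.
        for (int i = 0; i < seqNum; i++)
            hostRows[i] = dvFlat + offsets[i];
        err = cudaMalloc((void**)&dvRows, seqNum * sizeof(unsigned char*));
    }
    if (err == cudaSuccess)
        err = cudaMemcpy(dvRows, hostRows, seqNum * sizeof(unsigned char*),
                         cudaMemcpyHostToDevice);
    free(hostRows);

    if (err != cudaSuccess) {
        cudaFree(dvFlat);  // cudaFree(NULL) is a no-op
        cudaFree(dvRows);
        return err;
    }
    *dvFlatOut = dvFlat;
    *dvRowsOut = dvRows;
    return cudaSuccess;
}

// Illustrative kernel: one thread per sequence, summing the bytes of its row.
__global__ void rowChecksum(unsigned char* const* dvRows, const int* dvSeqLength,
                            int seqNum, unsigned int* out)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= seqNum) return;
    unsigned int sum = 0;
    for (int k = 0; k < dvSeqLength[i]; k++)
        sum += dvRows[i][k];
    out[i] = sum;
}
```

With this layout the whole data set needs only two device allocations and two copies, and kernels can still index dvRows[i][k] as in the host code. The kernel also needs seqLength on the device, which would be copied with its own cudaMalloc and cudaMemcpy.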

I hadn’t thought of this; I was wondering whether there was a “formal way”. Many thanks for the valuable advice.
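
For comparison, the per-row attempt from the question can be made to work by keeping the row pointers in a host array first and copying that array to the device afterwards. This is only a sketch: uploadRagged and hostRowPtrs are hypothetical names, seqLength is assumed to be an array of int, and every length is assumed to be greater than zero. It needs one cudaMalloc and one cudaMemcpy per sequence, which is usually slower than the single contiguous block suggested above.

```cuda
#include <cuda_runtime.h>
#include <stdlib.h>

// Hypothetical helper: allocates each row on the device separately and
// returns a device-side table of row pointers (NULL on failure).
// hostRowPtrs is caller-allocated with seqNum entries; it receives the
// device row pointers so the caller can cudaFree each row later.
// On failure, rows already allocated are left in hostRowPtrs and are
// not freed here (kept short for the sketch).
unsigned char** uploadRagged(unsigned char* const* querySeqs, const int* seqLength,
                             int seqNum, unsigned char** hostRowPtrs)
{
    for (int i = 0; i < seqNum; i++) {
        size_t bytes = (size_t)seqLength[i] * sizeof(unsigned char);
        // &hostRowPtrs[i] is a host address, so cudaMalloc can write to it.
        if (cudaMalloc((void**)&hostRowPtrs[i], bytes) != cudaSuccess)
            return NULL;
        if (cudaMemcpy(hostRowPtrs[i], querySeqs[i], bytes,
                       cudaMemcpyHostToDevice) != cudaSuccess)
            return NULL;
    }

    // Only now copy the table of device pointers into device memory.
    unsigned char** dvQuerySeqs = NULL;
    size_t tableBytes = (size_t)seqNum * sizeof(unsigned char*);
    if (cudaMalloc((void**)&dvQuerySeqs, tableBytes) != cudaSuccess)
        return NULL;
    if (cudaMemcpy(dvQuerySeqs, hostRowPtrs, tableBytes,
                   cudaMemcpyHostToDevice) != cudaSuccess) {
        cudaFree(dvQuerySeqs);
        return NULL;
    }
    return dvQuerySeqs;
}
```

The returned dvQuerySeqs can be passed to a kernel as unsigned char**, while hostRowPtrs stays on the host for cleanup.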