Is copying an array of character strings to device memory absolutely impossible?

I've been battling this issue for over 3 days now.
I have an array of character strings defined as:

char *a[3];
a[0]="foo1";
a[1]="foo1";
a[2]="foo2";

I need to copy this to device. Can somebody tell me how?

Here's what I tried so far:

  1. cudaMalloc((void**)&dev_array[0], 5*sizeof(char));
    cudaMemcpy(dev_array[0], a[0], 5*sizeof(char), cudaMemcpyHostToDevice);
    // Subsequently do this for a[1], a[2], … and so on.

This works, but as you can see, I've had to explicitly send each character string one at a time. If I have char *a[1000], this is obviously impractical. Calling the kernel with that many pointer arguments is also impossible. Is there a way to do this?

Somebody please reply to this. For some odd reason no one ever replies to my posts… am I that big a noob? Please help!

You might get responses if you post in a more appropriate forum, like the programming and development forum.

But the short answer to your question is, yes, that is roughly how the memcpy will have to work. If you have an array of strings, you will have to copy each string over to the GPU individually. An alternative would be to use the C++ “STL like” container classes from the Thrust library, which will hide most of that tedious memory management in a neat template class. Another might be to compact the array of strings into a stream yourself on the host CPU, copy it to the GPU and then unpack the stream back into something your algorithm can work with using a kernel on the GPU side.
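To make the "compact into a stream" idea concrete, here is a minimal host-side sketch. The names `PackedStrings` and `pack_strings` are made up for illustration, and the offsets table is one assumed way of letting the GPU side find each string again; it is not the only possible layout. The CUDA calls appear only as comments, so the block compiles as plain C++.

```cpp
#include <cassert>
#include <cstring>
#include <vector>

// Hypothetical helper type: every string back to back, plus where each one starts.
struct PackedStrings {
    std::vector<char> chars;    // e.g. "foo1\0foo1\0foo2\0"
    std::vector<int>  offsets;  // offsets[i] = index in chars where string i begins
};

// Pack n NUL-terminated strings into one contiguous buffer.
PackedStrings pack_strings(const char* const* strs, int n) {
    assert(strs != nullptr && n >= 0);
    PackedStrings p;
    p.offsets.reserve(n);
    for (int i = 0; i < n; ++i) {
        p.offsets.push_back(static_cast<int>(p.chars.size()));
        // Copy the characters and the terminating '\0'.
        p.chars.insert(p.chars.end(), strs[i], strs[i] + std::strlen(strs[i]) + 1);
    }
    return p;
}

// Upload sketch (CUDA runtime calls, not compiled here):
//   char* d_chars; int* d_offsets;
//   cudaMalloc((void**)&d_chars, p.chars.size());
//   cudaMalloc((void**)&d_offsets, p.offsets.size() * sizeof(int));
//   cudaMemcpy(d_chars, p.chars.data(), p.chars.size(), cudaMemcpyHostToDevice);
//   cudaMemcpy(d_offsets, p.offsets.data(), p.offsets.size() * sizeof(int),
//              cudaMemcpyHostToDevice);
```

With this layout the number of copies stays at two no matter how many strings there are, and the kernel only needs the two device pointers and the string count as arguments.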

Oh, I'm sorry… I didn't know this forum did not deal with programming issues. What do you mean by stream here? The thing is, I don't mind sending the strings individually; that's totally possible using a for loop. But how do I call my kernel then? I definitely can't use the same method for the kernel invocation, right?

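I think it is possible to use only one memcpy call. You just need to use the proper size. Instead of 5*sizeof(char), it should be however many chars you have for all the strings together. In your example it would be 5*3*sizeof(char). One caveat: a single copy only works if the strings sit back to back in one host buffer, and separately assigned string literals are not guaranteed to be laid out that way, so pack them first. Just do something like this (I didn't test it):

// needs <string.h> and <stdlib.h>
const int n = 3; // this is whatever you need it to be
char *a[n];
// fill in the arrays somehow
// then:
size_t numchars = 0;
for (int i = 0; i < n; i++) {
    numchars += strlen(a[i]) + 1; // +1 is for the string termination character
}
size_t numBytes = numchars * sizeof(char);
// pack all strings back to back into one host buffer
char *host_buf = (char *)malloc(numBytes);
size_t pos = 0;
for (int i = 0; i < n; i++) {
    size_t len = strlen(a[i]) + 1;
    memcpy(host_buf + pos, a[i], len);
    pos += len;
}
char *dev_array; // single device buffer holding every string
cudaMalloc((void **)&dev_array, numBytes);
cudaMemcpy(dev_array, host_buf, numBytes, cudaMemcpyHostToDevice);
free(host_buf);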
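
As for calling the kernel: once everything is in one device buffer, the kernel only needs that one pointer plus some way to find where each string starts. The sketch below assumes an extra `offsets` array (start index of each string, built on the host while packing), which is not part of the code above. The function names are made up, and the per-thread work is written as a plain function so it can be tried on the host; in a .cu file it would be the body of a `__global__` kernel.

```cpp
#include <cassert>
#include <cstddef>

// Per-thread work: compute the length of string i inside the packed buffer.
// In a real kernel, i would come from
//   int i = blockIdx.x * blockDim.x + threadIdx.x;
// and chars / offsets / out_len would be device pointers.
void string_length_at(const char* chars, const int* offsets, int n, int i, int* out_len) {
    if (i >= n) return;                  // guard threads past the last string
    const char* s = chars + offsets[i];  // start of string i
    int len = 0;
    while (s[len] != '\0') ++len;        // count characters up to the terminator
    out_len[i] = len;
}

// Host stand-in for a launch such as
//   kernel<<<blocks, threads>>>(d_chars, d_offsets, n, d_out);
void run_all(const char* chars, const int* offsets, int n, int* out_len) {
    assert(chars != nullptr && offsets != nullptr && out_len != nullptr);
    for (int i = 0; i < n; ++i) {
        string_length_at(chars, offsets, n, i, out_len);
    }
}
```

The argument list stays the same whether there are 3 strings or 1000, which is the point: the count travels as `n`, not as extra pointer arguments.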