cudaMemcpy segmentation fault when copying an array of strings

Hi,

I am a newbie to CUDA C/C++. I am trying to copy an array of strings to the device and then copy it back to the host, using the code below. I am getting a segmentation fault that I am not able to decipher. Here is the code:

char *newdict[dict.size()];   /* dict is a conventional string vector containing words from a file. */
char *device_dict[dict.size()];

for(int i=0;i<dict.size();i++)
{
        cudaMalloc((void**)&device_dict[i],strlen(str[i])*sizeof(char));
}
cout<<"Device memory allocated-->"<<endl;

for(int i=0;i<dict.size();i++)
{
        cudaMemcpy(device_dict[i],str[i],strlen(str[i])*sizeof(char),cudaMemcpyHostToDevice);
}
cout<<"Copied to Device!!"<<endl;

//for(int i=0;i<dict.size();i++)
//{
        cudaMemcpy(newdict[i],device_dict[i],strlen(str[i])*sizeof(char),cudaMemcpyDeviceToHost);
//}
cout<<"Copied back to Host!!"<<endl;
cout<<endl;
cout<<newdict[1]<<newdict[2];

for(int i=0;i<dict.size();i++)
{
        free(str[i]);
        free(newdict[i]);
        cudaFree(device_dict[i]);
}
cout<<endl<<"Freed"<<endl;

I am not sure where I am going wrong. I think calling cudaMemcpy inside a for loop is causing the problem, but I would like an expert opinion on why this shouldn't be done and what the alternative is for such transfers from host to device and vice versa.

Any help will be appreciated!

Thanks in advance.

Hmmm, I’m still new to CUDA too. But at first glance, there is something really wrong with your code.
You call cudaMalloc inside a for loop. That’s tremendously risky.
OK, so device_dict is an array of char located on the GPU. Its counterpart, newdict, stays on the CPU.
However, inside your loop you call cudaMalloc over and over again, once for every single element of device_dict.
Remember! As the argument to cudaMalloc, you pass the POINTER to each ELEMENT (char) of it. And you allocate an INCREASING amount, strlen of str[i]. As i increases towards dict.size(), strlen also increases. This represents the length of the string, right?
But what you did is try to put these STRINGs (each of which contains many chars) into a single element of the array (and that single element can only hold 1 char).

That’s what I think about the error… Keep debugging bro…

darwinxie:
There is nothing risky about a cudaMalloc() inside a loop (inefficient, yes, but not risky). And representing strings as arrays of char is basic C usage; I’d recommend consulting a textbook on C in case of any doubts.

abhinole:
I’ve already posted a comment in your other thread.
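
On the inefficiency of one cudaMalloc() per string mentioned above, here is a minimal sketch of one possible alternative: pack all the words into a single contiguous buffer (each followed by its '\0') plus an array of start offsets, so that one cudaMalloc() and one cudaMemcpy() in each direction cover the whole dictionary. The names PackedWords, packWords and roundTrip are hypothetical and not from the posted code. The sketch also allocates the host destination before copying back. The posted snippet never shows newdict[i] being allocated, so that may be where the segfault comes from, but that is a guess since the full program isn't shown. The CUDA calls are only compiled under nvcc; with a plain C++ compiler a memcpy stands in for them so the host-side logic can still be checked.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>
#ifdef __CUDACC__
#include <cuda_runtime.h>
#endif

// Hypothetical helper type: every word stored back to back, each followed by '\0'.
struct PackedWords {
    std::vector<char> data;            // word0 '\0' word1 '\0' ...
    std::vector<std::size_t> offsets;  // offsets[i] = start of word i inside data
};

// Flatten the dictionary into one buffer so a single transfer is enough.
PackedWords packWords(const std::vector<std::string>& dict) {
    PackedWords p;
    for (const std::string& w : dict) {
        p.offsets.push_back(p.data.size());
        p.data.insert(p.data.end(), w.begin(), w.end());
        p.data.push_back('\0');  // keep the terminator (strlen() alone would drop it)
    }
    return p;
}

// Copy the packed buffer to the device and back again.
// Returns the bytes that came back, or an empty vector if a CUDA call failed.
std::vector<char> roundTrip(const PackedWords& p) {
    std::vector<char> back(p.data.size());  // host destination exists BEFORE the copy
    if (p.data.empty()) return back;
#ifdef __CUDACC__
    char* d_data = nullptr;
    if (cudaMalloc((void**)&d_data, p.data.size()) != cudaSuccess) return {};
    bool ok =
        cudaMemcpy(d_data, p.data.data(), p.data.size(),
                   cudaMemcpyHostToDevice) == cudaSuccess &&
        cudaMemcpy(back.data(), d_data, back.size(),
                   cudaMemcpyDeviceToHost) == cudaSuccess;
    cudaFree(d_data);
    if (!ok) return {};
#else
    // Assumption: no GPU toolchain available, so a host memcpy stands in.
    std::memcpy(back.data(), p.data.data(), p.data.size());
#endif
    return back;
}
```

After the round trip, word i is the C string starting at back.data() + p.offsets[i]. A kernel would need the offsets array copied to the device in the same way (one more cudaMalloc()/cudaMemcpy() pair) to find where each word starts.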