CUDA 7.5: destination array doesn't keep the structures written by the kernel

I have this structure:

struct Information {
    int number;
    char name[10];
    int age;
    double points;
};

And these arrays. The three input arrays hold 11 values each, and dstArray has room for 20 elements:

char inputChar[11][10];
int inputInt[11] = {};
double inputDouble[11] = {};
Information dstArray[20] = {};

What I need to do is create 11 threads, each with its own number in the range 0-10. Each thread has to build an Information element from the i-th elements of the input arrays. For example, the thread with number 4 would create an Information element with these values:

number = 4
name = inputChar[4]
age = inputInt[4]
points = inputDouble[4]

Each thread then adds its element to the end of dstArray. At the end, I need to print the array to a file. The purpose is to see how this works and how the output differs between runs. I have a few problems. First of all, this is my main() method:

int main(int argc, char** argv) {
    readFromFile("data.txt");
    printDataFile("results.txt");
    runThreads();
    printDstArray("results.txt");
}

readFromFile() reads the data from the text file and writes it to the data arrays above. printDataFile() prints the data from those arrays to the results text file. runThreads() should run the threads and fill dstArray in a different order each time. printDstArray() prints that shuffled array to the same results.txt file. I have done the same task with OpenMP, so I’m sure that readFromFile(), printDataFile() and printDstArray() work fine.

And this is my runThreads() method:

void runThreads() {
    int *intPointer = 0;
    double *dblPointer = 0;
    char *charPointer;
    Information *dstArrayPointer;

    cudaError_t cudaStatus;

    // Allocates the memory 
    cudaMalloc((void**)&charPointer, sizeof(char) * 10 * 11);
    cudaMalloc((void**)&intPointer, sizeof(int) * 11);
    cudaMalloc((void**)&dblPointer, sizeof(double) * 11);
    cudaMalloc((void**)&dstArrayPointer, sizeof(Information) * 20);

    // Copies from CPU to GPU
    cudaMemcpy(charPointer, inputChar, sizeof(char) * 10 * 11, cudaMemcpyHostToDevice);
    cudaMemcpy(intPointer, inputInt, sizeof(int) * 11, cudaMemcpyHostToDevice);
    cudaMemcpy(dblPointer, inputDouble, sizeof(double) * 11, cudaMemcpyHostToDevice);
    cudaMemcpy(dstArrayPointer, dstArray, sizeof(Information) * 20, cudaMemcpyHostToDevice);


    // Threads
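    runGPUThreads<<<1, 11>>>(charPointer, intPointer, dblPointer, dstArrayPointer);
    // Waits for the kernel to finish and picks up any error it raised
    cudaError_t cudaerr = cudaDeviceSynchronize();

    // cudaSuccess is the runtime API constant (CUDA_SUCCESS belongs to the driver API)
    if (cudaerr != cudaSuccess) {
        printf("kernel launch failed with error \"%s\".\n",
               cudaGetErrorString(cudaerr));
    }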

    // Copies dstArray from GPU to CPU
    cudaMemcpy(dstArray, dstArrayPointer, sizeof(Information) * 20, cudaMemcpyDeviceToHost);

    // Frees the memory
    cudaFree(charPointer);
    cudaFree(intPointer);
    cudaFree(dblPointer);
    cudaFree(dstArrayPointer);
}

It allocates device memory, copies the arrays to the GPU, launches the runGPUThreads() kernel with 11 threads, copies the destination array back from the GPU and frees the memory.

__global__ void runGPUThreads(char *str, int *integer, double *dbl, Information *dst) {
    int nr = threadIdx.x;
    Information info;
    info.number = nr + 1;
    int j = 0;
    for (int i = nr * 10; j < 10; i++) {
        info.name[j] = str[i];
        j++;
    }
    info.age = integer[nr];
    info.points = dbl[nr];

    for (int i = 0; i < 20; i++) {
        if (dst[i].number < 1) {
            dst[i] = info;
            break;
        }
    }
}

First it gets the thread’s number, then it creates a temporary Information object and fills it from the input arrays (number is set to nr + 1, so that a value below 1 can mark an empty slot). Finally, it looks for the first element of the destination array whose number is less than 1 (which I take to mean that the element at that index is empty), stores the Information structure at that index and breaks out of the loop.
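To make the name-copy loop concrete, here is a minimal host-side sketch of the indexing the kernel relies on: inputChar[11][10] is copied as 110 contiguous bytes, so row nr starts at offset nr * 10. The names kRows, kNameLen and copyName are made up for this illustration and are not part of my program.

```cpp
#include <cassert>
#include <cstring>

constexpr int kRows = 11;     // number of input records, as in the question
constexpr int kNameLen = 10;  // bytes per name, as in the question

// Hypothetical helper: copies row `nr` out of a flattened
// [kRows][kNameLen] buffer, walking flat[nr * kNameLen + j] for j = 0..9,
// which is the same arithmetic the kernel applies to `str`.
void copyName(const char* flat, int nr, char* out) {
    for (int j = 0; j < kNameLen; ++j) {
        out[j] = flat[nr * kNameLen + j];
    }
}
```

For example, after memcpy-ing the 2D table into a flat char buffer of kRows * kNameLen bytes (the host-side counterpart of the cudaMemcpy call), copyName(flat, 4, out) yields the same 10 bytes as inputChar[4]. One caveat: a name that uses all 10 bytes has no terminating '\0', so printing it with %s would read past the field.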

The problem is that every thread stores its Information struct at index 0. My guess is that the array doesn’t keep the value after the kernel function ends.
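One possible explanation, which is an assumption rather than something confirmed on the device, is a race: if all 11 threads run the scan before any of them has written, every thread sees dst[0].number < 1 and picks index 0. The sketch below models that on the host with a plain sequential loop, and contrasts it with reserving a unique index from a shared counter. The function names are hypothetical. In a real kernel the reservation would have to be atomic, for example int slot = atomicAdd(counter, 1); on an extra int in device memory.

```cpp
#include <cassert>

struct Information {
    int number;
    char name[10];
    int age;
    double points;
};

constexpr int kThreads = 11;  // threads in the launch, as in the question
constexpr int kSlots = 20;    // size of dstArray, as in the question

// Counts the slots whose number marks them as filled (number >= 1).
int countFilled(const Information* dst) {
    int filled = 0;
    for (int i = 0; i < kSlots; ++i) {
        if (dst[i].number >= 1) ++filled;
    }
    return filled;
}

// Models "scan for the first empty slot, then write" under the assumption
// that every thread finishes its scan before any thread writes.
int lockstepScanThenWrite(Information* dst) {
    int chosen[kThreads];
    // Step 1: every thread scans; nothing has been written yet.
    for (int nr = 0; nr < kThreads; ++nr) {
        chosen[nr] = -1;
        for (int i = 0; i < kSlots; ++i) {
            if (dst[i].number < 1) { chosen[nr] = i; break; }
        }
    }
    // Step 2: every thread writes into the slot it picked in step 1.
    for (int nr = 0; nr < kThreads; ++nr) {
        if (chosen[nr] >= 0) dst[chosen[nr]].number = nr + 1;
    }
    return countFilled(dst);
}

// Alternative: each thread reserves its own index before writing.
// The counter is a plain int only because this model is sequential.
int reserveThenWrite(Information* dst) {
    int counter = 0;
    for (int nr = 0; nr < kThreads; ++nr) {
        int slot = counter++;  // would be atomicAdd(counter, 1) in a kernel
        if (slot < kSlots) dst[slot].number = nr + 1;
    }
    return countFilled(dst);
}
```

In this model the scan-then-write version fills only index 0, while the reservation version fills indices 0 to 10. Which thread's data survives at index 0 on a real GPU is not something this sketch can predict.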

I’m 100 percent sure that the Information structure I create in runGPUThreads() holds the right data, because I tried printing it with printf().

Where could the problem be?