Problem with the "vector sum" example in the book "CUDA by Example"


I am trying to learn CUDA and am reading one of the recommended books, “CUDA by Example” by Sanders and Kandrot. Chapter 5 contains a vector addition example. Basically, the program takes two vectors of a given length and computes their element-wise sum.

The example works fine as is, but if I increase the number of elements in the vectors, the program crashes.

The following works fine:

#define NUMBER_OF_ELEMENT (64*1024)

The following will crash:

#define NUMBER_OF_ELEMENT (128*1024)

I am also attaching the complete code.

I am running the program on an XPS 15 with a 525M GPU (1 GB of memory), so I don't think it is limited by memory size. Also, the example is designed to work with large vectors. So, what is wrong?

Thanks a lot for your time!

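The sample program allocates the host arrays on the stack rather than the heap, so the crash is probably a stack overflow. With 4-byte ints, three arrays of 64*1024 elements take 768 KB, while three arrays of 128*1024 elements take 1.5 MB. That exceeds the 1 MB default stack that MSVC gives a program on Windows (assuming that is your toolchain). Try replacing the stack declarations of a, b, and c with something like this: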

int* a = new int[NUMBER_OF_ELEMENT];
int* b = new int[NUMBER_OF_ELEMENT];
int* c = new int[NUMBER_OF_ELEMENT];

(rest of the code)

delete [] a;
delete [] b;
delete [] c;