For my application I need to pass a very large float array to the device. What is the maximum number of float elements that I can copy?
I know that the parameter list of a kernel is currently limited to 256 bytes.
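unsigned int array_size = <very_large_number>;
float *h_array1 = (float*)malloc(sizeof(float) * array_size);
// Initialize h_array1 here
.. .. ..
float *d_array1;
// Allocate device memory first; sizes are given in bytes, not elements
CUDA_SAFE_CALL(cudaMalloc((void**)&d_array1, sizeof(float) * array_size));
// Copy host -> device: destination pointer first, then source pointer
CUDA_SAFE_CALL(cudaMemcpy(d_array1, h_array1, sizeof(float) * array_size, cudaMemcpyHostToDevice));
// Execute kernel etc.
.. ..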
What can be the maximum value of array_size above?
Secondly, is there a way to determine this number at runtime, depending on the type of card and its DRAM size?
Any help on this would be appreciated.
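A minimal sketch of such a runtime query, assuming the CUDA runtime API calls cudaGetDeviceProperties and cudaMemGetInfo are available in the installed toolkit. The helper max_float_elements and the choice of device 0 are illustrative assumptions, not part of the original code, and the result is only an upper bound on what a single allocation can hold:

```cuda
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: how many floats fit into `bytes` bytes (rounds down).
static size_t max_float_elements(size_t bytes) {
    return bytes / sizeof(float);
}

int main() {
    const int device = 0;  // assumption: query the first CUDA device

    if (cudaSetDevice(device) != cudaSuccess) {
        fprintf(stderr, "could not select device %d\n", device);
        return 1;
    }

    // Static properties of the card, including its total global memory (DRAM).
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, device) != cudaSuccess) {
        fprintf(stderr, "could not read device properties\n");
        return 1;
    }

    // Memory that is free right now versus the total on the device.
    size_t free_bytes = 0, total_bytes = 0;
    if (cudaMemGetInfo(&free_bytes, &total_bytes) != cudaSuccess) {
        fprintf(stderr, "could not query device memory\n");
        return 1;
    }
    assert(free_bytes <= total_bytes);  // sanity check on the reported values

    printf("device:            %s\n", prop.name);
    printf("total global mem:  %zu bytes\n", (size_t)prop.totalGlobalMem);
    printf("currently free:    %zu bytes\n", free_bytes);
    printf("upper bound:       %zu floats\n", max_float_elements(free_bytes));
    return 0;
}
```

The printed bound is optimistic: a single cudaMalloc of all the free memory can still fail (for example because of fragmentation or memory reserved by the driver or display), so the return value of cudaMalloc should be checked in any case.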