Directly pass in an array as a kernel input parameter?

This is a newbie question. The following code printed correct results in emulation mode and produced no complaints in regular mode. Basically, it passes a host array directly as the kernel function’s input parameter.

My question is: if this works, why use cudaMemcpy? If it does not, what are the limitations? Thanks.

[codebox]#include <stdio.h>

__global__ void kernel (float *a ){

    //printf("value a is %f \n",a[1]);
    //printf("value b is %f \n",a[0]);

}


int main(){

    float a[2];

    a[0] = 12 ;
    a[1] = 1 ;

    kernel<<<1,10>>>(a);

}

[/codebox]

This works in emulation mode because everything runs on the host, so a host pointer is still valid inside the kernel; emulation is not an accurate hardware simulation. In regular mode, it only appears to work because the kernel never does anything with the pointer. If you actually tried reading through the pointer on the device, you would probably get an “unspecified launch failure” or some other generic error.
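For comparison, here is a minimal sketch of the usual pattern, assuming a device with its own memory space: allocate a device buffer with cudaMalloc, copy the host array into it with cudaMemcpy, launch the kernel on the device pointer, and copy the result back. The names `scale_kernel` and `scale_on_device` are made up for this illustration and are not from the original post, and the doubling operation is just a placeholder so the kernel actually touches the data.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Doubles each element in place. d_a must point to device memory.
__global__ void scale_kernel(float *d_a, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {  // the launch uses more threads than elements
        d_a[i] *= 2.0f;
    }
}

// Hypothetical helper: copies a host array to the device, runs the kernel,
// and copies the result back. Returns the first CUDA error encountered.
cudaError_t scale_on_device(float *h_a, int n) {
    float *d_a = NULL;
    size_t bytes = n * sizeof(float);

    cudaError_t err = cudaMalloc((void **)&d_a, bytes);
    if (err != cudaSuccess) return err;

    err = cudaMemcpy(d_a, h_a, bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        scale_kernel<<<1, 10>>>(d_a, n);  // pass the device pointer, not h_a
        err = cudaGetLastError();         // catches launch-time errors
    }
    if (err == cudaSuccess) {
        // This blocking copy waits for the kernel to finish, so errors
        // raised during kernel execution are reported here as well.
        err = cudaMemcpy(h_a, d_a, bytes, cudaMemcpyDeviceToHost);
    }
    cudaFree(d_a);
    return err;
}

int main() {
    float a[2] = {12.0f, 1.0f};

    cudaError_t err = scale_on_device(a, 2);
    if (err != cudaSuccess) {
        printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("a[0] = %f, a[1] = %f\n", a[0], a[1]);
    return 0;
}
```

Checking the return codes is what makes the failure visible: if you pass the host pointer `a` to the kernel instead of `d_a` and dereference it, the error would typically surface at the next checked call after the launch rather than at the launch line itself.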