This is what I want to do:
- declare a device variable:
__device__ float a;
- write a kernel with the last value to be returned:
__global__ void kernel()
{
…
a = something;
}
- in the main calling program:
int main()
{
//declare a host variable
float b;
.
.
.
kernel<<<…>>>();
// try to get the variable back to host memory
cudaMemcpy(b, a, sizeof(float), cudaMemcpyDeviceToHost); → this does not work
cudaMemcpyFromSymbol(b, a, sizeof(float), 0, cudaMemcpyDeviceToHost); → this does not work
b = a; → this does not work
}
What else can I try to get the value of a back into b on the host?
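
For reference, here is a minimal sketch of the pattern described above with the symbol copy written the way the CUDA runtime expects it. It is only an illustration under some assumptions: the value 42.0f, the <<<1, 1>>> launch configuration, and the helper name fetchResult are hypothetical placeholders, not part of the original program. The difference from the attempts listed above is that the destination is passed as a host pointer (&b rather than b), while the device symbol a is passed directly.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Device-side global variable, as in the post.
__device__ float a;

// Hypothetical kernel: stores a placeholder value in `a`.
__global__ void kernel()
{
    a = 42.0f;  // stand-in for the real computation
}

// Hypothetical helper: launches the kernel and copies `a` into *out.
// Returns true if both the launch and the copy succeeded.
bool fetchResult(float *out)
{
    kernel<<<1, 1>>>();  // assumed launch configuration
    if (cudaGetLastError() != cudaSuccess) {
        return false;
    }

    // Destination is a host pointer; the symbol itself is the source.
    // The copy is issued on the default stream, so it runs after the kernel.
    cudaError_t err = cudaMemcpyFromSymbol(out, a, sizeof(float), 0,
                                           cudaMemcpyDeviceToHost);
    return err == cudaSuccess;
}

int main()
{
    float b = 0.0f;  // host variable
    if (!fetchResult(&b)) {
        fprintf(stderr, "copy from device symbol failed\n");
        return 1;
    }
    printf("b = %f\n", b);
    return 0;
}
```

A plain cudaMemcpy can also be used, but only after the device address of a has been obtained with cudaGetSymbolAddress; the name a cannot be passed to cudaMemcpy as if it were a device pointer, and b = a cannot work in host code because a lives in device memory.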