Hi there,
I have four arrays in device memory:
x_max
x_min
y_max
y_min
The first element of each one holds an integer value:
x_max[0]
x_min[0]
y_max[0]
y_min[0]
Now I want to call cudaMalloc like this:
cudaMalloc((void**) &picture, (x_max[0]-x_min[0])*(y_max[0]-y_min[0])*sizeof(float));
but that won't work, because I cannot access device memory from the host like this.
How would you guys do that?
Thanks in advance for tips!
Julian
EDIT: I can copy the four values to the host like this:
cudaMemcpy( h_out_x_max, d_max_x, sizeof(float), cudaMemcpyDeviceToHost );
cudaMemcpy( h_out_x_min, d_min_x, sizeof(float), cudaMemcpyDeviceToHost );
cudaMemcpy( h_out_y_max, d_max_y, sizeof(float), cudaMemcpyDeviceToHost );
cudaMemcpy( h_out_y_min, d_min_y, sizeof(float), cudaMemcpyDeviceToHost );
I would like to avoid these copies, because they take 59ns for the four floats! That's too much in my opinion.
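For reference, here is a minimal sketch of the whole workaround: copy the four values back, then size the allocation on the host. It assumes the values are stored as float, as in the cudaMemcpy calls above, and that each max is not smaller than its min. The names CHECK and allocPicture are hypothetical helpers, and the numbers in main are made-up test values, not my real data.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: abort with a message if a CUDA runtime call fails.
#define CHECK(call)                                               \
    do {                                                          \
        cudaError_t err = (call);                                 \
        if (err != cudaSuccess) {                                 \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,    \
                    cudaGetErrorString(err));                     \
            exit(EXIT_FAILURE);                                   \
        }                                                         \
    } while (0)

// Hypothetical helper: reads the first element of each device array and
// allocates (x_max - x_min) * (y_max - y_min) floats on the device.
// Assumes max >= min for both axes.
float* allocPicture(const float* d_max_x, const float* d_min_x,
                    const float* d_max_y, const float* d_min_y)
{
    float x_max, x_min, y_max, y_min;

    // Four small blocking device-to-host copies, as in the EDIT above.
    CHECK(cudaMemcpy(&x_max, d_max_x, sizeof(float), cudaMemcpyDeviceToHost));
    CHECK(cudaMemcpy(&x_min, d_min_x, sizeof(float), cudaMemcpyDeviceToHost));
    CHECK(cudaMemcpy(&y_max, d_max_y, sizeof(float), cudaMemcpyDeviceToHost));
    CHECK(cudaMemcpy(&y_min, d_min_y, sizeof(float), cudaMemcpyDeviceToHost));

    // The size is now known on the host, so cudaMalloc can be called.
    size_t width  = (size_t)(x_max - x_min);
    size_t height = (size_t)(y_max - y_min);

    float* picture = NULL;
    CHECK(cudaMalloc((void**)&picture, width * height * sizeof(float)));
    return picture;
}

int main()
{
    // Made-up extrema, stored contiguously only to keep the example short:
    // x_max, x_min, y_max, y_min.
    const float h_vals[4] = {640.0f, 0.0f, 480.0f, 0.0f};

    float* d_vals = NULL;
    CHECK(cudaMalloc((void**)&d_vals, sizeof(h_vals)));
    CHECK(cudaMemcpy(d_vals, h_vals, sizeof(h_vals), cudaMemcpyHostToDevice));

    float* picture = allocPicture(d_vals, d_vals + 1, d_vals + 2, d_vals + 3);

    CHECK(cudaFree(picture));
    CHECK(cudaFree(d_vals));
    return 0;
}
```

This works, but it still needs the four blocking copies before the allocation, which is exactly the cost I would like to get rid of.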