cudaMemPrefetchAsync() for a multi-dimensional array

Dear All,

Is it possible to pass a multi-dimensional array as the first argument of cudaMemPrefetchAsync()? If so, what value should I pass as the second argument?

Here is the test code.

integer(acc_handle_kind) :: stream
real(kind=8),dimension(:,:),allocatable :: A

allocate(A(4096,4096))
A = 1.0d0

stream = acc_get_cuda_stream(acc_async_sync)

!$acc host_data use_device(A)
call cudaMemPrefetchAsync(A,4096*4096, 0, stream)
!$acc end host_data

deallocate(A)


Hi KOUCHI_Hiroyuki,

Yes, so long as you're passing in a contiguous slice (which you are, since it’s the whole 2D array), this should be fine. And yes, you would use “4096*4096” for the count, since in CUDA Fortran the count is the total number of elements, not bytes.
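
For illustration, here is a minimal stand-alone sketch of the same call. It assumes the array is declared with the CUDA Fortran "managed" attribute and that the code is built with nvfortran with CUDA Fortran enabled (for example, -cuda). Neither assumption comes from the test code above. The program name, the separately created stream, and the destination device 0 are also just illustrative choices. Using size(A) gives the element count without hard-coding the extents, and the runtime routines are called as integer functions so the return status can be checked.

```fortran
program prefetch_2d
  use cudafor
  implicit none

  integer, parameter :: n = 4096                  ! same extent as the test code
  real(kind=8), managed, allocatable :: A(:,:)    ! assumption: managed memory
  integer(kind=cuda_stream_kind) :: stream
  integer :: istat

  allocate(A(n, n))
  A = 1.0d0

  ! Illustrative stream; the original post gets one from OpenACC instead.
  istat = cudaStreamCreate(stream)

  ! The whole 2D array is contiguous, so it can be passed directly.
  ! The count is in elements: size(A) equals n*n here.
  istat = cudaMemPrefetchAsync(A, size(A), 0, stream)
  if (istat /= cudaSuccess) print *, trim(cudaGetErrorString(istat))

  ! Wait for the prefetch to finish before freeing the array.
  istat = cudaStreamSynchronize(stream)
  istat = cudaStreamDestroy(stream)

  deallocate(A)
end program prefetch_2d
```

A non-contiguous section such as A(1:10,:) would not be a safe argument here, since the prefetch covers one contiguous range starting at the base address.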

-Mat

Dear Mat-san,

Thank you for your reply.

Sincerely yours,