Hi,
I am working on asynchronous execution using a subroutine similar to this:
subroutine sub1(stream)
   integer :: stream
   !$acc data create () copy() async(stream)
   !$acc parallel loop async(stream)
   ...
   !$acc end parallel loop
   !$acc end data
end subroutine sub1
The subroutine is called several times from code like this:
do i = 1, num_chunks
   streamid = mod(i, 2) + 1   ! alternate between two stream ids: 1 and 2
   call sub1(streamid)
end do
The idea is to create two (or more) streams so that execution is pipelined. My question is whether the acc end data directive implicitly performs a cudaEventSynchronize call, which would prevent concurrent execution.
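To make the intent concrete, here is a minimal sketch of the pipelining I have in mind. It is only an illustration under assumptions: the array a, its sizes, the loop body, and the name sub1_sketch are hypothetical. It uses the unstructured enter data / exit data directives (which accept an async clause) in place of the structured data region, so that no end-of-region directive is involved. Whether my original structured version behaves the same way is exactly what I am unsure about.

```fortran
program pipeline_sketch
   implicit none
   ! Hypothetical problem sizes, chosen only for illustration.
   integer, parameter :: n = 1000, num_chunks = 4
   real :: a(n, num_chunks)
   integer :: i, streamid

   a = 1.0

   do i = 1, num_chunks
      streamid = mod(i, 2) + 1      ! alternate between async queues 1 and 2
      call sub1_sketch(i, streamid)
   end do

   ! Block the host until the work on all async queues has finished.
   !$acc wait

   print *, 'sum = ', sum(a)

contains

   subroutine sub1_sketch(chunk, stream)
      integer, intent(in) :: chunk, stream
      integer :: j

      ! Queue the host-to-device copy of this chunk on the given queue.
      !$acc enter data copyin(a(1:n, chunk:chunk)) async(stream)

      ! Queue the kernel on the same queue, so it runs after the copy.
      !$acc parallel loop present(a(1:n, chunk:chunk)) async(stream)
      do j = 1, n
         a(j, chunk) = 2.0 * a(j, chunk)
      end do

      ! Queue the device-to-host copy; the subroutine returns without waiting.
      !$acc exit data copyout(a(1:n, chunk:chunk)) async(stream)
   end subroutine sub1_sketch

end program pipeline_sketch
```

In this sketch the only explicit synchronization is the acc wait after the loop over chunks, so work queued for one chunk is free to overlap with work queued on the other queue.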
I hope you can clarify this.
Regards,
Guillermo