I am trying to use concurrent streams but am having problems. Maybe someone has already used streams and can share a piece of code so I can see how they work. First of all, I start with:
istat = cudaStreamCreate(1)
This should create a new stream, but the compiler reports: ‘Argument number 1 to cudastreamcreate does not match INTENT (OUT)’. Am I missing something? Also, am I correct that I then have to issue a separate data copy for each stream? E.g.:
istat = cudaMemcpyAsync(adev,a,100,0)
istat = cudaMemcpyAsync(adev,a,100,1)
The function cudaStreamCreate expects an integer variable as its argument. Because that argument is INTENT(OUT), a literal constant such as 1 cannot be passed. The function then assigns a stream ID to the variable.
% cat teststream.cuf
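program teststream
use cudafor
integer :: strm1, istat, N
real, dimension(:),pinned,allocatable :: A
real, dimension(:),allocatable, device :: Adev
N = 100   ! element count taken from the question
allocate(A(N), Adev(N))
A = 1.0
istat = cudaStreamCreate(strm1)
print *, strm1
! the count is given in elements; the copy is queued on stream strm1
istat = cudaMemcpyAsync(Adev,A,N,cudaMemcpyHostToDevice,strm1)
istat = cudaStreamSynchronize(strm1)   ! wait for the copy to finish
end program teststream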
% pgf90 teststream.cuf -V10.9 ; a.out
Note that the host array must be allocatable and located in pinned memory.
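On the second part of the question (one copy per stream): a common pattern is to split the array into chunks and queue one asynchronous copy per stream. The following is only a minimal sketch under some assumptions: the program name twostreams, the chunking into two equal halves, and the use of integer(kind=cuda_stream_kind) for the stream handles (as newer compiler releases expect; the example above uses a default integer) are illustrative choices, not something taken from the original post.

```fortran
program twostreams
  use cudafor
  implicit none
  ! N, nstreams and chunk are illustrative values (assumptions)
  integer, parameter :: N = 100, nstreams = 2, chunk = N / nstreams
  ! Assumption: stream handles use cuda_stream_kind (newer compilers)
  integer(kind=cuda_stream_kind) :: strm(nstreams)
  integer :: istat, i, lo
  real, dimension(:), pinned, allocatable :: A
  real, dimension(:), allocatable, device :: Adev

  allocate(A(N), Adev(N))
  A = 1.0

  ! Create one stream per chunk
  do i = 1, nstreams
     istat = cudaStreamCreate(strm(i))
  end do

  ! Queue one host-to-device copy per stream, each covering its own chunk
  do i = 1, nstreams
     lo = (i - 1) * chunk + 1
     istat = cudaMemcpyAsync(Adev(lo), A(lo), chunk, &
                             cudaMemcpyHostToDevice, strm(i))
  end do

  ! Wait for each stream to finish, then release it
  do i = 1, nstreams
     istat = cudaStreamSynchronize(strm(i))
     istat = cudaStreamDestroy(strm(i))
  end do

  deallocate(A, Adev)
end program twostreams
```

Each stream only needs its own copy call for the portion of data it works on; the host array itself does not have to be duplicated per stream.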
Hope this helps,