There have been lots of new flavors of loads and stores, and we are a little bit behind. Here is some info for you, though.
For many releases, we have supported “pipelined” async loads, which are one of the variations supported by CUDA C. Unfortunately, it looks like we didn’t document this, but there is an example in section 5.3.2.1 of the 2nd edition of “CUDA Fortran for Scientists and Engineers”.
The interfaces for these are in the WMMA module. There are just a few, and if you need more, let us know. Here is the code for the interfaces we have in there:
```fortran
! PIPELINE Interfaces
interface pipelineMemcpyAsync
  attributes(device) subroutine pipelineMemcpyAsyncR8x2(dst, src) &
      bind(C, name="__nvf_wmma_memcpy_async_r16")
    real(8), device :: dst(2), src(2)
  end subroutine
  attributes(device) subroutine pipelineMemcpyAsyncR8(dst, src) &
      bind(C, name="__nvf_wmma_memcpy_async_r8")
    real(8), device :: dst, src
  end subroutine
  attributes(device) subroutine pipelineMemcpyAsyncR4x4(dst, src) &
      bind(C, name="__nvf_wmma_memcpy_async_r16")
    real(4), device :: dst(4), src(4)
  end subroutine
  attributes(device) subroutine pipelineMemcpyAsyncR4(dst, src) &
      bind(C, name="__nvf_wmma_memcpy_async_r4")
    real(4), device :: dst, src
  end subroutine
end interface

interface pipelineCommit
  attributes(device) subroutine pipelineCommit() &
      bind(C, name="__nvf_wmma_pipeline_commit")
  end subroutine pipelineCommit
end interface pipelineCommit

interface pipelineWaitPrior
  attributes(device) subroutine pipelineWaitPrior(prior) &
      bind(C, name="__nvf_wmma_pipeline_wait_prior")
    integer(8), value :: prior
  end subroutine pipelineWaitPrior
end interface pipelineWaitPrior
```
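As a rough, untested sketch of how these three calls might fit together in a kernel, here is the usual copy / commit / wait pattern, modeled on the CUDA C `__pipeline_memcpy_async` usage rather than taken from the book example. The module name, kernel name, and `TILE` size are made up for illustration, and it is an assumption that a shared array element is accepted as the `dst` argument.

```fortran
module pipeline_sketch
  use cudafor
  use wmma   ! provides pipelineMemcpyAsync, pipelineCommit, pipelineWaitPrior
  implicit none
  integer, parameter :: TILE = 256   ! hypothetical; launch with blockDim%x == TILE
contains
  attributes(global) subroutine scaleViaShared(b, a, n)
    integer, value :: n
    real(4), device :: a(n), b(n)
    real(4), shared :: tile_s(TILE)
    integer :: i, t

    t = threadIdx%x
    i = (blockIdx%x - 1) * blockDim%x + t

    ! Queue an asynchronous 4-byte copy from global to shared memory
    ! (assumption: a shared element is a valid dst here).
    if (i <= n) call pipelineMemcpyAsync(tile_s(t), a(i))

    ! Close the current batch of queued copies.
    call pipelineCommit()

    ! Wait until no committed batches are still outstanding.
    ! The interface takes integer(8), hence the _8 kind suffix.
    call pipelineWaitPrior(0_8)

    ! Only needed if threads read elements that other threads copied.
    call syncthreads()

    if (i <= n) b(i) = 2.0 * tile_s(t)
  end subroutine scaleViaShared
end module pipeline_sketch
```

In a real kernel you would normally do independent work between the commit and the wait, since overlapping the copy with that work is where the benefit comes from.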
If you need different data types, you can probably just declare your own interface that binds to the C name for a “size” that matches (4, 8, or 16 bytes), and that should work. The assembly for these functions gets inlined at compile time, so they are efficient.
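For example, a 4-byte integer should be able to reuse the same C entry point as `real(4)`. This is only a sketch of that idea: the module name and the `pipelineMemcpyAsyncI4` name are hypothetical, and only the `bind(C)` name comes from the interfaces above.

```fortran
module pipeline_int4
  implicit none
  interface
    ! Hypothetical user-declared interface: integer(4) is 4 bytes,
    ! so it binds to the same C name used for real(4).
    attributes(device) subroutine pipelineMemcpyAsyncI4(dst, src) &
        bind(C, name="__nvf_wmma_memcpy_async_r4")
      integer(4), device :: dst, src
    end subroutine
  end interface
end module pipeline_int4
```

You would then call `pipelineMemcpyAsyncI4` from device code in the same way as the generic `pipelineMemcpyAsync`, followed by `pipelineCommit` and `pipelineWaitPrior` from the WMMA module.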
Recently we added support for TMA operations too. That WAS documented. See section 3.6.7 of the current CUDA Fortran Programming Guide.
Mat and I will prioritize getting the updated WMMA documentation into our next release.