Automatic warp aggregation by PGI Fortran compiler

rahuliitk · September 30, 2020, 12:58pm

Hi,
I have learned from a couple of articles that the NVCC compiler can perform warp aggregation for atomic operations (e.g., CUDA Pro Tip: Optimized Filtering with Warp-Aggregated Atomics | NVIDIA Technical Blog). Does the PGI Fortran compiler have a similar capability?

MatColgrove · September 30, 2020, 5:18pm

You should be able to replicate this in CUDA Fortran using Cooperative Groups. See: CUDA Fortran Programming Guide Version 22.7 for ARM, OpenPower, x86
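
For reference, here is a rough sketch of the manual pattern in CUDA C++, along the lines of the blog post mentioned above. It is not CUDA Fortran and not compiler output: the names `atomicAggInc`, `filterPositive`, and `filterOnGpu` are chosen for illustration, and error checking is left out. A CUDA Fortran version would presumably follow the same three steps (find the active threads, let one leader do a single atomic add, broadcast the base offset), subject to what the cooperative groups module described in the Programming Guide exposes.

```cuda
#include <cooperative_groups.h>
#include <cuda_runtime.h>
#include <vector>

namespace cg = cooperative_groups;

// Warp-aggregated increment: one atomicAdd per group of active threads
// instead of one per thread. Returns a unique slot index for each caller.
__device__ int atomicAggInc(int *counter) {
    // The threads of this warp that reached this call together.
    cg::coalesced_group active = cg::coalesced_threads();
    int base = 0;
    if (active.thread_rank() == 0) {
        // The leader reserves one slot per active thread with a single atomic.
        base = atomicAdd(counter, static_cast<int>(active.size()));
    }
    // Broadcast the leader's base offset; each thread adds its own rank.
    return active.shfl(base, 0) + static_cast<int>(active.thread_rank());
}

// Copies the positive elements of src into dst; *count ends up as the
// number of elements kept. Output order is not guaranteed.
__global__ void filterPositive(int *dst, int *count, const int *src, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n && src[i] > 0) {
        dst[atomicAggInc(count)] = src[i];
    }
}

// Illustrative host wrapper (no error checking, for brevity).
std::vector<int> filterOnGpu(const std::vector<int> &src) {
    int n = static_cast<int>(src.size());
    if (n == 0) return {};

    int *dSrc = nullptr, *dDst = nullptr, *dCount = nullptr;
    cudaMalloc(&dSrc, n * sizeof(int));
    cudaMalloc(&dDst, n * sizeof(int));
    cudaMalloc(&dCount, sizeof(int));
    cudaMemcpy(dSrc, src.data(), n * sizeof(int), cudaMemcpyHostToDevice);
    cudaMemset(dCount, 0, sizeof(int));

    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    filterPositive<<<blocks, threads>>>(dDst, dCount, dSrc, n);

    // cudaMemcpy on the default stream waits for the kernel to finish.
    int count = 0;
    cudaMemcpy(&count, dCount, sizeof(int), cudaMemcpyDeviceToHost);
    std::vector<int> out(count);
    cudaMemcpy(out.data(), dDst, count * sizeof(int), cudaMemcpyDeviceToHost);

    cudaFree(dSrc);
    cudaFree(dDst);
    cudaFree(dCount);
    return out;
}
```

Calling `filterOnGpu` on a small host vector returns the positive entries in some order. The point of the pattern is that contention on `count` drops from one atomic per kept element to one atomic per group of active threads in a warp.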

rahuliitk · September 30, 2020, 5:29pm

I meant to ask whether the PGI Fortran compiler can do this for me automatically. There is a note at the very top of the link I posted which says: “The NVCC compiler now performs warp aggregation for atomics automatically in many cases, …”. My question is whether the PGI Fortran compiler is able to do the same.

bleback · September 30, 2020, 10:58pm

I am pretty sure CUDA Fortran does not do this.
