gangs, worker and vector

evanchong · July 29, 2015, 9:40pm

I am a little confused about this terminology.

Is it possible to compare these terms with work groups, warps and threads?

Thanks.

MatColgrove · July 29, 2015, 10:21pm

PGI’s current implementation when targeting NVIDIA GPUs maps a “gang” to a CUDA block, a “worker” to the y dimension of the thread index (thread%y), and a “vector” to the x dimension (thread%x).

However, for a different target, such as a multi-core x86 system, the mapping would be very different.

One of the benefits of OpenACC is that it allows you, the programmer, to abstract away the details of the underlying architecture. You can focus on expressing the parallelism rather than on how to map it to a particular device, which gives greater performance portability.

Think of “gang” as coarse-grained parallelism, where the gangs work independently of each other and cannot synchronize with one another. “Vector” is the finest granularity, with an individual instruction operating on multiple pieces of data (SIMD/SIMT). “Worker” sits between the two and allows vectors to be grouped.
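As a rough illustration of the three levels, here is a minimal sketch of a triply nested loop with one level assigned to each loop. The function name scale3d and its parameters are made up for this example, and the comments about CUDA blocks and thread indices only restate the PGI mapping described above; another compiler or target may map the loops differently.

```c
/* Hypothetical example: multiply every element of a flattened
   nx * ny * nz array by s. All names here are illustrative. */
void scale3d(float *a, int nx, int ny, int nz, float s)
{
    /* gang: coarse-grained, independent iterations
       (a CUDA block under the mapping described above). */
    #pragma acc parallel loop gang copy(a[0:nx*ny*nz])
    for (int i = 0; i < nx; ++i) {
        /* worker: groups of vectors within a gang
           (the y thread dimension under that mapping). */
        #pragma acc loop worker
        for (int j = 0; j < ny; ++j) {
            /* vector: finest granularity, SIMD/SIMT style
               (the x thread dimension under that mapping). */
            #pragma acc loop vector
            for (int k = 0; k < nz; ++k) {
                a[(i * ny + j) * nz + k] *= s;
            }
        }
    }
}
```

A compiler without OpenACC support ignores the pragmas and runs the loops serially, so the same source still produces the same result on a plain CPU build. In many cases you can leave out the explicit gang, worker and vector clauses and let the compiler choose the schedule.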

You might find this section of the OpenACC Best Practices Guide helpful.

Mat

*Note that the link to the best practices guide from 2015 was no longer valid, so I updated to the new document, May 2024.
