Some CUPTI activity records have a uint32_t channelID field, which the documentation describes as, for example:
The ID of the HW channel on which the memory copy is occurring.
The ID of the HW channel on which the kernel is launched.
I couldn’t find a clear description of the “channel” concept in either the CUDA programming guide or the Nsight Systems documentation. Can the dev team offer some clarification?
PS: I have a test program that performs H2D copies using multiple streams on multiple threads. On a 3080 Ti, CUPTI reports channel IDs 8, 9, 10, and 11 being used. Are they related to the GPU copy engines in any way? Is a given CUDA stream always bound to the same HW channel?
Thank you!
Hi King_Crimson,
A channel in CUDA refers to a work queue used for launching tasks on the GPU. CUDA streams are mapped to one or more of these work queues by the driver. Work queues are hardware resources that manage an ordered sequence of commands within a stream, such as kernel executions or memory transfers, which are executed by specific GPU engines.
While it is possible to create a large number of CUDA streams, the number of available work queues is limited (by default, there are 8). When the number of streams exceeds the number of work queues, multiple streams may be mapped to the same work queue, potentially leading to false dependencies and suboptimal performance.
The number of work queues, also known as connections, can be adjusted between 1 and 32 using the environment variable CUDA_DEVICE_MAX_CONNECTIONS.
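As a rough illustration of adjusting that setting from inside a program, here is a minimal sketch. The helper name set_max_connections is hypothetical (it is not a CUDA API), and it assumes a POSIX system where setenv is available. It also assumes the variable is read when CUDA is initialized, so the call should happen before the first CUDA runtime or driver call in the process.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>
#include <string>

// Range for CUDA_DEVICE_MAX_CONNECTIONS as stated in the reply above.
constexpr int kMinConnections = 1;
constexpr int kMaxConnections = 32;

// Hypothetical helper, not part of CUDA: clamps `requested` to [1, 32] and
// exports it as CUDA_DEVICE_MAX_CONNECTIONS for the current process.
// Assumption: this runs before CUDA is initialized; setting the variable
// after the context exists is not expected to have any effect.
// Returns the value actually written, or -1 if setenv failed.
int set_max_connections(int requested) {
    const int value =
        std::max(kMinConnections, std::min(requested, kMaxConnections));
    assert(value >= kMinConnections && value <= kMaxConnections);

    const std::string text = std::to_string(value);
    // setenv is POSIX; the third argument (1) overwrites an existing value.
    if (::setenv("CUDA_DEVICE_MAX_CONNECTIONS", text.c_str(), 1) != 0) {
        return -1;
    }
    return value;
}
```

You would call set_max_connections(32) at the very top of main, before creating any streams. Exporting the variable in the shell before launching the application achieves the same thing without code changes.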