Streams in nsight report

mahmood.nt · September 14, 2020, 2:49pm

Hi,
I want to know more details about the concept of “streams” in the output of a trace run. As you can see in the picture, the report contains two streams. In each stream, I see some kernels and memory percentages.

https://pasteboard.co/Jr3wsFi.jpg

So, specifically, I want to know how Nsight or the CUDA driver decides to put kernel X in stream 1 or stream 2.
Is there any difference in the characteristics of kernels in these two streams?

jasoncohen · October 7, 2020, 5:26pm

Hi Mahmood,

Streams are part of the CUDA API, intended to help get more concurrency. In your code, you can create streams, and launch kernels, memcpys, and memsets into those streams. So neither Nsight nor the driver decides which stream a kernel goes into: the application (or a library it calls) chooses the stream at launch time, and work launched without an explicit stream goes into the default stream. A stream is like a FIFO queue of work for the GPU. A stream guarantees that each operation launched into it completes before the next operation starts. Work in separate streams may execute concurrently if the hardware has resources available to do so. For example, a GPU with two Copy Engines is capable of executing multiple kernels, a host-to-device memcpy, and a device-to-host memcpy all at the same time, as long as those operations are all in different streams.
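To make that concrete, here is a minimal sketch of an application putting work into two streams it created itself. The kernel `scale` and the helper `run_two_streams` are hypothetical names made up for this illustration; only the CUDA runtime calls are real API. Each stream gets its own copy-in, kernel, and copy-out, which run in order within that stream, while the two streams are free to overlap if the GPU has the resources.

```cuda
#include <cstddef>
#include <cuda_runtime.h>

// Hypothetical kernel for illustration: multiplies each element by a factor.
__global__ void scale(float *data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

// Hypothetical helper: issues copy-in, kernel, copy-out into two streams
// and returns true if both results come back as expected.
bool run_two_streams(int n) {
    const size_t bytes = static_cast<size_t>(n) * sizeof(float);
    float *h[2] = {nullptr, nullptr};
    float *d[2] = {nullptr, nullptr};
    cudaStream_t s[2];
    bool ok = true;

    for (int k = 0; k < 2; ++k) {
        // Pinned host memory lets the async copies overlap with other work.
        ok = ok && cudaMallocHost(&h[k], bytes) == cudaSuccess;
        ok = ok && cudaMalloc(&d[k], bytes) == cudaSuccess;
        ok = ok && cudaStreamCreate(&s[k]) == cudaSuccess;
    }
    if (!ok) return false;  // sketch only: no cleanup on early failure

    for (int k = 0; k < 2; ++k)
        for (int i = 0; i < n; ++i) h[k][i] = 1.0f;

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;

    for (int k = 0; k < 2; ++k) {
        // The last argument / launch parameter is where the application
        // picks the stream. Operations within s[k] run in issue order.
        cudaMemcpyAsync(d[k], h[k], bytes, cudaMemcpyHostToDevice, s[k]);
        scale<<<blocks, threads, 0, s[k]>>>(d[k], n, 2.0f + k);
        cudaMemcpyAsync(h[k], d[k], bytes, cudaMemcpyDeviceToHost, s[k]);
    }

    // Wait for each stream to drain before reading the results on the host.
    for (int k = 0; k < 2; ++k)
        ok = ok && cudaStreamSynchronize(s[k]) == cudaSuccess;

    // Stream 0 scaled by 2, stream 1 scaled by 3.
    for (int k = 0; k < 2 && ok; ++k)
        for (int i = 0; i < n; ++i)
            if (h[k][i] != 2.0f + k) { ok = false; break; }

    for (int k = 0; k < 2; ++k) {
        cudaStreamDestroy(s[k]);
        cudaFree(d[k]);
        cudaFreeHost(h[k]);
    }
    return ok;
}
```

If you call something like `run_two_streams(1 << 20)` from `main` and trace the program, I would expect the kernels and memcpys to show up in two separate stream rows, because the code explicitly put them there. The stream numbers shown in the report are labels chosen by the tool, not something the code sets.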

See here for more info, particularly the section about streams:

The trace tools then display which stream each workload executed on. Since streams are serialized sequences of work, it makes sense on a timeline to display them as individual rows.
