context/streams in profiler

After reading the Visual Profiler documentation [1], I still have some unanswered questions. I ran a program with 16 threads (CPU utilization is about 1600% for one process), and nvidia-smi shows 1 GPU process.

In the Visual Profiler, I see:

Context 1 (compute and streams)
Context 2 (compute and streams)

The first context is nearly empty, while the second context has some data. When I expand the streams, I see Stream 30 through Stream 60.

How should I interpret these numbers? Why are there two contexts? Why is there no Stream 29?

[1] http://docs.nvidia.com/cuda/profiler-users-guide/index.html

It’s entirely possible for a CUDA program to create 2 contexts on a GPU.

There is no pattern to the numbering of streams. You should not assume they will start at zero or be numbered sequentially.
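Since the stream numbers shown in the profiler are arbitrary, one option is to give each stream your own label so the timeline is easier to read. The following is only a minimal sketch, assuming the NVTX library is available (header nvToolsExtCudaRt.h, linked with -lnvToolsExt) and that your profiler version displays NVTX resource names. The kernel `touch`, the helper `run_labeled_streams`, and the "worker-N" labels are hypothetical names used for illustration.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>
#include <nvToolsExtCudaRt.h>  // NVTX naming for CUDA runtime objects (assumed installed)

// Trivial kernel so that each stream has some work to show in the timeline.
__global__ void touch(int *out, int value) {
    out[threadIdx.x] = value;
}

// Creates up to `count` streams, labels each one, launches one kernel per
// stream, and returns the number of streams that synchronized without error.
int run_labeled_streams(int count) {
    std::vector<cudaStream_t> streams;
    std::vector<int *> buffers;
    int completed = 0;

    for (int i = 0; i < count; ++i) {
        cudaStream_t s;
        int *buf = nullptr;
        if (cudaStreamCreate(&s) != cudaSuccess) break;
        if (cudaMalloc(&buf, 32 * sizeof(int)) != cudaSuccess) {
            cudaStreamDestroy(s);
            break;
        }

        // The handle `s` is opaque; the label below is our own choice and is
        // unrelated to whatever number the profiler assigns to the stream.
        char name[32];
        std::snprintf(name, sizeof(name), "worker-%d", i);
        nvtxNameCudaStreamA(s, name);

        touch<<<1, 32, 0, s>>>(buf, i);
        streams.push_back(s);
        buffers.push_back(buf);
    }

    // Wait for each stream, then release its resources.
    for (size_t i = 0; i < streams.size(); ++i) {
        if (cudaStreamSynchronize(streams[i]) == cudaSuccess) ++completed;
        cudaFree(buffers[i]);
        cudaStreamDestroy(streams[i]);
    }
    return completed;
}

int main() {
    int done = run_labeled_streams(4);
    std::printf("%d of 4 streams completed\n", done);
    return done == 4 ? 0 : 1;
}
```

The program keeps its own index for each stream (the position in `streams`) and never relies on the profiler's numbering, which is the safer habit given that the IDs may start anywhere and skip values.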

As you can see in the picture, Stream 31 is actually the top entry under Compute. Does that mean all streams are also listed under Compute?

Moreover, what are the percentages shown under Compute? Are they the percentage of invocations of each compute entry? That does not seem to match the numbers I see.

External Image