Behavior of cudaGraphInstantiateFlagUseNodePriority

You seem to have ignored the advice given to you by striker159 here. When I change one of your requested priorities to -1, I get that reflected in the output.
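
For reference, here is a minimal sketch of how a priority can be set on a kernel node and read back. It is not your code: the kernel `emptyKernel`, the `CHECK` macro, and the choice to use the greatest priority reported by `cudaDeviceGetStreamPriorityRange` are all my own assumptions for illustration. It also assumes a CUDA toolkit recent enough to provide `cudaKernelNodeAttributePriority` and `cudaGraphInstantiateFlagUseNodePriority`.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: print the error and bail out of main on failure.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            printf("%s failed: %s\n", #call, cudaGetErrorString(err_)); \
            return 1;                                                 \
        }                                                             \
    } while (0)

// Hypothetical placeholder kernel; it does no work.
__global__ void emptyKernel() {}

int main() {
    // Numerically lower values mean higher priority.
    int least = 0, greatest = 0;
    CHECK(cudaDeviceGetStreamPriorityRange(&least, &greatest));
    printf("priority range: least=%d greatest=%d\n", least, greatest);

    cudaGraph_t graph;
    CHECK(cudaGraphCreate(&graph, 0));

    // One kernel node with no dependencies and no kernel arguments.
    cudaKernelNodeParams params = {};
    params.func = (void*)emptyKernel;
    params.gridDim = dim3(1);
    params.blockDim = dim3(1);
    params.sharedMemBytes = 0;
    params.kernelParams = nullptr;
    params.extra = nullptr;

    cudaGraphNode_t node;
    CHECK(cudaGraphAddKernelNode(&node, graph, nullptr, 0, &params));

    // Request a priority taken from the range the device reported.
    cudaKernelNodeAttrValue attr = {};
    attr.priority = greatest;
    CHECK(cudaGraphKernelNodeSetAttribute(node, cudaKernelNodeAttributePriority, &attr));

    // Read the attribute back to see what the node actually holds.
    cudaKernelNodeAttrValue readBack = {};
    CHECK(cudaGraphKernelNodeGetAttribute(node, cudaKernelNodeAttributePriority, &readBack));
    printf("node priority: %d\n", readBack.priority);

    // Ask the instantiated graph to honor per-node priorities.
    cudaGraphExec_t exec;
    CHECK(cudaGraphInstantiateWithFlags(&exec, graph, cudaGraphInstantiateFlagUseNodePriority));
    CHECK(cudaGraphLaunch(exec, 0));
    CHECK(cudaDeviceSynchronize());

    CHECK(cudaGraphExecDestroy(exec));
    CHECK(cudaGraphDestroy(graph));
    return 0;
}
```

Printing the reported range first makes it easy to see whether a requested value falls inside it.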

Regarding the rest, I’m not really sure what your expectations are. Your kernels are clearly capable of running concurrently. You haven’t established any dependencies between them, so my expectation is that they would run concurrently. In any event, CUDA doesn’t guarantee any execution order for such kernels, with or without priority. They can all execute at the same time, so I’m not sure what you are expecting from the priority statements.

I haven’t studied the topic closely, but my expectation here would basically be an analog of CUDA stream priorities. If I launch 3 kernels with 1 threadblock each, stream priorities won’t prevent or order their execution in any way.
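
To make the stream analogy concrete, here is a rough sketch of that 3-kernel case. The `spin` kernel, the cycle count, and the particular priority assignment are hypothetical choices of mine. The point is only that each kernel needs a single threadblock, so all three can be resident at once and the priorities have nothing to arbitrate.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical busy-wait kernel so each launch takes a measurable time.
__global__ void spin(long long cycles) {
    long long start = clock64();
    while (clock64() - start < cycles) { }
}

int main() {
    // Numerically lower values mean higher priority.
    int least = 0, greatest = 0;
    cudaDeviceGetStreamPriorityRange(&least, &greatest);

    // Arbitrary assignment for illustration: one high-priority stream, two low.
    const int kStreams = 3;
    int requested[kStreams] = { least, greatest, least };
    cudaStream_t streams[kStreams];
    for (int i = 0; i < kStreams; ++i) {
        cudaStreamCreateWithPriority(&streams[i], cudaStreamNonBlocking, requested[i]);
    }

    // One threadblock per kernel: nothing forces these to wait on each other.
    for (int i = 0; i < kStreams; ++i) {
        spin<<<1, 1, 0, streams[i]>>>(1000000LL);
    }
    cudaDeviceSynchronize();

    // Report the priority each stream actually ended up with.
    for (int i = 0; i < kStreams; ++i) {
        int actual = 0;
        cudaStreamGetPriority(streams[i], &actual);
        printf("stream %d: requested %d, actual %d\n", i, requested[i], actual);
        cudaStreamDestroy(streams[i]);
    }
    return 0;
}
```

Viewing this in a profiler timeline should show whether the three kernels overlap. I would expect them to, regardless of the priorities.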

If your expectation is that node priority causes the higher-priority node to execute to completion before the lower-priority node begins, you are mistaken. That is not what node priority does, and your test case proves it. You may want to study CUDA stream priorities. There are numerous questions about them on these forums, as well as a section on the topic in the programming guide.