Is there any mechanism to suspend and resume a CUDA stream?

Hello forum,

As we all know, a CUDA stream consists of a series of CUDA operations.
I want to submit operations to a CUDA stream, but I don’t want them to be submitted to the GPU immediately.

I wonder whether there is any mechanism by which:
I can suspend a CUDA stream to which I can still submit operations. The operations in the suspended stream are not submitted to the GPU until I manually resume the stream.

There is no such function. What would be your usecase?

I’d like to pause a stream to stop it from executing its operations. While the stream is paused, several operations will be submitted to it. I hope those newly added operations are not discarded, but are executed after the stream is resumed.

Okay, but why do you want to interrupt a stream in the first place? Do you need to wait for other things to finish?
I mean, if you do not want some operations to be executed yet, just don’t submit them to the stream. Why make it more complex?

Sorry for the late reply.

There are two branches that contain different series of CUDA operations. The choice of branch depends on the currently running CUDA kernels: once a kernel returns its result, I can launch one branch of operations concurrently with the kernels that are still running.

The first option I see would be to just synchronize the stream, then launch branch A or branch B:

bool h_result;
bool* d_result;
cudaMalloc(&d_result, sizeof(bool));
kernel<<<grid, block, 0, stream>>>(d_result);
cudaMemcpyAsync(&h_result, d_result, sizeof(bool), cudaMemcpyDeviceToHost, stream);
cudaStreamSynchronize(stream); // blocks the host until the result is available

if(h_result){
   kernelA<<<grid, block, 0, stream>>>();
}else{
   kernelB<<<grid, block, 0, stream>>>();
}

Alternatively, you could move the condition into kernelA and kernelB, and launch both of them, using events to introduce dependencies.


__global__ void kernelA(const bool* d_result){
  if(*d_result){
    // do kernelA stuff
  }
  // else: do nothing
}

__global__ void kernelB(const bool* d_result){
  if(!*d_result){
    // do kernelB stuff
  }
  // else: do nothing
}

cudaEvent_t event;
cudaEventCreate(&event);
cudaStream_t streamA, streamB;
cudaStreamCreate(&streamA);
cudaStreamCreate(&streamB);
bool* d_result;
cudaMalloc(&d_result, sizeof(bool));

kernel<<<grid, block, 0, stream>>>(d_result);
cudaEventRecord(event, stream);

cudaStreamWaitEvent(streamA, event, 0);
kernelA<<<grid, block, 0, streamA>>>(d_result);

cudaStreamWaitEvent(streamB, event, 0);
kernelB<<<grid, block, 0, streamB>>>(d_result);

// join the streams again if necessary. Reusing the same event is fine:
// cudaStreamWaitEvent waits on the work captured by the most recent
// cudaEventRecord at the time of the call.
cudaEventRecord(event, streamA);
cudaStreamWaitEvent(stream, event, 0);
cudaEventRecord(event, streamB);
cudaStreamWaitEvent(stream, event, 0);

The second method is what I need! Thank you very much for your solutions!!! :)
