Hi, I am using Vulkan timeline semaphores in CUDA for CPU-GPU synchronization.
My code is:
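    // Queue one semaphore wait per layer on the I/O stream and mark each with an event.
    for (size_t i = 0; i < num_layers; i++)
    {
        waitVulkanSemaphore(i, ioStream1);
        cudaEventRecord(events_get[i], ioStream1);
    }

    // The compute stream waits on each layer's event before launching its kernel.
    for (size_t i = 0; i < num_layers; i++)
    {
        cudaStreamWaitEvent(computeStream, events_get[i], 0);
        mockKernel<<<numBlocks, blockSize, 0, computeStream>>>(d_data, dataSize, 42);
    }

    // The host signals the timeline semaphore after each piece of CPU work.
    for (size_t i = 0; i <= num_layers; i++)
    {
        // sleep mocks CPU work - can be in different thread
        sleep_ms(600);
        signalVulkanSemaphore(i);
    }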
The problem is that once I call waitVulkanSemaphore (which calls cudaWaitExternalSemaphoresAsync), the mockKernel launch blocks the CPU. As a result, I never reach signalVulkanSemaphore, and the program deadlocks. Why does the call to cudaWaitExternalSemaphoresAsync make the mockKernel launch synchronous instead of asynchronous? The full code is attached.
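To make the ordering I expect explicit, here is a host-only C++ sketch. It is not the attached code and uses no CUDA or Vulkan calls. The timeline semaphore is mocked with a counter and a condition variable, and the stream is mocked with a worker thread. The names MockTimelineSemaphore and runMockPipeline are hypothetical. The wait values i + 1 are an assumption, chosen so that the initial counter value of 0 satisfies no wait.

```cpp
#include <condition_variable>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <thread>
#include <vector>

// Host-only stand-in for a timeline semaphore: a monotonically increasing counter.
// Hypothetical type for illustration; it is not part of CUDA or Vulkan.
class MockTimelineSemaphore {
public:
    // Block the calling thread until the counter is at least `value`.
    void wait(std::uint64_t value) {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [&] { return counter_ >= value; });
    }

    // Raise the counter to `value` (never lower it) and wake all waiters.
    void signal(std::uint64_t value) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            if (value > counter_) counter_ = value;
        }
        cv_.notify_all();
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    std::uint64_t counter_ = 0;
};

// Hypothetical helper: returns the layer indices in the order the mocked
// stream processed them.
std::vector<std::size_t> runMockPipeline(std::size_t numLayers) {
    MockTimelineSemaphore sem;
    std::vector<std::size_t> completed;

    // Mocked stream: waits for each layer's semaphore value in order,
    // then records the layer as done (standing in for the kernel).
    std::thread stream([&] {
        for (std::size_t i = 0; i < numLayers; ++i) {
            sem.wait(i + 1);  // assumed value scheme: layer i waits for i + 1
            completed.push_back(i);
        }
    });

    // The host is not blocked by the queued waits, so it reaches the signals.
    for (std::size_t i = 0; i < numLayers; ++i) {
        sem.signal(i + 1);
    }

    stream.join();  // join makes the worker's writes to `completed` visible here
    return completed;
}
```

Calling runMockPipeline(num_layers) returns because the host thread reaches the signal loop while the waits are still pending. That is the behavior I expected from the CUDA version.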
Thanks very much.
vulkanCode.txt (11.9 KB)