I’ve been trying to figure out how to control the device while it’s executing. I realize that this isn’t really what GPUs are designed for, but nevertheless…
The basic idea is to allocate space for a variable in the device’s global memory and have a pointer to that variable available to both host and device. When the kernel is executed, it spins on this variable; that is, the device loops until the host sets the variable to a predefined “exit value” (I know, it’s terrible).
In my code I used cudaMallocHost to allocate memory for the variable, asynchronously launched the kernel in one stream, and then asynchronously changed the value of the variable from another stream.
This results in an infinite loop because, from what I understand, the device reads the variable correctly only on the first iteration of the loop. On subsequent iterations the value never changes, even though the host asynchronously updates it from the other stream.
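Here is a minimal sketch of the approach, stripped down from my actual code (identifiers are illustrative, and I’ve replaced the second-stream copy with a direct host write to the pinned allocation). Note that I’ve marked the flag pointer `volatile` in the kernel so the compiler re-reads it on every iteration; I suspect my original code was missing that, which would explain why only the first read was correct:

```cuda
#include <cuda_runtime.h>
#include <cstdio>

#define EXIT_VALUE 1

// Kernel spins until the host writes EXIT_VALUE into *flag.
// Without the `volatile` qualifier the compiler is free to hoist
// the load out of the loop and spin on a stale register value.
__global__ void waitKernel(volatile int *flag)
{
    while (*flag != EXIT_VALUE) {
        // busy-wait; real code would do work here
    }
}

int main()
{
    int *flag;  // pinned, host-visible allocation
    cudaHostAlloc(&flag, sizeof(int), cudaHostAllocMapped);
    *flag = 0;

    int *devFlag;  // device-side alias of the same memory
    cudaHostGetDevicePointer(&devFlag, flag, 0);

    cudaStream_t s1;
    cudaStreamCreate(&s1);

    // Launch asynchronously; control returns to the host immediately.
    waitKernel<<<1, 1, 0, s1>>>(devFlag);

    *flag = EXIT_VALUE;       // host signals the running kernel
    cudaDeviceSynchronize();  // returns once the kernel exits

    printf("kernel exited\n");
    cudaStreamDestroy(s1);
    cudaFreeHost(flag);
    return 0;
}
```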
Any ideas what I’m doing wrong? Is there a specific way to allocate memory for this “shared” variable so that changes made by the host are visible to the device while it executes?
If my question is still unclear just let me know, and I’ll post my implementation.