Question about how GPGPU communication works conceptually

Hello,

I have a little 16-core Nvidia GPU that I can use (it's in my Mac mini). What I'm lost on is how communication works between my video card and the rest of the machine. Yes, it's plugged into my motherboard and that's how my OS can access it. However, if I upload something to my video card, or send it any additional information to crunch on, does the processing on the video card necessarily have to stop?

Let's say I have an algorithm that tracks and attempts to predict weather patterns. I can give the video card an existing set of data to work through, but after some time I would like to send additional training data to the card asynchronously, so that it can digest that information once it has all arrived. Getting asynchronous replies back would also be awesome.

If I need to add more details or explain this in a different way, let me know.

I am not sure if there is a limit that prevents a kernel from running non-stop.

Usually the code will look something like this (on the CPU side):

while (true) {
  someKernel<<<grid, block>>>(d_data, params);      // launch a CUDA kernel

  anotherKernel<<<grid, block>>>(d_data, params);   // optionally launch another kernel
}

So inside this loop there is plenty of opportunity to supply new data to the GPU, or to read results back, by copying memory back and forth between host and device.
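For example, here is a minimal host-side sketch of that pattern. The kernel name (predictWeather), the buffer sizes, and the single stream are just assumptions for illustration; the point is that cudaMemcpyAsync on a stream queues the copy-in, the kernel, and the copy-out without blocking the CPU, so the host can prepare the next batch of data in the meantime.

// Minimal sketch (assumed names and sizes): stream a batch of data to the
// GPU, run a kernel on it, and read the results back, batch after batch.
#include <cuda_runtime.h>
#include <stdio.h>

__global__ void predictWeather(const float *in, float *out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = in[i] * 0.5f;   // placeholder for the real model
}

int main(void) {
    const int n = 1 << 20;
    const size_t bytes = n * sizeof(float);

    // Pinned host buffers are required for cudaMemcpyAsync to be truly async.
    float *h_in, *h_out;
    cudaMallocHost((void**)&h_in,  bytes);
    cudaMallocHost((void**)&h_out, bytes);

    float *d_in, *d_out;
    cudaMalloc((void**)&d_in,  bytes);
    cudaMalloc((void**)&d_out, bytes);

    cudaStream_t stream;
    cudaStreamCreate(&stream);

    for (int batch = 0; batch < 10; ++batch) {
        // ... fill h_in with the next chunk of training data here ...

        // Queue copy-in, kernel launch, and copy-out; these calls return
        // immediately, so the CPU is free to do other work.
        cudaMemcpyAsync(d_in, h_in, bytes, cudaMemcpyHostToDevice, stream);
        predictWeather<<<(n + 255) / 256, 256, 0, stream>>>(d_in, d_out, n);
        cudaMemcpyAsync(h_out, d_out, bytes, cudaMemcpyDeviceToHost, stream);

        // Block only when the results of this batch are actually needed.
        cudaStreamSynchronize(stream);
        printf("batch %d done, first result = %f\n", batch, h_out[0]);
    }

    cudaStreamDestroy(stream);
    cudaFree(d_in);      cudaFree(d_out);
    cudaFreeHost(h_in);  cudaFreeHost(h_out);
    return 0;
}

With two streams and double-buffered host memory you could go a step further and overlap the copy of the next batch with the kernel that is still crunching the current one, which is essentially the asynchronous behaviour you described.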