I have a real-time system, for which I am considering a GPU solution. It consists of a camera frame grabber, a Tesla C1060, and an IO output board. Ideally, there would be no CPU intervention except to start or stop operations.
Operation is envisioned to be as follows (detailed interlocking ignored here except for the GPU):
The frame grabber DMAs a new frame of data into the GPU’s input buffer in global memory. In addition to the camera data, the frame contains a control word, which the GPU will decode.
The GPU will loop in its kernel, periodically looking at the control word in global memory.
When it sees a nonzero control word, it will do whatever is appropriate: exit if it is “abort”; process the data if it is “buffer ready”; etc.
When it has decoded the control word, it will zero it and start processing or aborting.
When the GPU has finished processing, it will place the processed data into its output buffer, along with a control word: such as “output ready” or “error”, etc.
The GPU will now start looping, checking its input control word for its next action.
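The loop described above could be sketched as a persistent CUDA kernel. This is only a rough sketch of the idea, not tested on a C1060: the command codes, buffer layout, and names are my own placeholders, and the control words must be read through `volatile` pointers so the compiler actually re-reads global memory each iteration. Also note it assumes a single block (or one block per independent chunk), since there is no reliable inter-block synchronization on that generation of hardware:

```
#include <cuda_runtime.h>

// Placeholder command/status codes -- the real encoding would come
// from the frame grabber and IO card, not from me.
#define CMD_NONE            0u
#define CMD_ABORT           1u
#define CMD_BUFFER_READY    2u
#define STATUS_OUTPUT_READY 1u

// Persistent kernel: spins until the frame grabber's DMA writes a control
// word, acts on it, and raises an output flag for the IO card to poll.
__global__ void poll_and_process(volatile unsigned int *ctrl_in,
                                 const unsigned short  *frame_in,
                                 unsigned short        *frame_out,
                                 volatile unsigned int *ctrl_out,
                                 int                    n)
{
    __shared__ unsigned int cmd;
    for (;;) {
        // One thread watches the input control word; the rest wait.
        if (threadIdx.x == 0) {
            unsigned int c;
            do { c = *ctrl_in; } while (c == CMD_NONE);  // spin on global memory
            *ctrl_in = CMD_NONE;                         // zero it once decoded
            cmd = c;
        }
        __syncthreads();

        if (cmd == CMD_ABORT)
            return;                                      // exit the kernel

        if (cmd == CMD_BUFFER_READY) {
            // Placeholder "processing": copy input frame to output buffer.
            for (int i = threadIdx.x; i < n; i += blockDim.x)
                frame_out[i] = frame_in[i];
            __syncthreads();

            __threadfence();   // make results visible before raising the flag
            if (threadIdx.x == 0)
                *ctrl_out = STATUS_OUTPUT_READY;
        }
        __syncthreads();
    }
}
```

One design caveat with this approach: a kernel spinning like this occupies the GPU indefinitely, so the watchdog timer matters if the Tesla ever drives a display (it normally doesn't), and kernel launches for other work are blocked while it runs.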
The output card will be polling the output control word in the GPU’s global memory, and when it sees there is data, it will start sending it.
Ideally, the GPU could DMA straight into the output card’s buffer. Can it do that?
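I don’t know whether the C1060 can target another device’s PCIe address space directly, but a related mechanism that does exist is mapped (zero-copy) pinned host memory: the kernel writes through a device pointer and the data crosses PCIe into host RAM without a separate `cudaMemcpy`, where the IO card’s driver could pick it up. A hedged host-side sketch (`FRAME_BYTES` and the variable names are my own placeholders):

```
// Host side: allocate a mapped, pinned buffer the kernel can write directly.
cudaSetDeviceFlags(cudaDeviceMapHost);   // must be set before context creation
unsigned short *host_buf, *dev_ptr;
cudaHostAlloc((void **)&host_buf, FRAME_BYTES, cudaHostAllocMapped);
cudaHostGetDevicePointer((void **)&dev_ptr, host_buf, 0);
// Pass dev_ptr to the kernel as its output buffer; writes land in host_buf
// over PCIe, without the CPU copying anything.
```

That still routes through host memory rather than straight into the output card’s buffer, so it may not fully eliminate the hop you’re asking about.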
Does anyone see any problem with the frame grabber DMA’ing into the Tesla’s global memory?
I’m really trying to keep the CPU out of the mix if possible, except for setup and teardown of the process. It should be able to run for hours with no intervention.
Thanks in advance,