I’m guessing the PCI card would have to ensure that its data is formatted correctly for CUDA. Or maybe raw sensor data can be processed directly.
There’s no discussion of the latency of user-to-kernel-space transitions, etc. I’ll bet moving memory is something Linux is very good at, so maybe the benefit is less than an order of magnitude.
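For a rough sense of how cheap a plain memory copy is on a given machine, something like the sketch below would do. The function name, buffer size, and repeat count are my own choices, not anything from the article. It only times a user-space buffer copy, so it says nothing about syscall overhead or DMA setup, which is exactly the part that goes undiscussed.

```python
import time


def copy_throughput_gbps(size_bytes=64 * 1024 * 1024, repeats=5):
    """Return the best observed throughput (GB/s) for one in-memory buffer copy.

    Hypothetical helper for illustration: measures a user-space copy only,
    not user/kernel transitions or PCIe transfers.
    """
    src = bytearray(size_bytes)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        dst = bytes(src)  # one full copy of the buffer
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    assert len(dst) == size_bytes
    # Guard against a zero reading on very coarse timers.
    return size_bytes / max(best, 1e-9) / 1e9


if __name__ == "__main__":
    print(f"{copy_throughput_gbps():.1f} GB/s")
```

Comparing that number against the PCIe link rate of the card would at least bound how much an extra copy through host memory could cost.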