CUDA callable from Linux driver/kernel context?

Is CUDA available on any real-time Unix/Linux? If not, which routines are currently callable from a device driver (kernel context) under a supported Linux? We would need only the basic routines: device memory malloc/free/copy, device get/set for multi-GPU management and, of course, device kernel invocation. To meet our real-time requirement on non-real-time Linux, we recast our application as a Linux device driver that runs only in response to a hardware signal of incoming data.
We’d greatly appreciate your reply.

CUDA has only been tested & qualified with the Linux distributions listed on the CUDA download page. None are realtime distributions.

Many people have succeeded in running CUDA with distributions other than the ones listed. So I guess this is directly related to whether you are able to run the NVIDIA driver correctly. If you can, check that you can correctly link the CUDA libs and you have full device access and API functionality.


Thanks Peter for the prompt reply.
We have no problem running our own CUDA applications on the supported Linux distributions. In fact, we have developed our own CUBLAS routines that run 2 to 3 times as fast as the ones in the NVIDIA distribution. We would like to hear whether you or your colleagues have written a Linux driver that makes CUDA calls. We have several $$$$$$$ projects currently under development that require real-time computing.

Thanks tygemo for your reply.

Do you know anyone who has written a Linux device driver that processes real-time input (perhaps generated by another PCIe card) by calling CUDA to do the heavy lifting (the compute-intensive calculations)?

I can imagine that it is no big deal to use CUDA in a userland daemon, but I am unsure whether it works in kernel land. Maybe you need to write a simple forwarding driver that only passes the incoming data on, so that the actual work is done in userland.
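
To make the forwarding idea a bit more concrete, here is a rough sketch of the userland side only, written in plain C. It assumes the driver exposes incoming data through a file descriptor that can be polled and read; that interface, the names forward_loop and process_fn, and the 4096-byte block size are all made up for illustration. The sum_bytes callback is just a CPU stand-in marking where the CUDA calls (copy to device, kernel launch) would go in a real daemon.

```c
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical hook: a real daemon would do its CUDA work here
   (copy the block to the device, launch a kernel). Returns 0 on success. */
typedef int (*process_fn)(const uint8_t *buf, size_t len, void *ctx);

/* CPU stand-in for the GPU work: adds every byte into a uint64_t total. */
int sum_bytes(const uint8_t *buf, size_t len, void *ctx)
{
    uint64_t *total = ctx;
    for (size_t i = 0; i < len; i++)
        *total += buf[i];
    return 0;
}

/* Block until fd has data, read one block, hand it to process(), repeat.
   Stops at end of file, or after max_blocks blocks when max_blocks > 0.
   Returns the number of blocks processed, or -1 on error. */
int forward_loop(int fd, process_fn process, void *ctx, int max_blocks)
{
    uint8_t buf[4096];               /* assumed block size */
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int handled = 0;

    while (max_blocks <= 0 || handled < max_blocks) {
        int rc = poll(&pfd, 1, -1);  /* sleep until the driver signals data */
        if (rc < 0) {
            if (errno == EINTR)
                continue;            /* interrupted by a signal: retry */
            return -1;
        }
        if (pfd.revents & (POLLIN | POLLHUP)) {
            ssize_t n = read(fd, buf, sizeof buf);
            if (n < 0)
                return -1;
            if (n == 0)
                break;               /* end of file: writer closed */
            if (process(buf, (size_t)n, ctx) != 0)
                return -1;
            handled++;
        } else if (pfd.revents & (POLLERR | POLLNVAL)) {
            return -1;
        }
    }
    return handled;
}
```

A daemon would open the driver's device node, call forward_loop on that descriptor with max_blocks set to 0, and keep the CUDA context alive across calls. Note that the wake-up goes through the normal scheduler, so this does not by itself give any real-time guarantee.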


I’m curious to hear more about this. (I’m not surprised – we have to make the CUBLAS routines very general and so have to skip some optimizations – but I am curious.)