Now that in-kernel malloc() has been released (I found no documentation stating that it only works on 2.x devices), and since the Tesla C1060 (compute capability 1.3) appears to have an MMU (see this page), I wonder whether the C1060 can dynamically allocate memory in kernels or device functions. I haven't found any documentation supporting this yet.
I tried calling malloc() inside a kernel. Compiling with nvcc and no options gave this error:
calling a host function from a device/global function is not allowed
Compiling with -arch=sm_20 or -arch=sm_21 produced no errors. However, the output at run time was wrong, which is to be expected since the device is 1.3, not 2.x.
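A test of this kind can be sketched as follows. It is a minimal sketch under assumptions, not the exact code from the attempt above: the kernel name allocKernel, the flag d_ok, and the file name malloc_test.cu are made up for illustration. It prints the device's compute capability and checks the launch for errors, so that a launch that fails is not mistaken for wrong output.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical test kernel: allocate a small buffer from the device heap,
// write to it, read it back, and report success through *ok.
// In-kernel malloc()/free() only compile for -arch=sm_20 or newer.
__global__ void allocKernel(int *ok)
{
    int *p = (int *)malloc(4 * sizeof(int));
    if (p != NULL) {
        p[0] = 42;
        *ok = (p[0] == 42) ? 1 : 0;
        free(p);
    } else {
        *ok = 0;  // allocation failed on the device
    }
}

int main()
{
    // Report what the runtime says about device 0.
    cudaDeviceProp prop;
    cudaError_t err = cudaGetDeviceProperties(&prop, 0);
    if (err != cudaSuccess) {
        printf("cudaGetDeviceProperties failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("Device 0: %s, compute capability %d.%d\n",
           prop.name, prop.major, prop.minor);

    int *d_ok = NULL;
    int h_ok = -1;
    err = cudaMalloc((void **)&d_ok, sizeof(int));
    if (err != cudaSuccess) {
        printf("cudaMalloc failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    cudaMemset(d_ok, 0, sizeof(int));

    allocKernel<<<1, 1>>>(d_ok);

    // Check the launch itself first, then wait for the kernel to finish.
    err = cudaGetLastError();
    if (err == cudaSuccess) {
        err = cudaDeviceSynchronize();
    }
    if (err != cudaSuccess) {
        printf("Kernel did not run: %s\n", cudaGetErrorString(err));
        cudaFree(d_ok);
        return 1;
    }

    cudaMemcpy(&h_ok, d_ok, sizeof(int), cudaMemcpyDeviceToHost);
    printf("In-kernel malloc %s\n", h_ok == 1 ? "succeeded" : "failed");

    cudaFree(d_ok);
    return 0;
}
```

Assuming the file is saved as malloc_test.cu, it would be built with something like nvcc -arch=sm_20 malloc_test.cu on a toolkit that still accepts that target. On older toolkits, cudaThreadSynchronize() may be needed in place of cudaDeviceSynchronize().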
So it seems in-kernel malloc() is only available on 2.x devices (although I have seen no documentation saying so). Is this because of a hardware difference between 1.3 and 2.x devices? Someone told me there is no real difference, so dynamic allocation should be possible on 1.3 devices. Is he correct?
Just in case, here are my system settings:
OS: Ubuntu 10.04 (64-bit)
CUDA device: Tesla C1060 (x2)