CUDA limit of 256 bytes for parameter passing

Hi.
On CC 1.x devices, the parameters passed from host code to a global function are limited to 256 bytes in total. So in Fortran, I get a compile error if a global subroutine has too many arguments. The question is, if the device data is defined in a module rather than passed as an argument, does it count toward this limit?
Tuan

Hi Tuan,

> The question is, if the device data is defined in a module rather than passed as an argument, does it count toward this limit?

No. Device data declared in a module is not passed as a kernel argument, so it does not count toward the limit. Using device module data is the recommended way of working around CUDA's 256-byte limit. This was one of the main reasons our engineers worked very hard to get support for allocatable arrays in device module data into the 10.4 release.
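A minimal sketch of the pattern, assuming a compiler release with allocatable device module arrays. The names fields_mod, add_fields, a_d, b_d and c_d are made up for illustration. The arrays live in the module, so the kernel takes only the scalar n as an argument:

```fortran
module fields_mod
  use cudafor
  implicit none
  ! Device arrays declared at module scope: visible to kernels in this
  ! module without being passed as arguments (names are hypothetical).
  real, device, allocatable :: a_d(:), b_d(:), c_d(:)
contains
  attributes(global) subroutine add_fields(n)
    integer, value :: n   ! the only kernel argument
    integer :: i
    i = (blockIdx%x - 1) * blockDim%x + threadIdx%x
    if (i <= n) c_d(i) = a_d(i) + b_d(i)
  end subroutine add_fields
end module fields_mod

program demo
  use cudafor
  use fields_mod
  implicit none
  integer, parameter :: n = 1024
  real :: a(n), b(n), c(n)

  a = 1.0
  b = 2.0

  ! Allocate the module's device arrays from the host, then copy data in
  allocate(a_d(n), b_d(n), c_d(n))
  a_d = a
  b_d = b

  call add_fields<<<(n + 255) / 256, 256>>>(n)

  ! Assignment from a device array copies the result back to the host
  c = c_d
  print *, 'max abs error: ', maxval(abs(c - 3.0))

  deallocate(a_d, b_d, c_d)
end program demo
```

However many arrays you add to the module, the kernel's argument list stays the same size.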

- Mat