max number of arguments for kernel function

I have a CUDA Fortran kernel function with 41 arguments. (I should pack some of the arguments.)
The compiler doesn’t give any warning message, but the kernel has some error that seems to be related to the number of arguments.
Is there a limit on the number of arguments? I didn’t find one in the programming guide.

The compiler really should give an error message; there really is a limit on the size of the argument list. This is an NVIDIA CUDA limitation, not a limit imposed by CUDA Fortran vs. CUDA C. The NVIDIA argument list limit is 256 bytes. A four-byte integer takes 4 bytes; a pointer takes 8 bytes in the 64-bit compiler and 4 bytes in the 32-bit compiler. Real takes 4 bytes, double precision takes 8. If the function has assumed-shape array arguments, then the compiler passes a descriptor with the array, and the descriptor is rather large: (8+6*dim)+intsize + 2*pointersize bytes, where dim is the number of dimensions.
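To make the byte counting concrete, here is a minimal sketch in Python that tallies an argument list against the 256-byte limit, using the sizes quoted above. The function name `argument_bytes` and the 30-pointer / 11-integer split are hypothetical, chosen only to illustrate a 41-argument kernel. The sketch assumes the 64-bit compiler and ignores alignment padding and assumed-shape array descriptors, so a real argument list can be larger than this estimate.

```python
# Byte sizes quoted in the reply above (64-bit compiler assumed for pointers).
SIZES = {"integer": 4, "real": 4, "double": 8, "pointer": 8}
LIMIT_BYTES = 256  # argument-list limit stated in the reply


def argument_bytes(counts, sizes=SIZES):
    """Sum the bytes taken by an argument list given as {kind: count}.

    Ignores alignment padding and assumed-shape array descriptors,
    so the real total can be larger than the value returned here.
    """
    return sum(sizes[kind] * n for kind, n in counts.items())


# Hypothetical 41-argument kernel: 30 arrays passed as pointers plus 11 integers.
example = {"pointer": 30, "integer": 11}
total = argument_bytes(example)
print(total, "bytes; over the limit:", total > LIMIT_BYTES)
```

With these assumed counts the list comes to 284 bytes, so even a kernel with no assumed-shape arrays can exceed the limit well before 41 arguments if most of them are 8-byte pointers.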

We will file a request that the compiler check the argument limit and issue a meaningful error message if it is exceeded. Thank you for your comment.

May I ask whether this limit is still the same?

I’ve run into an issue using 13.2 with compute capability 2.0 where the behavior of the kernel simply becomes ‘strange’ once I exceed 256 bytes of arguments. From what I can find using cuda-memcheck, some arguments get corrupted, especially the slice sizes I’m passing in. As a result, the arrays don’t get initialized correctly and I get invalid reads/writes (out of bounds). When I compact the argument list (but obviously not the integers used for the array bounds), the problem goes away.

It would be nice if we got at least a compiler warning when such a limit is exceeded.