I’ve run into some problems dealing with several pointers in a kernel parameter list. It might boil down to my misunderstanding of pointers in CUDA.
Are pointers sometimes/always 64-bit, on both 32-bit and 64-bit hosts?
The matrix multiply driver API example uses cuParamSeti (32-bit integer) and separates pointers by only 4 bytes. However, other posts on this forum say that the separation must be 8 bytes, which matches some of my past experience. If so, what happens to the other 4 bytes that cuParamSeti doesn’t touch?
Thanks, I switched to the (hidden) flag -m32 and made the pointer separation 4 bytes, and my parameter problem went away. The problem was that when I added a few unused pointers as parameters to a kernel, separated by 8 bytes, CUDA read the subsequent parameters from an incorrect offset.
I wonder why there’s 64-bit support in there at all. CUdeviceptrs are 32-bit and cuParamSeti is 32-bit (I guess cuParamSetv could work).
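For anyone hitting the same thing, here is a rough sketch of the offset arithmetic in plain C (no CUDA calls). It assumes each pointer parameter is aligned to its own size and that the parameter after the pointers is a 4-byte int. The helper names align_up and offset_after_pointers are made up for illustration and are not part of the driver API.

```c
#include <assert.h>
#include <stddef.h>

/* Round offset up to the next multiple of align.
   Hypothetical helper; align must be a nonzero power of two. */
static size_t align_up(size_t offset, size_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    return (offset + align - 1) & ~(align - 1);
}

/* Offset of a 4-byte int parameter that follows n_ptrs pointer
   parameters, where each pointer occupies ptr_size bytes and is
   assumed to be aligned to ptr_size. */
static size_t offset_after_pointers(size_t n_ptrs, size_t ptr_size) {
    size_t offset = 0;
    for (size_t i = 0; i < n_ptrs; ++i) {
        offset = align_up(offset, ptr_size); /* align this pointer */
        offset += ptr_size;                  /* step past it */
    }
    return align_up(offset, 4);              /* align the trailing int */
}
```

With three pointer parameters, offset_after_pointers(3, 4) gives 12 and offset_after_pointers(3, 8) gives 24. If the host packs with one stride and the kernel was compiled for the other, every parameter after the pointers is read from the wrong offset, which would be consistent with the symptom described above.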