The size of pointers on the device (at least, when passed as parameters) is 8 bytes, to be compatible with 64-bit and future hardware. If you want to use the driver API to pass parameters, you need to take this into account and assume pointer arguments are 8 bytes
(and thus advance the offset passed to cuParamSeti by 8 after a pointer argument).
I’ve had this same problem for over a month now. Passing pointers as 8 bytes (64 bits) doesn’t even remotely work, and 32-bit pointers only work some of the time…
I have ~20-30 unique kernel functions at present (this will only keep growing), most of which take at least one pointer, generally 2-3. ‘Most’ of them work with 32-bit pointers, NONE of them work with 64-bit pointers, and some work with 32-bit pointers ‘only’ if the parameters are ordered so that pointers come first and other variables come last (e.g. foo(int *blah, int *blah2, uint4 blah4)).
This is a very, ‘very’ frustrating problem, and has cost me weeks of delays… (the error occurs on both Linux AND Windows).
Post a repro case if you think it’s broken. I’ve never seen it break and I bet you’re confused as to some of the rules (which is mostly our fault–the section in 2.0 on how to pass arguments with the driver API is terrible and has been cleaned up substantially for 2.1, I think). If you can show me what you’re doing I can try to clarify things.
Hmm, you’re correct (of course). The preconception that pointers were 32-bit, the fact that the programming guide does not specifically say they’re 64-bit, and the fact that in ‘some’ cases using 32-bit pointers worked (‘why’ this works is still beyond me) led me into all the problems I’m having now.
Previously, attempting to pass 32bit pointers like so would result in errors in some kernels (yet still work in ‘most’ of mine? why?):
Now, passing pointers as 64bit pointers like so seems to be working for all the kernels I’ve ported so far:
[codebox]cuParamSeti(kernel_func, offset, 0);
cuParamSeti(kernel_func, offset + 4, device_ptr);
offset += 8;[/codebox]
It’s good to hear the programming guide has been revised for 2.1 - I’m still curious why most of my existing kernels work using my first code example to pass pointer values… (all of which have pointers either at the beginning or end of the parameter list)
Edit: Fixed missing parameter offset
Second Edit: It seems to work regardless of whether I put the 32-bit pointer in the lower or the upper half of the 64-bit pointer slot…?
I don’t know why it would work whether it’s in the upper or lower half of the eight-byte region–that’s a bit odd, but you shouldn’t be doing that anyway…
Basically, the CUdeviceptr rules follow the pointer size and alignment of the platform you’re on. On a 32-bit platform, that means the offset after using a CUdeviceptr advances by 4 (basically the effective sizeof(CUdeviceptr)), and they have to be four-byte aligned. On a 64-bit platform, the offset advances by 8 and they have to be eight-byte aligned. That is true even though sizeof(CUdeviceptr) is 4.
which is the only semi-portable code I could come up with on both 32- and 64-bit platforms. This should handle alignment too. It would be nice to have an official method supported by NVIDIA along with the rules for parameter passing. I too have had problems on 64-bit Linux and solved them by juggling parameters. This kind of experimental approach is not very productive.
I am having the same problem on 64-bit Linux (I have not tested on 64-bit Windows but likely it’s a 64-bit and not a Linux specific issue).
I get nowhere passing CUdeviceptr as an 8-byte value (or using an 8-byte offset), but I am able to make my kernels work by passing them as 4-byte values, and only when they are at the beginning of the argument list. If I insert any intervening 4-byte values like floats or ints, I get garbage output in my arrays.
Obviously, there is some 64-bit issue with parameter passing in CUDA 2.0 on 64-bit platforms. It looks like NVIDIA is having a look and you can search for my recent posts to find the thread.
If you’ve figured out some workaround, please let me know, as this is a show stopper for me.
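For what it's worth, the symptom above (pointers only working at the front of the argument list) fits the alignment rule given earlier in the thread: on a 64-bit host a pointer argument has to start on an 8-byte boundary, so a 4-byte int or float ahead of it pushes it to offset 8, not 4. Below is a minimal sketch of that arithmetic in plain C, with no CUDA calls involved. `align_up` and `pointer_offset` are hypothetical helper names made up for illustration, not part of the driver API.

```c
#include <assert.h>
#include <stddef.h>

/* Round offset up to the next multiple of align.
 * align must be a non-zero power of two. */
static size_t align_up(size_t offset, size_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    return (offset + align - 1) & ~(align - 1);
}

/* Hypothetical helper: the offset at which a pointer argument must start
 * when bytes_before bytes of earlier arguments precede it and the host
 * pointer size is ptr_size (4 or 8). */
static size_t pointer_offset(size_t bytes_before, size_t ptr_size) {
    return align_up(bytes_before, ptr_size);
}
```

With these helpers, a pointer that follows a single 4-byte float lands at offset 4 when pointers are 4 bytes and at offset 8 when they are 8 bytes, which leaves 4 bytes of padding in the second case.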
Hi - somewhat late in the day to answer this but it may help somebody out there…
Never use cuParamSeti to pass pointers. cuParamSeti always uses a 32-bit representation for both host and device. The host may be 64-bit and/or the device may be 64-bit (Fermi devices with code compiled by nvcc in 64-bit mode).
Use cuParamSetv to pass pointers, make sure you pass the right size, and increment your stack size (offset) by sizeof(void*).
ALSO, be very careful that you ALIGN your variable to the right amount for (void*), which on 64-bit systems will be an 8-byte boundary. Yes, this does mean you may leave some bytes of unused stuff in your parameter stack.
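A rough sketch of the align, copy, advance pattern described above, written in plain C so it can be tried without a GPU. `ParamStack`, `param_push` and `param_push_ptr` are hypothetical names for this illustration only. The assumption is that in real driver API code the `memcpy` would be a `cuParamSetv` call at the same offset, and the final `offset` would be the size handed to `cuParamSetSize`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PARAM_STACK_MAX 256  /* arbitrary capacity chosen for this sketch */

/* Hypothetical stand-in for a kernel's parameter stack. */
typedef struct {
    unsigned char bytes[PARAM_STACK_MAX];
    size_t offset;  /* next free byte; total size once all args are pushed */
} ParamStack;

/* Align the current offset to `align` (a power of two), copy `size` bytes
 * of the argument there, and advance past it.
 * Returns the offset at which the argument was stored. */
static size_t param_push(ParamStack *ps, const void *arg,
                         size_t size, size_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    size_t at = (ps->offset + align - 1) & ~(align - 1);
    assert(at + size <= PARAM_STACK_MAX);
    memcpy(ps->bytes + at, arg, size);  /* assumed: cuParamSetv goes here */
    ps->offset = at + size;
    return at;
}

/* Push a pointer argument using the host's own pointer size and alignment. */
static size_t param_push_ptr(ParamStack *ps, void *p) {
    return param_push(ps, &p, sizeof(void *), sizeof(void *));
}
```

Pushing a float and then a pointer this way puts the pointer at offset `sizeof(void *)` on both 32- and 64-bit hosts, so the same calling code covers both cases, at the cost of 4 padding bytes on a 64-bit host.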