So I figured out the issue and am posting it here in case anyone else runs into a similar problem. The fact that the "screwed up" output image was different each time I started the program (always with the same input image) led me to think this was a pointer issue. The CUDA.NET example code I used as a starting point sent the input values to the graphics card with code very similar to the following:
[codebox]
CUdeviceptr d_idata = cuda.CopyHostToDevice(h_idata);
CUdeviceptr d_odata = cuda.Allocate(h_idata);
CUdeviceptr d_shift = cuda.CopyHostToDevice(labShift);
cuda.SetFunctionBlockShape(function, BLOCK_DIM, BLOCK_DIM, 1);
cuda.SetParameter(function, 0, (uint)d_odata.Pointer);
cuda.SetParameter(function, IntPtr.Size, (uint)d_idata.Pointer);
cuda.SetParameter(function, IntPtr.Size * 2, (uint)d_shift.Pointer);
cuda.SetParameter(function, IntPtr.Size * 3, (uint)size_x);
cuda.SetParameter(function, IntPtr.Size * 3 + 4, (uint)size_y);
cuda.SetParameterSize(function, (uint)(IntPtr.Size * 3 + 8));[/codebox]
The reason they gave for using IntPtr.Size as the parameter offset was that those parameters are pointers, and IntPtr.Size varies with the platform (it returns 4 in a 32-bit process and 8 in a 64-bit process). However, the .Pointer property on the CUdeviceptr class returns a uint, which is always 4 bytes. So the offsets were correct in 32-bit but wrong in 64-bit: there the three pointers were written at offsets 0, 8 and 16 instead of 0, 4 and 8, and the parameter size was set to 32 instead of 20, which messed up the input parameter pointers. I fixed it with the following changes:
[codebox]
cuda.SetFunctionBlockShape(function, BLOCK_DIM, BLOCK_DIM, 1);
// Every argument is passed as a 4-byte uint, so each offset advances by 4.
cuda.SetParameter(function, 0, (uint)d_odata.Pointer);
cuda.SetParameter(function, 4, (uint)d_idata.Pointer);
cuda.SetParameter(function, 8, (uint)d_shift.Pointer);
cuda.SetParameter(function, 12, (uint)size_x);
cuda.SetParameter(function, 16, (uint)size_y);
// Total size: 5 parameters * 4 bytes each.
cuda.SetParameterSize(function, (uint)(20));[/codebox]
So hard-coding the offsets fixed the problem, and now my pointers are correct in both 32-bit and 64-bit. Hope this helps someone else in the future.
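If you would rather not hard-code the numbers, one option is to compute each offset from the size of the value that is actually being passed. Below is a rough sketch of that idea in plain C#. `ParamLayout`, `Offsets` and `TotalSize` are names I made up for illustration and are not part of CUDA.NET. It also ignores alignment, which I am assuming is fine here because every value passed is a 4-byte uint.

```csharp
using System;

// Hypothetical helper (not part of CUDA.NET): derives parameter offsets
// from the byte size of each value that is actually passed.
static class ParamLayout
{
    // Offset of each parameter = sum of the sizes of the parameters before it.
    public static int[] Offsets(params int[] sizes)
    {
        var offsets = new int[sizes.Length];
        int running = 0;
        for (int i = 0; i < sizes.Length; i++)
        {
            offsets[i] = running;
            running += sizes[i];
        }
        return offsets;
    }

    // Value to hand to SetParameterSize: the sum of all parameter sizes.
    public static int TotalSize(params int[] sizes)
    {
        int total = 0;
        foreach (int s in sizes)
        {
            total += s;
        }
        return total;
    }
}
```

With five uint arguments, `ParamLayout.Offsets(4, 4, 4, 4, 4)` gives 0, 4, 8, 12, 16 and `ParamLayout.TotalSize(4, 4, 4, 4, 4)` gives 20, the same numbers as the hard-coded version. Passing `sizeof(uint)` instead of a literal 4 keeps the sizes tied to the type being cast to.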