Hi there,
I ran into a problem with cudaBoundaryModeTrap and surface writes after upgrading from toolkit version 4.2 to 5.0:
I run a program that always writes the same data to a surface. This worked perfectly before, but since the update the program crashes in standard vs10 debug mode with an undefined error in the kernel that writes to the surface.
Normally I would understand this: as described in the programming guide, a kernel run leads to an undefined error if the coordinates at which I access the surface are out of bounds and the boundary mode is cudaBoundaryModeTrap (the default).
Consistent with that, my kernel run succeeds when I change the boundary mode to *Zero or *Clamp. (I’m still running the standard vs10 debugger.)
From this I would expect that if I add an if condition that is entered only when the coordinates are out of bounds, the NSight debugger must step into that branch when I start the program with NSight instead of the standard vs10 debugger.
In code:
    if (nX < 0 || nX >= sizeVolume[0] ||
        nY < 0 || nY >= sizeVolume[1] ||
        nZ < 0 || nZ >= sizeVolume[2])
    {
        surf3Dwrite(make_uchar2(127,255), // put a breakpoint for nsight here
                    mySurfRef,
                    sizeof(uchar2)*1, 1, 1, // choose a coordinate that MUST be correct
                    cudaBoundaryModeTrap);
    }
    else
    {
        surf3Dwrite(make_uchar2(127,255),
                    mySurfRef,
                    sizeof(uchar2)*nX, nY, nZ,
                    cudaBoundaryModeTrap);
    }
But NSight never stops at that breakpoint, not even once.
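One way to cross-check the bounds test without relying on a debugger breakpoint might be to count the out-of-bounds threads in a device counter and read it back on the host. The following is only a minimal sketch under stated assumptions: `countOutOfBounds` and `launchCountOutOfBounds` are hypothetical names, the volume size is passed as an `int3` rather than the `sizeVolume[]` array used above, and the index computation is a guess at what the real kernel does. No surface is touched here, so the count reflects the coordinate check alone.

```cuda
#include <cassert>
#include <cuda_runtime.h>

// Hypothetical helper kernel: counts threads whose coordinates lie outside
// the volume. It performs only the bounds check, no surface access.
__global__ void countOutOfBounds(int3 sizeVolume, unsigned int* outOfBoundsCount)
{
    // Assumed index computation; substitute whatever the real kernel uses.
    int nX = blockIdx.x * blockDim.x + threadIdx.x;
    int nY = blockIdx.y * blockDim.y + threadIdx.y;
    int nZ = blockIdx.z * blockDim.z + threadIdx.z;

    if (nX < 0 || nX >= sizeVolume.x ||
        nY < 0 || nY >= sizeVolume.y ||
        nZ < 0 || nZ >= sizeVolume.z)
    {
        atomicAdd(outOfBoundsCount, 1u);  // one increment per offending thread
    }
}

// Hypothetical host wrapper: launches the check and returns the counter value.
unsigned int launchCountOutOfBounds(int3 sizeVolume, dim3 grid, dim3 block)
{
    assert(sizeVolume.x > 0 && sizeVolume.y > 0 && sizeVolume.z > 0);

    unsigned int* dCount = nullptr;
    unsigned int hCount = 0;
    cudaMalloc(&dCount, sizeof(unsigned int));
    cudaMemset(dCount, 0, sizeof(unsigned int));

    countOutOfBounds<<<grid, block>>>(sizeVolume, dCount);

    // The blocking copy waits for the kernel to finish before reading.
    cudaMemcpy(&hCount, dCount, sizeof(unsigned int), cudaMemcpyDeviceToHost);
    cudaFree(dCount);
    return hCount;
}
```

Calling `launchCountOutOfBounds` with the same grid and block dimensions as the real launch should return zero if no thread ever produces an out-of-bounds coordinate, which would match the breakpoint never being hit.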
Could a surface write with cudaBoundaryModeTrap also fail if the pixel or alpha value lies outside [0,255]? (I assume not, because I already checked that with another if clause.)
I am really confused now, because all my surf3Dwrite calls work properly when I simply start with the NSight debugger, while the kernel fails with an “undefined error” when I use the standard vs10 debugger.
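To narrow down what the “undefined error” actually is, it may help to query the runtime right after the launch instead of waiting for a later call to fail. Below is a minimal sketch of that pattern, assuming nothing about the real kernel: `dummyKernel`, `checkKernel` and `runDummy` are hypothetical names, and the dummy kernel merely stands in for the surface-writing one. Only `cudaGetLastError`, `cudaDeviceSynchronize` and `cudaGetErrorString` come from the CUDA runtime API.

```cuda
#include <cassert>
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in for the surface-writing kernel.
__global__ void dummyKernel(int* out)
{
    out[threadIdx.x] = threadIdx.x;
}

// Returns the first error from either the launch or the kernel execution
// and prints its description to stderr.
cudaError_t checkKernel(const char* label)
{
    assert(label != nullptr);

    cudaError_t err = cudaGetLastError();   // launch/configuration errors
    if (err == cudaSuccess)
        err = cudaDeviceSynchronize();      // errors raised while the kernel ran
    if (err != cudaSuccess)
        fprintf(stderr, "%s failed: %s\n", label, cudaGetErrorString(err));
    return err;
}

// Hypothetical driver: allocates a small buffer, launches, and checks.
cudaError_t runDummy()
{
    int* dOut = nullptr;
    cudaError_t err = cudaMalloc(&dOut, 32 * sizeof(int));
    if (err != cudaSuccess)
        return err;

    dummyKernel<<<1, 32>>>(dOut);
    err = checkKernel("dummyKernel");

    cudaFree(dOut);
    return err;
}
```

Placing a call like `checkKernel("myKernel")` directly after the real launch should report the same error string whether the program is started from the standard debugger or from NSight, which makes the two runs easier to compare.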