Error "CUDA_EXCEPTION_1" when using memcheck

Hi all.

First post here, I’m hoping someone can enlighten me on this error. I have a “CUDA_EXCEPTION_1” error from a piece of code carrying out an interpolation, it’s not always fired, so assume a race condition type fault, but have been having problems tracking this down further. Apologies if the code is hard to read

__device__ float _interpolate( float x, float y, float *img, int width, int height)

{

  int xt = (int) x; 

  int yt = (int) y; // Top left corner

  float ax = x - xt; 

  float ay = y - yt; // Fractional offsets

  float *ptr = img + (width*yt) + xt; // Pointer to top left corner of image, a 1D array with step of "width"

return ( (1-ax)*(1-ay)* *ptr+ax *(1-ay)* *(ptr+1) + 

      (1-ax)*ay* *(ptr+(width)) + 

      ax*ay* *(ptr+(width)+1) );

}

Assertions and error checks has been removed for readability, but all checks on variables come back with correct values (debugger also confirms this). Code does work if converted to serial C function. Debug output, with memcheck on, gives this:

Program received signal CUDA_EXCEPTION_1, Lane Illegal Address.

[Switching focus to CUDA kernel 34, grid 35, block (12,0,0), thread (0,0,0), device 0, sm 3, warp 0, lane 0]

0x00000000009f6948 in _interpolate (x=warning: Variable is not live at this point. Returning garbage value.

1.79366203e-42, y=warning: Variable is not live at this point. Returning garbage value.

1.79366203e-42, img=warning: Variable is not live at this point. Returning garbage value.

0xfffb3800000500, width=320, height=warning: Variable is not live at this point. Returning garbage value.

  1. at cudaTrackFeatures.cu:172

172 return ( (1-ax)(1-ay) *ptr+ax (1-ay) *(ptr+1)+

The “dead” variables are, I assume, marked as such because the compiler has purged them before the return statement (and breaking before this point shows the variables do have the expected values), it’s the “CUDA_EXCEPTION_1, Lane Illegal Address” error that I am having trouble tracing.

Has anyone come across this error in similar code? Any help would be greatly appreciated

Btw, system specs are Tesla C2070, Ubuntu 10.04 64bit, and CUDA 4.0

Thanks, Pete

This is usually caused by overwriting a pointer. I would check your indexing (especially the when it comes to corner cases). Also you can comment out certain terms of the interpolation while leaving others active in order to determine which term is causing the problem.

Thanks for the quick reply. Yeah, I had thought it could be an indexing problem, and the code checks are in place to ensure that it’s not reading past the end of the data (no writes are made apart from local variables), but I deleted them from the code snippet to keep it brief.

Commenting out certain terms was something that I hadn’t done yet. Doing this the error can fire at any point at which the pointer is used to access memory. I have tried switching to array notation (just grasping at straws) but the same error fires.

You are right about the OOB error though, as when run through cuda-memcheck, output is:

========= Invalid global read of size 4

========= at 0x00003398 in cudaTrackFeatures.cu:914:_cudaTrackFeatures_kernel

========= by thread (5,0,0) in block (6,0,0)

========= Address 0x20091f304 is out of bounds

=========

========= ERROR SUMMARY: 1 error

when I track this down it does also point to the return statement:

return ( (1-ax)*(1-ay) * *ptr + ax*(1-ay) * *(ptr+1) + (1-ax)*ay * *(ptr+(ncols)) + ax*ay * *(ptr+(ncols)+1) );

Have you checked that [font=“Courier New”]ax[/font] and [font=“Courier New”]ay[/font] are always between 0 and 1? The C language does not guarantee this for your example as the cast from float to int is not sufficiently specified.

It seems perfectly well defined unless it results in a value not presentable as an int. The C99 standard says:

6.3.1.4 Real floating and integer

1 When a finite value of real floating type is converted to an integer type other than _Bool,

  the fractional part is discarded (i.e., the value is truncated toward zero). If the value of

  the integral part cannot be represented by the integer type, the behavior is undefined.50)

In any event it doesn’t look like variables ax and ay enter into any addressing expressions. I am not familiar with “CUDA_EXCEPTION_1, Lane Illegal Address”, but I assume it is some sort of out-of-bounds access, meaning one would want to check the value of ptr, and the addressing expressions based on width and height. For example, is it guaranteed that (ptr+(width)+1) points to a location inside the image? Why is height passed to the function, it doesn’t appear to be used?

AFAIR the clarification of the rounding mode was only added in C99, which isn’t part of the C++ specification. I might be wrong though in that somewhere there is a similar clarification for C++ (I’m not a compiler guy). Anyway of course it seems unlikely that compilers implement different rounding modes for the same cast in C99 and C++ just to confuse people.

You’re correct that ax and ay are never used for array addressing, they are just coefficients in the interpolation. Height is used as a variable check in the full function, I removed a lot of the assertions and variable checks to keep the code snippet to a more manageable size. I have checks to ensure that ptr can’t reference further along the memory space than is available.

I’ll run variable checks again in the morning, as it has to be an OOB error of some kind, they are the only verbose errors that I have to go on. Can’t help feeling that I’m missing something with the race condition though, as the program is entirely deterministic the error should fire at every run, but it is still intermittent.

I’ll post an update as and when I make progress

Ta

Following up on the issue raised by tera (possible specification differences between C and C++ with regard to float->int conversion), the C++ standard does appear to be consistent with C99:

4.9 Floating-integral conversions [conv.fpint]
An rvalue of a floating point type can be converted to an rvalue of an integer type. The conversion trun-
cates; that is, the fractional part is discarded. The behavior is undefined if the truncated value cannot be
represented in the destination type. [Note: If the destination type is [font=“Courier New”]bool[/font], see 4.12. ]

I stand corrected. Thanks Norbert for following this up.