Suspect invalid global read with cuda-memcheck

Bast-51 · February 16, 2016, 12:27pm

Hi everybody,

First, sorry for my English; I am a French CUDA user.
I have an invalid read error in my code that I am unable to solve.
Consider the following kernel (simplified to keep it short, but the strange behavior still occurs with this simplified code):

__global__ void kernel2(int InteriorLower, int InteriorUpperX, int InteriorUpperY, int InputWidth,
        int InputHeight, int OutputWidth, int OutputHeight, const int *Stencil, const float *RhoSamples,
        int SupportSize, int SupportWidth, int ScaleFactor, const float * Input, float * Output,
        int OutputOffset, int SampleOffset) {
    int iy, ix, y, x;
    int bX = blockIdx.x,
        tX = threadIdx.x,
        bY = blockIdx.y,
        tY = threadIdx.y;
    int xx = bX * blockDim.x + tX,
        yy = bY * blockDim.y + tY;


    iy = yy - WINDOWRADIUS;
    ix = xx - WINDOWRADIUS;

    y = ScaleFactor * iy - SampleOffset;
    x = ScaleFactor * ix - SampleOffset;
    i = 3 * (x + OutputWidth * y);

    aux = Output[i];
}

cuda-memcheck reports no error for this kernel.
My actual code, however, is the following (only the last line differs):

__global__ void kernel2(int InteriorLower, int InteriorUpperX, int InteriorUpperY, int InputWidth,
        int InputHeight, int OutputWidth, int OutputHeight, const int *Stencil, const float *RhoSamples,
        int SupportSize, int SupportWidth, int ScaleFactor, const float * Input, float * Output,
        int OutputOffset, int SampleOffset) {
    int iy, ix, y, x;
    int bX = blockIdx.x,
        tX = threadIdx.x,
        bY = blockIdx.y,
        tY = threadIdx.y;
    int xx = bX * blockDim.x + tX,
        yy = bY * blockDim.y + tY;


    iy = yy - WINDOWRADIUS;
    ix = xx - WINDOWRADIUS;

    y = ScaleFactor * iy - SampleOffset;
    x = ScaleFactor * ix - SampleOffset;
    i = 3 * (x + OutputWidth * y);

    aux = Output[i];
    Output[i] = aux;
}

Here, using cuda-memcheck, I get the following error:

========= CUDA-MEMCHECK
========= Invalid __global__ read of size 4
=========     at 0x000003f0 in sinterp.cu:717:kernel2(int, int, int, int, int, int, int, int const *, float const *, int, int, int, float const *, float*, int, int)
=========     by thread (0,8,0) in block (2,0,0)
=========     Address 0x03110dcc is out of bounds
...

The reported line (717) corresponds to:

aux = Output[i];

in the previous kernel.

So I get no error at this line in the first case, but I get one in the second. I don't see why the last line

Output[i] = aux;

would affect the previous one.

Any ideas?

Thank you all in advance.
Bastien.

Robert_Crovella · February 16, 2016, 1:11pm

Compiler optimization.

Without the last line that is writing to global data, the compiler can optimize away (eliminate) the code that is causing the invalid read. Obviously the last line is not causing the invalid read. But if you don’t set any global state in your kernel, your kernel code can be eliminated (it does nothing).

You can confirm that your kernel reduces to almost nothing with cuobjdump -sass
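
As a minimal sketch of that check, assume a file named dce_demo.cu (a hypothetical name) holding two toy kernels. These are not the kernels from this thread, and run_demo is a hypothetical helper that is only there so the kernels can be launched once.

```cuda
// dce_demo.cu (hypothetical file name): toy kernels, not the kernel from this thread.
#include <cuda_runtime.h>

// The loaded value is never used, so an optimizing build may drop the load.
__global__ void load_only(const float *in, int i) {
    float aux = in[i];
    (void)aux;
}

// The loaded value feeds a global store, so the load has to be kept.
__global__ void load_and_store(const float *in, float *out, int i) {
    out[i] = in[i] + 1.0f;
}

// Hypothetical host helper: launches both kernels once, returns 0 on success.
int run_demo() {
    float *in = nullptr, *out = nullptr;
    if (cudaMalloc(&in, sizeof(float)) != cudaSuccess) return 1;
    if (cudaMalloc(&out, sizeof(float)) != cudaSuccess) { cudaFree(in); return 1; }
    cudaMemset(in, 0, sizeof(float));
    load_only<<<1, 1>>>(in, 0);
    load_and_store<<<1, 1>>>(in, out, 0);
    cudaError_t err = cudaDeviceSynchronize();
    cudaFree(in);
    cudaFree(out);
    return err == cudaSuccess ? 0 : 1;
}

// Inspect the generated machine code (device code only, so no main is needed):
//   nvcc -cubin -o dce_demo.cubin dce_demo.cu
//   cuobjdump -sass dce_demo.cubin
```

In the listing, one would expect load_only to contain no global load instruction, while load_and_store keeps one. Building with -G disables device code optimization, in which case the load should remain in both kernels.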

Bast-51 · February 16, 2016, 2:19pm

Thank you for your answer.
I do need the second version (because, as you said, the first one does nothing), so I clearly have a problem with my index i.

I have been trying to debug it for hours now, but I have not managed to find the error…

Thank you again!

Robert_Crovella · February 16, 2016, 8:34pm

You don’t actually have the variable i defined anywhere in your kernel code that I can see. I presume it would be an integer-type variable.

You can use printf in device code. If you know what the valid range for i should be (MIN_I <= i <= MAX_I), then just add a conditional test like this:

if ((i < MIN_I) || (i > MAX_I)) {
    printf("i out of range, i: %d xx: %d yy: %d x: %d, y: %d\n", i, xx, yy, x, y);
} else {
    aux = Output[i];
    Output[i] = aux;
}
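
A fuller sketch of the same idea follows, with several assumptions that are not from the thread: the value of WINDOWRADIUS, the helper pixel_in_image, the kernel name kernel2_checked, and an Output buffer that holds 3 floats for each of OutputWidth * OutputHeight pixels. It tests x and y separately rather than i alone, because an out-of-range x can still produce an in-range i: with x = -1 and y = 1, i = 3 * (OutputWidth - 1), which is a valid index but belongs to a different pixel.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define WINDOWRADIUS 4  // hypothetical value; the thread does not give it

// Hypothetical helper: true if pixel (x, y) lies inside a width-by-height image.
__host__ __device__ inline bool pixel_in_image(int x, int y, int width, int height) {
    return x >= 0 && x < width && y >= 0 && y < height;
}

// Reduced parameter list for illustration; the index math matches the posted kernel.
__global__ void kernel2_checked(int OutputWidth, int OutputHeight, int ScaleFactor,
                                int SampleOffset, float *Output) {
    int xx = blockIdx.x * blockDim.x + threadIdx.x;
    int yy = blockIdx.y * blockDim.y + threadIdx.y;
    int x = ScaleFactor * (xx - WINDOWRADIUS) - SampleOffset;
    int y = ScaleFactor * (yy - WINDOWRADIUS) - SampleOffset;

    if (!pixel_in_image(x, y, OutputWidth, OutputHeight)) {
        printf("out of range: xx: %d yy: %d x: %d y: %d\n", xx, yy, x, y);
        return;  // skip the access instead of reading out of bounds
    }

    int i = 3 * (x + OutputWidth * y);  // assumes 3 floats per output pixel
    float aux = Output[i];
    Output[i] = aux;
}
```

With a positive ScaleFactor and a non-negative SampleOffset, every thread with xx < WINDOWRADIUS or yy < WINDOWRADIUS gets a negative x or y, so those threads are the first ones worth looking at in the printed output.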
