Tracking Invalid read size and illegal memory access

senanur · May 24, 2016, 3:05am

Greetings,
I’m currently trying to implement a genetic algorithm using CUDA. I use the kernel below to evaluate each individual.

__global__ void evaluate(int * population,
                         int * distance, 
                         int * cost,
                         int nTowns,
                         int * d_index)
{
    int sum = 0;
    int t0, t1, idx;

    idx = threadIdx.x + blockIdx.x * blockDim.x;

    for (size_t i = 1; i < nTowns; i++) {
        t0 = idx * nTowns + (i - 1);
        t1 = idx * nTowns + i;
        
        sum = sum + distance[population[t0] * nTowns + population[t1]];
    }
    t0 = idx * nTowns + nTowns - 1;
    t1 = idx * nTowns;
    cost[idx] = sum + distance[population[t0] * nTowns + population[t1]];

    d_index[idx] = threadIdx.x;
}

I occasionally get errors from this code, roughly 2-3 times out of 100 runs. I then ran it under cuda-memcheck and got these outputs:

GPUassert: an illegal memory access was encountered ga_tes_3a.cu 469
========= CUDA-MEMCHECK
========= Program hit cudaErrorIllegalAddress (error 77) due to "an illegal memory access was encountered" on CUDA API call to cudaDeviceSynchronize. 
...

GPUassert: unspecified launch failure ga_tes_3a.cu 469
========= CUDA-MEMCHECK
========= Invalid __global__ read of size 4
=========     at 0x000000e0 in evaluate(int*, int*, int*, int, int*)
=========     by thread (327,0,0) in block (7,0,0)
=========     Address 0x3c45c467c is out of bounds
...
========= Program hit cudaErrorLaunchFailure (error 4) due to "unspecified launch failure" on CUDA API call to cudaDeviceSynchronize.

How can I track down this error? Any idea why this is happening?

I’m sorry if my English is bad.

Robert_Crovella · May 24, 2016, 3:42am

For out-of-bounds addressing errors in kernel code (e.g. your “Invalid __global__ read of size 4”), cuda-memcheck can localize the error to a specific line of your kernel code if you compile with the -lineinfo switch.

This is an extremely useful debugging technique. A longer writeup/example is here:

http://stackoverflow.com/questions/27277365/unspecified-launch-failure-on-memcpy/27278218#27278218
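As a minimal sketch of that workflow (not taken from the linked answer), the snippet below contains one indexed read that can go out of bounds in the same way distance[population[...]] can. The names lookup and run_lookup are hypothetical, and the build and run commands in the comments assume a source file called my_app.cu.

```cuda
// Build with line info:   nvcc -lineinfo -o my_app my_app.cu
// Run under the checker:  cuda-memcheck ./my_app
#include <cassert>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Reads table[index[idx]]. An index value outside [0, n) sends the read
// out of bounds, much as distance[population[...]] can in evaluate().
__global__ void lookup(const int *table, const int *index, int *out, int n)
{
    int idx = threadIdx.x + blockIdx.x * blockDim.x;
    if (idx < n)
        out[idx] = table[index[idx]];  // the line a memcheck report should name
}

// Hypothetical host helper: runs lookup() on n elements with index[0]
// forced to first_index. Returns the CUDA error code seen after
// synchronizing (0 is cudaSuccess).
int run_lookup(int n, int first_index)
{
    assert(n > 0);
    std::vector<int> h_index(n);
    for (int i = 0; i < n; i++) h_index[i] = i;
    h_index[0] = first_index;

    int *d_table = nullptr, *d_index = nullptr, *d_out = nullptr;
    size_t bytes = sizeof(int) * n;
    cudaMalloc(&d_table, bytes);
    cudaMalloc(&d_index, bytes);
    cudaMalloc(&d_out, bytes);
    cudaMemset(d_table, 0, bytes);
    cudaMemcpy(d_index, h_index.data(), bytes, cudaMemcpyHostToDevice);

    lookup<<<(n + 255) / 256, 256>>>(d_table, d_index, d_out, n);
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess)
        fprintf(stderr, "kernel failed: %s\n", cudaGetErrorString(err));

    cudaFree(d_table);
    cudaFree(d_index);
    cudaFree(d_out);
    return static_cast<int>(err);
}
```

Calling run_lookup(1024, 1 << 30) from main and running the program under cuda-memcheck should produce an invalid-read report for the marked line, while run_lookup(1024, 0) should run cleanly. On recent CUDA toolkits, compute-sanitizer takes the place of cuda-memcheck.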

Once you know the line of code and the thread/block IDs:

=========     by thread (327,0,0) in block (7,0,0)

that may be enough for you to figure out what is going on, or you can add some directed printf statements to your kernel code:

if ((threadIdx.x == 327) && (blockIdx.x == 7)) printf("....", ...);
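Filled in for the evaluate kernel from the first post, a directed printf might look like the sketch below. It rests on assumptions the thread does not confirm: population holds nInd * nTowns town ids, distance is an nTowns x nTowns matrix, and the names evaluate_debug and run_evaluate_debug are hypothetical. Since printf output from a kernel that faults may never reach the host, this variant also skips any read whose town id is out of range.

```cuda
#include <cassert>
#include <cstdio>
#include <cuda_runtime.h>

// Debug variant of evaluate(): prints every town pair read by one chosen
// thread and skips distance[] reads that would fall outside the matrix.
__global__ void evaluate_debug(const int *population, const int *distance,
                               int *cost, int nTowns,
                               int dbgThread, int dbgBlock)
{
    int idx = threadIdx.x + blockIdx.x * blockDim.x;
    bool watch = (threadIdx.x == dbgThread) && (blockIdx.x == dbgBlock);
    int sum = 0;

    for (int i = 0; i < nTowns; i++) {
        int t0 = idx * nTowns + i;
        int t1 = idx * nTowns + (i + 1) % nTowns;  // wraps to close the tour
        int a = population[t0];
        int b = population[t1];
        if (watch)
            printf("idx=%d i=%d pop[%d]=%d pop[%d]=%d\n", idx, i, t0, a, t1, b);
        if (a < 0 || a >= nTowns || b < 0 || b >= nTowns)
            continue;  // skip the read that would be out of bounds
        sum += distance[a * nTowns + b];
    }
    cost[idx] = sum;
}

// Hypothetical host helper: one block, one thread per individual.
// Returns the CUDA error code after synchronizing (0 is cudaSuccess).
int run_evaluate_debug(const int *h_pop, const int *h_dist, int *h_cost,
                       int nInd, int nTowns, int dbgThread, int dbgBlock)
{
    assert(nInd > 0 && nInd <= 1024 && nTowns > 0);
    size_t popBytes = sizeof(int) * nInd * nTowns;
    size_t distBytes = sizeof(int) * nTowns * nTowns;
    size_t costBytes = sizeof(int) * nInd;
    int *d_pop = nullptr, *d_dist = nullptr, *d_cost = nullptr;
    cudaMalloc(&d_pop, popBytes);
    cudaMalloc(&d_dist, distBytes);
    cudaMalloc(&d_cost, costBytes);
    cudaMemcpy(d_pop, h_pop, popBytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_dist, h_dist, distBytes, cudaMemcpyHostToDevice);

    evaluate_debug<<<1, nInd>>>(d_pop, d_dist, d_cost, nTowns,
                                dbgThread, dbgBlock);
    cudaError_t err = cudaDeviceSynchronize();
    if (err == cudaSuccess)
        cudaMemcpy(h_cost, d_cost, costBytes, cudaMemcpyDeviceToHost);

    cudaFree(d_pop);
    cudaFree(d_dist);
    cudaFree(d_cost);
    return static_cast<int>(err);
}
```

In the real program the launch configuration would stay at 8 blocks of 512 threads, with dbgThread and dbgBlock set to the thread and block that cuda-memcheck reported.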

Or you can use one of the CUDA debuggers.

senanur · May 24, 2016, 4:25am

I haven’t tried that yet, because in other runs I got 3 errors that happened in different threads. I’m sorry, I guess I should have included these earlier.

========= CUDA-MEMCHECK
========= Invalid __global__ read of size 4
=========     at 0x000000e0 in evaluate(int*, int*, int*, int, int*)
=========     by thread (358,0,0) in block (7,0,0)
=========     Address 0x3061c46fc is out of bounds
...

========= CUDA-MEMCHECK
========= Invalid __global__ read of size 4
=========     at 0x000000e0 in evaluate(int*, int*, int*, int, int*)
=========     by thread (333,0,0) in block (7,0,0)
=========     Address 0x3061c46fc is out of bounds

========= CUDA-MEMCHECK
========= Invalid __global__ read of size 4
=========     at 0x000000e0 in evaluate(int*, int*, int*, int, int*)
=========     by thread (237,0,0) in block (7,0,0)
=========     Address 0x3061c46fc is out of bounds

The only consistent things across those 3 are the memory address and the block where the thread resides. I have updated the first post to include the memory address that is out of bounds. The errors in my first post happened in a set of 50 runs, and the 3 errors I mention in this post happened in another 100 runs. In both sets of runs I’m using 8 blocks and 512 threads. The card is a GT 640.

I have actually tried using cuda-gdb, but I have no idea how to print an array’s contents using its memory address. Also, after finding out that another error happened at a different memory address, I’m not sure where to look.
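One way to inspect the array without a debugger is to copy population back to the host just before the kernel launch and scan it for town ids that cannot index distance. The sketch below is only that, under assumptions: population is taken to hold nInd * nTowns ints (nInd would be 8 * 512 = 4096 if every thread evaluates one individual), valid town ids are taken to be 0 to nTowns - 1, and first_bad_town and check_device_population are hypothetical names.

```cuda
#include <cassert>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Returns the flat position of the first entry outside [0, nTowns),
// or -1 if every entry is a valid town id.
long long first_bad_town(const int *h_pop, size_t count, int nTowns)
{
    assert(nTowns > 0);
    for (size_t i = 0; i < count; i++)
        if (h_pop[i] < 0 || h_pop[i] >= nTowns)
            return static_cast<long long>(i);
    return -1;
}

// Copies the device population to the host and scans it.
// Returns -1 if all entries are valid, -2 if the copy itself failed,
// otherwise the flat position of the first bad entry.
long long check_device_population(const int *d_pop, int nInd, int nTowns)
{
    size_t count = static_cast<size_t>(nInd) * nTowns;
    std::vector<int> h_pop(count);
    cudaError_t err = cudaMemcpy(h_pop.data(), d_pop, count * sizeof(int),
                                 cudaMemcpyDeviceToHost);
    if (err != cudaSuccess) {
        fprintf(stderr, "copy failed: %s\n", cudaGetErrorString(err));
        return -2;
    }
    long long bad = first_bad_town(h_pop.data(), count, nTowns);
    if (bad >= 0)
        printf("individual %lld, position %lld holds town id %d\n",
               bad / nTowns, bad % nTowns, h_pop[bad]);
    return bad;
}
```

Calling check_device_population(population, 4096, nTowns) right before each evaluate launch would show whether a bad town id is already present on a failing run. Since every reported fault comes from block 7, the last of the 8 blocks, the individuals at the end of the array might be worth checking first, though that is only a guess from the reports above.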

BulatZiganshin · May 24, 2016, 11:03am

.
