Illegal address error when passing vector by reference

pavelp · April 3, 2019, 7:04pm

I’m getting the following error when I run a simple C++ program with an OpenACC pragma: “call to cuStreamSynchronize returned error 700: Illegal address during kernel execution.”

Here’s the full program:

#include <vector>

void processArray(std::vector<int> &arr) {
  int size = arr.size();
  #pragma acc parallel loop
  for (int i = 0; i < size; i++) {
    arr[i] = i;
  }
}

int main(void) {
  std::vector<int> arr(1);
  processArray(arr);
  return 0;
}

However, there’s no error if array processing happens inside “main”:

#include <vector>

int main(void) {
  std::vector<int> arr(1);

  int size = arr.size();
  #pragma acc parallel loop
  for (int i = 0; i < size; i++) {
    arr[i] = i;
  }

  return 0;
}

Could someone please help me understand what the issue with the former example is?

Compilation command:

pgc++ -acc -ta=tesla:managed main.cpp -o main

GPU: GeForce GTX 1060 6GB
Compiler: pgc++ 18.10-1 64-bit target on x86-64 Linux -tp zen
OS: Ubuntu Linux 18.04.2 LTS (Bionic Beaver)

MatColgrove · April 3, 2019, 8:37pm

Hi pavelp,

CUDA Unified Memory only manages dynamically allocated (heap) data, not static or stack data. So while the buffer holding arr’s elements will be managed, the arr object itself isn’t.

In the second, working example, the compiler implicitly copies arr to the device for you. In the first example, however, arr is a reference, so the compiler won’t be able to implicitly copy it. The fix is to explicitly copy in arr with a copyin clause.

% setenv PGI_ACC_TIME 1
% cat test1.cpp
#include <vector>

void processArray(std::vector<int> &arr) {
  int size = arr.size();
  #pragma acc parallel loop copyin(arr)
  for (int i = 0; i < size; i++) {
    arr[i] = i;
  }
}

int main(void) {
  std::vector<int> arr(1);
  processArray(arr);
  return 0;
}
% pgc++ -ta=tesla:managed -Minfo=accel test1.cpp; a.out
processArray(std::vector<int, std::allocator<int>> &):
      5, Generating copyin(arr[:])
         Generating Tesla code
          6, #pragma acc loop gang, vector(128) /* blockIdx.x threadIdx.x */
std::vector<int, std::allocator<int>>::operator [](unsigned long):
      1, include "vector"
          57, include "vector"
                7, include "stl_vector.h"
                   771, Generating implicit acc routine seq
                        Generating acc routine seq
                        Generating Tesla code

Accelerator Kernel Timing data
/local/home/colgrove/test1.cpp
  _Z12processArrayRSt6vectorIiSaIiEE  NVIDIA  devicenum=0
    time(us): 786
    5: compute region reached 1 time
        5: kernel launched 1 time
            grid: [1]  block: [128]
             device time(us): total=765 max=765 min=765 avg=765
            elapsed time(us): total=832 max=832 min=832 avg=832
    5: data region reached 2 times
        5: data copyin transfers: 1
             device time(us): total=21 max=21 min=21 avg=21

Hope this helps,
Mat
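
For anyone landing here later, another way to sidestep the problem is sketched below: read the raw pointer from arr.data() on the host and index that inside the loop, so the kernel only touches the vector's heap buffer (which is managed) and never dereferences the vector object itself. This is a minimal sketch that assumes the same -ta=tesla:managed build as in the original post; processArrayRaw and data are names made up for illustration, and it has not been verified against pgc++ 18.10.

```cpp
#include <vector>

// Hypothetical variant of processArray (name chosen for illustration only).
// Assumes the build uses -ta=tesla:managed, so the vector's heap buffer
// lives in CUDA Unified Memory and is reachable from the device.
void processArrayRaw(std::vector<int> &arr) {
  int size = static_cast<int>(arr.size());
  // Read the buffer address on the host; the vector object itself
  // is not referenced inside the compute region.
  int *data = arr.data();
  #pragma acc parallel loop
  for (int i = 0; i < size; i++) {
    data[i] = i;
  }
}
```

Calling processArrayRaw(arr) from main works the same way as the original processArray(arr). If the file is built without OpenACC enabled, the pragma is ignored and the loop simply runs on the host.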
