Cpp2 TERMINATED by signal 11

Hello,

I am trying to compile the following code with nvc++ on Ubuntu 18.04.6 LTS with a Tesla T4:

#include <algorithm>
#include <execution>
#include <numeric>
#include <random>
#include <vector>

int main() {
  std::vector<int> indexes(1000);
  std::transform_reduce(std::execution::par, indexes.begin(), indexes.end(), float(), std::plus<float>(), [](const auto i) {
    std::random_device random_device;
    std::mt19937 generator(random_device());
    std::normal_distribution<float> gaussian(0, 1);
    return gaussian(generator);
  });
}

However it is failing with signal 11 like so:

$ export CUDA_HOME=/opt/nvidia/hpc_sdk/Linux_x86_64/21.9/cuda/11.4
$ nvc++ --version

nvc++ 21.9-0 64-bit target on x86-64 Linux -tp skylake 
NVIDIA Compilers and Tools
Copyright (c) 2021, NVIDIA CORPORATION & AFFILIATES.  All rights reserved.
$ nvc++ -stdpar=gpu -fast -O3 -DNDEBUG --c++17 main.cpp
nvc++-Fatal-/opt/nvidia/hpc_sdk/Linux_x86_64/21.9/compilers/bin/tools/cpp2 TERMINATED by signal 11
Arguments to /opt/nvidia/hpc_sdk/Linux_x86_64/21.9/compilers/bin/tools/cpp2
/opt/nvidia/hpc_sdk/Linux_x86_64/21.9/compilers/bin/tools/cpp2 main.cpp -opt 3 -terse 1 -inform warn -x 119 0xa10000 -x 122 0x40 -x 123 0x1000 -x 127 4 -x 127 17 -x 19 0x400000 -x 28 0x40000 -x 120 0x10000000 -x 70 0x8000 -x 122 1 -x 125 0x20000 -quad -vect 56 -y 34 16 -x 34 0x8 -x 32 25952256 -y 19 8 -y 35 0 -x 42 0x30 -x 39 0x40 -x 199 10 -x 39 0x80 -x 59 4 -tp skylake -x 120 0x1000 -astype 0 -x 121 1 -fn main.cpp -il /tmp/nvc++QdCd_K5nT2O6.il -x 117 0x200 -x 123 0x80000000 -x 123 4 -x 119 0x20 -def __pgnu_vsn=70500 -x 70 0x40000000 -x 183 4 -x 121 0x800 -x 6 0x20000 -autoinl 10 -x 168 400 -x 174 128000 -x 14 0x200000 -x 14 0x400000 -autoinl 10 -x 168 400 -x 174 128000 -x 14 0x200000 -x 14 0x400000 -autoinl 10 -x 168 400 -x 174 128000 -x 14 0x200000 -x 14 0x400000 -autoinl 10 -x 168 400 -x 174 128000 -x 14 0x200000 -x 14 0x400000 -x 249 110 -x 120 0x200000 -x 70 0x40000000 -x 8 0x40000000 -x 164 0x800000 -x 85 0x2000 -x 85 0x4000 -x 34 0x40000000 -x 53 0x800000 -x 15 0x4 -x 206 0x02 -x 68 0x1 -x 39 4 -x 56 0x10 -x 26 0x10 -x 26 1 -x 56 0x4000 -accel tesla -x 180 0x4000400 -x 121 0xc00 -x 186 0x80 -x 176 0x100 -cudacap 75 -x 163 0x1 -x 186 0x80000 -cudaver 11040 -x 194 0x40000 -cudaroot /opt/nvidia/hpc_sdk/Linux_x86_64/21.9/cuda/11.4 -x 180 0x4000400 -x 121 0xc00 -y 210 8 -x 189 0x8000 -y 163 0xc0000000 -x 201 0xf0000000 -x 189 0x10 -y 189 0x4000000 -cudaroot /opt/nvidia/hpc_sdk/Linux_x86_64/21.9/cuda/11.4 -x 187 0x40000 -x 187 0x8000000 -x 9 1 -x 42 0x14200000 -x 72 0x1 -x 136 0x11 -x 37 0x480000 -x 9 1 -x 42 0x14200000 -x 72 0x1 -x 136 0x11 -x 37 0x480000 -x 194 0x20000000 -x 198 0x100 -x 9 1 -x 42 0x14200000 -x 72 0x1 -x 136 0x11 -x 37 0x480000 -x 129 2 -quad -x 119 0x10000000 -x 129 0x40000000 -x 129 2 -quad -x 119 0x10000000 -x 129 0x40000000 -x 56 0x2 -x 9 1 -x 42 0x14200000 -x 72 0x1 -x 136 0x11 -x 37 0x480000 -gnuvsn 70500 -x 69 0x200 -x 123 0x400 -x 137 1 -cmdline '+nvc++ /tmp/nvc++QdCd_K5nT2O6.il -stdpar=gpu -fast -Mvect=simd -Mflushz -Mcache_align -O3 
-Mvect=simd -Mflushz -Mcache_align -Mrecip-div -Mfactorize -DNDEBUG --c++17' -asm /tmp/nvc++kdCdEY3VB1J0.ll

I saw another topic that suggested that the library is not fully supported with nvc++ yet - is that the reason for this segfault?

If so, is there another way that I can use the library with nvc++? I’d rather stick with standard C++ for my current use case and not use CUDA if possible.

That’s correct: I don’t believe we’ve added support for random number generation in device code. However, the compiler shouldn’t segv, so I filed a problem report, TPR #30999, and sent it to engineering for further investigation.

Note that the segv only occurs at -O2 and above. At lower optimization levels, the compiler gives an error instead:

% nvc++ test.cpp -stdpar -V21.9 -O0 --c++17
NVC++-F-0155-Compiler failed to translate accelerator region (see -Minfo messages): Unsupported statement operator (test.cpp: 1)
NVC++/x86-64 Linux 21.9-0: compilation aborted

-Mat

In 22.5, our engineers fixed the compiler segv. The problem was the mishandling of “long double”, which the random number generator uses and which isn’t supported in device code. The compiler now gives the following error message instead:

% nvc++ test.cpp -stdpar -V22.5 -O2 --c++17
NVC++-S-1207-Long double in GPU code is unsupported (test.cpp: 1791)
NVC++-F-0155-Compiler failed to translate accelerator region (see -Minfo messages): Long double in GPU code is unsupported (test.cpp: 1)
NVC++/x86-64 Linux 22.5-0: compilation aborted

-Mat

Hi Mat, thanks for following up on this.

Does this mean that if the random number generator implementation didn’t use long double then it might work on device? Or is there some other reason why the <random> header isn’t yet ready for stdpar?

On a separate note, it would be nice if nvc++ provided clearer error messages when an unsupported standard library type is used in device code. It’s not clear to me that the “Long double in GPU code is unsupported” message is caused by an included standard library header, and it masks the real issue, which is that <random> is not yet supported on device. In this situation I think it would help if nvc++ emitted an error message like “<header> is not supported on device” instead.

Lastly, is there any update on when <random> will be available on device? I’m still very interested in having this available with stdpar and would be glad to help test it out if possible.

Sorry, I don’t know, but I suspect there’s more to it. In my experience with other RNGs, the need to maintain state makes them tricky to implement in a parallel context.

Lastly, is there any update on when <random> will be available on device?

I asked one of our engineers about this. Our long-term goal is to provide our own complete, device-capable STL called libnv++. It’s an ambitious project, though, so it’s taking much longer to implement than hoped. It’s still a goal, but the result is likely to differ from what Bryce originally proposed in his GTC talk. Priority-wise, <random> would be low on the list.

I don’t know if this would be useful, but I worked with another gentleman on an RNG for use with OpenACC: https://www.openacc.org/blog/pseudo-random-number-generation-lightweight-threads

The implementation is agnostic to the parallelization method, so it might work here as well. Granted, I haven’t tried it, so I can’t say for sure.
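
To make the "no shared state" idea concrete, here is a minimal sketch of a counter-based sampler. It is not the generator from the linked post and has not been tested with nvc++ on a GPU; the names mix64, uniform01, normal_from_index and sum_of_normals are made up for illustration. Each sample is computed purely from a seed and the element index (a SplitMix64-style hash followed by a Box-Muller transform), using only float arithmetic so that no long double is involved. The reduction uses the sequential overload of std::transform_reduce so the sketch builds with any C++17 compiler; passing std::execution::par as the first argument is the assumed way to parallelize it under -stdpar.

```cpp
#include <cmath>
#include <cstdint>
#include <functional>
#include <numeric>
#include <vector>

// SplitMix64-style mixing step: scrambles a 64-bit counter into a 64-bit hash.
// It keeps no state between calls, so every element can evaluate it independently.
inline std::uint64_t mix64(std::uint64_t x) {
  x += 0x9E3779B97F4A7C15ULL;
  x = (x ^ (x >> 30)) * 0xBF58476D1CE4E5B9ULL;
  x = (x ^ (x >> 27)) * 0x94D049BB133111EBULL;
  return x ^ (x >> 31);
}

// Uniform float in (0, 1], built from the top 24 bits of the hash.
inline float uniform01(std::uint64_t bits) {
  return (static_cast<float>(bits >> 40) + 1.0f) * (1.0f / 16777216.0f);
}

// Standard-normal sample for element i via Box-Muller (hypothetical helper).
inline float normal_from_index(std::uint64_t seed, std::uint64_t i) {
  const float u1 = uniform01(mix64(seed ^ mix64(2 * i)));      // > 0, so log(u1) is finite
  const float u2 = uniform01(mix64(seed ^ mix64(2 * i + 1)));
  const float two_pi = 6.28318530718f;
  return std::sqrt(-2.0f * std::log(u1)) * std::cos(two_pi * u2);
}

// Sum of one normal sample per index. Assumption: adding std::execution::par
// as the first argument is what would offload this under -stdpar.
inline float sum_of_normals(const std::vector<std::uint64_t>& indexes,
                            std::uint64_t seed) {
  return std::transform_reduce(
      indexes.begin(), indexes.end(), 0.0f, std::plus<float>(),
      [seed](std::uint64_t i) { return normal_from_index(seed, i); });
}
```

Unlike the original reproducer, the indexes have to be distinct here (for example filled with std::iota), since the index is the only thing that distinguishes one sample from the next. A hash like this is a convenience, not a statistically vetted generator.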

On a separate note, it would be nice if nvc++ provided clearer error messages when an unsupported standard library type is being used in device code.

I’ll ask, but it might not be easy to detect. These are templates, so by the time the compiler sees the code, it’s difficult to know that it’s coming from <random>.

-Mat

NVC++-S-1207-Long double in GPU code is unsupported (test.cpp: 1791)

By the way, does the 1791 in the output refer to a line number? If so, is there any way for me to see what is on that line?

I tried compiling with -E and -P to see the source file after preprocessing, but there’s nothing on line 1791.

Alternatively, is there a way for nvc++ to print the line in question, as g++ and clang++ do for compilation warnings and errors?

It is a line number, but it’s the line number within the included header file. Since it’s all templated, it might be hard to find.

I’d use “-P” so all the header file info is included in the preprocessed file, and then find the new line number. Something like:

% nvc++ test.cpp -stdpar -V22.5 -O2 --c++17 -P
% nvc++ test.i -stdpar -V22.5 -O2 --c++17 --no_preincludes -w
NVC++-S-1207-Long double in GPU code is unsupported (test.i: 295649)
NVC++-F-0155-Compiler failed to translate accelerator region (see -Minfo messages): Long double in GPU code is unsupported (test.i: 1)
NVC++/x86-64 Linux 22.5-0: compilation aborted

I’d use “-P” so all the header file info is included in the preprocessed file, and then find the new line number

That’s a good idea, thanks!

For people who don’t know the -P trick, I still think it would be helpful if nvc++ printed out the full source code line in question, or at least the name of the header file being included in addition to the line number. Would you be able to raise this request to the engineering team for me?

I can ask, though no guarantees.
