Disabling optimization on specific source files (nvc++)

Hi,

I'm encountering a weird crash (segfault) when running my C++ code built with nvc++ -O3. We have no issues with other compilers (GNU, Intel) and there are no valgrind errors. It is hard to provide you with a reproducer, so I would like to compile the specific files where the crashes occurred (or specific parts of those files) with -O0, because when the whole code is built with nvc++ -O0, it runs fine. I forgot to say that -Mnovect -Mnounroll has no effect either.
Unfortunately, after reading the compiler documentation, I can't find any pragma to disable optimization locally. Am I wrong?
Thanks

We used to document these, but I think because the Fortran sentinels are “!$pgi”, they were removed from the docs when we rebranded to nvhpc. The pragmas and directives are still there, though.

In this case you want the “opt” pragma, which can be applied either to the whole file or to a particular routine, depending on the scope in which it’s used. For example:

#pragma opt 0    // placed at file scope, this applies -O0 to the whole file
void foo(float *Arr, int sze) {
#pragma opt 0    // placed inside a routine, this applies -O0 to that routine only
    for (int i = 0; i < sze; ++i) {
        Arr[i] = i / 10;
    }
}

Hope this helps,
Mat


Thanks Mat for the quick reply!

I didn’t succeed with #pragma opt 0 in my code. For your interest, the crash (arithmetic exception) occurred on the division line below. I solved it by adding a DMINFLOAT value (1.e-30) to the denominator, several lines after where the crash actually happened, to avoid the division by zero on the l variable despite the test on l. It is not the first time I have seen the compiler optimize (vectorize?) the code and omit the conditional test.

Again thanks for your help.

double Maillage_FT_Disc::calcul_normale_3D(int num_facette, double norme[3]) const
{
  double l = -1.;
  int s0 = facettes_(num_facette,0);
  int s1 = facettes_(num_facette,1);
  int s2 = facettes_(num_facette,2);
  double x0 = sommets_(s0,0);
  double y0 = sommets_(s0,1);
  double z0 = sommets_(s0,2);
  double dx1 = sommets_(s1,0) - x0;
  double dy1 = sommets_(s1,1) - y0;
  double dz1 = sommets_(s1,2) - z0;
  double dx2 = sommets_(s2,0) - x0;
  double dy2 = sommets_(s2,1) - y0;
  double dz2 = sommets_(s2,2) - z0;

  // compute the normal: cross product
  norme[0] = dy1 * dz2 - dy2 * dz1;
  norme[1] = dz1 * dx2 - dz2 * dx1;
  norme[2] = dx1 * dy2 - dx2 * dy1;
  l = sqrt(norme[0] * norme[0] + norme[1] * norme[1] + norme[2] * norme[2]);
  if (l != 0.)
    {
#ifdef __NVCOMPILER
      // workaround: nvc++ -O3 may evaluate the division even when l == 0.
      double inv_l = 1. / (l + DMINFLOAT);
#else
      double inv_l = 1. / l;
#endif
      norme[0] *= inv_l;
      norme[1] *= inv_l;
      norme[2] *= inv_l;
    }

  return l*0.5;
}

Hmm, we were also enabling trapping of arithmetic exceptions in our code:

feenableexcept(FE_DIVBYZERO | FE_INVALID | FE_OVERFLOW);

Another workaround for us is to disable that trapping for the nvc++ compiler.
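
As a minimal sketch of that workaround (assuming glibc's feenableexcept() from <fenv.h>, which may require _GNU_SOURCE, and reusing the same __NVCOMPILER macro as in the snippet above; the function name is only for illustration):

#include <fenv.h>   // feenableexcept (glibc extension)

void enable_fp_traps()
{
#ifndef __NVCOMPILER
  // Trap divide-by-zero, invalid operations and overflow with other compilers.
  feenableexcept(FE_DIVBYZERO | FE_INVALID | FE_OVERFLOW);
#endif
  // With nvc++ we leave trapping disabled, since at -O3 the guarded division
  // can be evaluated anyway and raise a spurious FE_DIVBYZERO.
}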

Another workaround for us is to disable that trapping for the nvc++ compiler.

Yes, not enabling trapping will be your best option.

Trap safety is a known issue due to instruction selection by our back-end LLVM code generator. There are ways to force LLVM to be trap-safe, but the resulting performance is very poor, so much so that it’s highly unlikely you’d want to enable it.