New 20.7 version: where is the detailed list of bug fixes?

Hello.

I have switched from PGI to the NVIDIA HPC compiler, and tested the first beta,
nv-hpc/20.5, just after its release.

The new official version, nv-hpc/20.7, has been released, but on the documentation page
I don’t see where the detailed list of bug fixes coming with this new version is.

On PGI, the Release Notes point to

For example, here are some problems I found with the beta version:

  • Nsight (nsys-ui) doesn’t give information on OpenACC constructs (some library is not found):

OpenACC injection initialization failed. Is the PGI runtime version greater than 15.7?

  • nvfortran warns when compiling with -acc=multicore,gpu, but the code apparently works fine (and this was OK with the PGI version):

nvfortran-Warning-Code generation for OpenACC on device and multicore in the same binary is not currently supported


For the NVIDIA HPC SDK, we will not be publishing a complete list of bugs fixed in the release notes, but individuals who have reported an issue will still be notified once a fix has become available. Future release notes will include general user visible changes.

Hello Mat.

Have you reported somewhere the two problems I mentioned in my post, which are also present in the 20.7 version?

-> nsys not reporting OpenACC directives
-> multicore not allowed with gpu -acc=multicore,gpu


Hi Juan,

Apologies. I missed the other questions.

What library isn’t being found? If it is the device-side profiler library, you may need to add the directory where it’s found to your LD_LIBRARY_PATH environment variable. With 20.7, it can be found in “/Linux_x86_64/20.7/profilers/Nsight_Systems/target-linux-x64/”
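For anyone who launches the profiler from a script, the LD_LIBRARY_PATH suggestion above could be sketched in Python roughly as follows. This is only an illustration: `env_with_library_dir` is a hypothetical helper, and the directory passed to it must be the real profiler directory under your own install root, which is not spelled out here.

```python
import os


def env_with_library_dir(lib_dir, base_env=None):
    """Return a copy of the environment with lib_dir first in LD_LIBRARY_PATH.

    lib_dir is whatever directory holds the profiler libraries on your
    system (an assumption here; adjust it to your own installation).
    """
    env = dict(os.environ if base_env is None else base_env)
    current = env.get("LD_LIBRARY_PATH", "")
    # Drop empty entries so we never produce leading or doubled separators.
    parts = [p for p in current.split(os.pathsep) if p]
    if lib_dir not in parts:
        parts.insert(0, lib_dir)
    env["LD_LIBRARY_PATH"] = os.pathsep.join(parts)
    return env
```

The returned dictionary can be passed as the `env` argument of `subprocess.run` when starting the profiler, which leaves the calling shell's environment untouched.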

Also, Nsight Systems does not trace OpenACC by default, so you need to include the “openacc” option in the trace flag (--trace, or -t) of the “profile” command.

% nsys --help profile
        -t, --trace=
           Possible values are 'cuda', 'nvtx', 'osrt', 'cublas', 'cudnn', 'opengl', 'mpi', 'openacc', 'openmp', 'vulkan' or 'none'.
           Select the API(s) to trace. Multiple APIs can be selected, separated by commas only (no spaces).
           If 'none' is selected, no APIs are traced.
           Default is 'cuda,opengl,nvtx,osrt'. Application scope.
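As a rough sketch of the "commas only, no spaces" rule in the help text above, the following hypothetical Python helper assembles an `nsys profile` argument list. The set of API names is copied from the help output quoted here for this release; other Nsight Systems versions may accept a different list, and `build_profile_command` is not part of any NVIDIA tool.

```python
# API names copied from the `nsys --help profile` output quoted in this thread.
KNOWN_TRACE_APIS = {
    "cuda", "nvtx", "osrt", "cublas", "cudnn", "opengl",
    "mpi", "openacc", "openmp", "vulkan", "none",
}


def build_profile_command(apis, app_args):
    """Build an argv list for `nsys profile` with a comma-joined trace value."""
    if not apis:
        raise ValueError("at least one API name is required")
    unknown = [a for a in apis if a not in KNOWN_TRACE_APIS]
    if unknown:
        raise ValueError("unknown trace API(s): " + ", ".join(unknown))
    # Join with commas only; the help text says spaces are not allowed.
    trace_value = ",".join(apis)
    return ["nsys", "profile", "--trace=" + trace_value, *app_args]
```

For example, `build_profile_command(["cuda", "openacc"], ["./a.out"])` yields a list that can be handed to `subprocess.run`, which avoids shell quoting issues.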

nvfortran warns about compiling for -acc=multicore,gpu, but the code is working fine apparently (and this was OK with the PGI version):

It’s just a warning, and the behavior is the same as before. We’ve actually never fully supported combining the two. They do work together for most codes, but we know of a few problems that we’ve not been able to resolve. The message is just asserting that we may not be able to fix an issue when the combination doesn’t work (i.e. it is not officially supported).

Engineering decided to make this more explicit by adding the warning. Though, you’re right that the text of the message makes it seem like they can’t be used together, and we’re thinking of ways to change the message to make this clearer. Apologies that the message is confusing.


Hello Mat.

-> OK for the -acc=multicore,gpu warning.

-> For the OpenACC profiling, I use the GUI (nsys-ui) directly, and in it I enabled the reporting of OpenACC directives in the right place.

The warning

OpenACC injection initialization failed. Is the PGI runtime version greater than 15.7?

is present in the nsys-ui “Diagnostic Summary” tab.

-> If I run the code via nvprof, there is no problem: I get the OpenACC information.

mpirun -np 1 nvprof --print-openacc-summary /…/MESONH

==4728== Profiling application: /…/MESONH
==4728== Profiling result:
Type Time(%) Time Calls Avg Min Max Name
OpenACC (excl): 15.89% 969.56ms 3 323.19ms 1.0540us 522.03ms acc_device_init
2.91% 177.82ms 11 16.166ms 15.414ms 17.068ms acc_compute_construct@spll_resolved_cloud.f90:480

-> Through nsys, the command says it will perform OpenACC profiling:

mpirun -np 1 nsys profile -t openacc /…/MESONH

WARNING: OpenACC, cuDNN and cuBLAS rely on CUDA. CUDA tracing has been automatically enabled.
Collecting data…

But again, looking inside report.qdrep with nsys-ui shows the same warning message in the Diagnostic Summary.
CUDA traces are present, but not OpenACC ones.

-> For the , I have different versions present

find 20.7 -name ‘’

But the ones in the nsys directory are named with a version number appended.

ls -1 20.7/profilers/Nsight_Systems/target-linux-x64/*

So perhaps these are not used due to the version extension?


I’ve not encountered this before myself, so I don’t know. I have sent a note to our profiler team for ideas.

Word back from the profiler team is that this is an issue in the Nsight Systems 2020.3 release, the one that shipped with 20.7, which can cause tracing to fail. The issue should be addressed in the new 2020.4 release.

I’ve alerted folks here on the HPC Compiler team about the issue, but we’re not sure we can get the updated 2020.4 version integrated into our 20.9.

In the meantime, can you try downloading and installing the 2020.4 release separately to see if that fixes your issue?

Hello Mat.

-> I’m logged in on the developer site, but the download page for the nsight-systems run installer doesn’t work

-> File not found

The rpm version downloads correctly, but I prefer not to use it.


Thank you for letting us know, I am working on getting that fixed now.

Hello Mat.

I downloaded and installed nsight-systems-2020-4.

It’s now working correctly: whether run from nsys or nsys-ui, it collects OpenACC information,
and this information can be seen in nsys-ui.

But…

From the nsys CLI, it is impossible to get statistical information for OpenACC directly.

As I read/understand the documentation, the equivalent of

nvprof a.out

is more or less

nsys profile -t openacc --stats=true a.out

But on the console, this command never gives OpenACC stats as nvprof does!
The documentation says that it is possible to write new statistics scripts (in Python), but I think at least one script for OpenACC should be packaged with the distribution.
Note: for example, -t mpi gives stats on MPI calls!


I think there are actually two asks here for Nsight Systems.

One is that there should be a default script to give OpenACC statistics. I can see that that would be useful to our OpenACC customers and people using OpenACC as their first push to parallelism.

The second is that you would like the --stats command to be clever enough to change its reports to match what the user is specifically tracing. Right now the --stats command generates a fixed set of statistics regardless of what the user is doing, and you have to use nsys stats as a post-processing step for other statistics.

I will file both of these requests in the system. I’m hopeful we can get the first one by the end of the year. Meanwhile, you can use the SQLite database directly to get the data.
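As a starting point for the SQLite route mentioned above, here is a minimal Python sketch. It assumes the report has already been exported to a SQLite file, and it assumes (without checking against the Nsight Systems schema documentation) that OpenACC data lives in tables whose names contain "OPENACC". The function name `openacc_table_counts` is hypothetical. It only counts rows per matching table; real timing statistics would need the column names from your own export.

```python
import sqlite3


def openacc_table_counts(db_path):
    """Return {table_name: row_count} for tables whose name contains 'OPENACC'.

    The name pattern is an assumption about how the export is laid out;
    inspect sqlite_master in your own file to confirm it.
    """
    conn = sqlite3.connect(db_path)
    try:
        names = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name LIKE '%OPENACC%' "
            "ORDER BY name"
        )]
        counts = {}
        for name in names:
            # Names come from sqlite_master; quote them as SQL identifiers.
            quoted = '"' + name.replace('"', '""') + '"'
            counts[name] = conn.execute(
                "SELECT COUNT(*) FROM " + quoted
            ).fetchone()[0]
        return counts
    finally:
        conn.close()
```

An empty result means no matching tables were found, which would be consistent with OpenACC tracing not having been collected in that report.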