OMP offloading crash with nvc

Hello

We have updated our compiler to the NVIDIA HPC SDK version 22.9 and ran into an issue with offloading.

We have a minimal example (see the sketch below) that works fine with SDK 22.7 but not with SDK 22.9.
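
A minimal sketch of what such a reproducer could look like (this is not the original attachment; the body of foo() and the exact dlopen flags are assumptions, reconstructed from the build script, the -ldl flag, and the output shown further down):

// omp_crash.c: the offloaded shared library
#include <stdio.h>

int foo(void)
{
    printf("Running foo()...\n");
    double sum = 0.0;
    /* a simple OpenMP target region, just to exercise GPU offloading */
    #pragma omp target teams distribute parallel for reduction(+:sum)
    for (int i = 0; i < 1000000; i++)
        sum += (double)i;
    printf("DONE!\n");
    return (int)(sum > 0.0);
}

// main.c: the driver that loads the library at run time
#include <dlfcn.h>
#include <stdio.h>

int main(void)
{
    void *handle = dlopen("./omp_crash.so", RTLD_NOW);
    if (!handle) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return 1;
    }
    int (*foo)(void) = (int (*)(void))dlsym(handle, "foo");
    if (!foo) {
        fprintf(stderr, "dlsym failed: %s\n", dlerror());
        return 1;
    }
    foo();
    dlclose(handle);
    return 0;
}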

Cheers

If you add -mp=gpu to your link line, it seems to work:

#!/bin/bash
# Compile and link the shared library with OpenMP GPU offload enabled (-mp=gpu)
nvc -g -fPIC -gpu=pinned -mp=gpu -fast -shared omp_crash.c -lm -o omp_crash.so
# Compile and link the driver that dlopens omp_crash.so
nvc -O3 -g -mp=gpu -Wall main.c -ldl -o main
./main

$ sh run.sh
Running foo()…
DONE!

Hi, unfortunately we don't have control over the process performing the dynamic loading. In our case, it's Python :) so we can't easily do that. I'm guessing we could maybe use LD_PRELOAD, but it gets nightmarish… and I'm not even sure it would work.

Context: this is for the devito project (https://github.com/devitocodes/devito), a code generation framework for automated finite difference computation.

There are three libraries that need to be preloaded:

LD_PRELOAD=libacchost.so:libaccdevaux.so:libaccdevice.so ./main
Running foo()…
DONE!

FWIW, this is a bit painful, for various reasons…

  1. installation / configuration. Fine, we have Dockerfiles, but still…
  2. maintenance. What if those libs change in the next release, or a new one is added?

I would suggest opening a formal bug report.
If you run ldd omp_crash.so, those libraries show up as dependencies, and if you run with
LD_DEBUG=libs ./main
you can see errors coming from pthreads:

191258: ./main: error: symbol lookup error: undefined symbol: pthread_atfork (fatal)
191258: ./main: error: symbol lookup error: undefined symbol: __dyn_pthread_atfork (fatal)

How do I open a bug report?

Here.


It looks like the shared library works as expected from Python (which you indicated was your real use case), at least with this simple example:

$ cat pydriver.py
from ctypes import cdll
# Load the offloading shared library and call its entry point
lib = cdll.LoadLibrary('./omp_crash.so')
lib.foo()

$ python pydriver.py
Running foo()…
DONE!