Enter Data + -O1

I compiled ZFS (RWTH Aachen) with -O0 and everything worked fine.
When I change to -O1, compilation still succeeds, but the run fails when it tries to copy data to the GPU.
Without changing anything, simply running the program again leads to different errors:

launch CUDA kernel  file=(runtime) function=/proj/ta/build/15S/linux86-64/rte/accel-uni/hammer/lib-linux86-64-pic/../src/cuda_fill.c line=31 device=0 num_gangs=1 num_workers=1 vector_length=128 grid=1 block=128
zsh: segmentation fault (core dumped)  $HOME/DG_acc/src/zfs

or, if it gets past that point:

FATAL ERROR: variable in data clause is partially present on the device: name=flux
 file:/home/mn652949/DG_acc/src/zfsdgblock.cpp _ZN10ZFSDgBlockILi3E26ZFSDgSysEqnAcousticPerturbILi3EEE10initGpuMemEv line:4199
zsh: segmentation fault (core dumped)  $HOME/DG_acc/src/zfs

When using pinned memory, stranger things happen. Sometimes the execution fails with:

call to cuMemHostRegister returned error 999: Unknown

or nothing is copied and it fails at the first present clause.

I use PGC++ 15.4.

How can these things happen by just changing from -O0 to -O1?


If you get different results from run to run, something is getting clobbered or the program is reading uninitialized data. On GPUs this can happen if a copy-in and/or copy-out is incorrect. It could also be a compiler bug. If the problem happens early in the run, you may be able to narrow it down by turning on the environment variables that print runtime information about data transfers and kernel launches. See the OpenACC Getting Started Guide, section 3.3.
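
To illustrate the kind of thing to check, here is a minimal sketch of a consistent enter data / present / exit data sequence. It is not taken from ZFS: the function names and the array are hypothetical stand-ins, and only the name flux is borrowed from the error message. The point is that the extent n used in every data clause must be initialized and identical across clauses. An uninitialized or mismatched extent is one plausible way to get a "partially present" error that changes from run to run, though I cannot confirm that this is the cause here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helpers, not the actual ZFS code.

// Create the device copy of flux[0:n]. The extent n must be a valid,
// initialized value; a garbage extent would request a different range
// on every run.
void initGpuMem(double* flux, std::size_t n) {
  assert(flux != nullptr && n > 0);
  #pragma acc enter data copyin(flux[0:n])
}

// Use the data on the device. The present clause names the same range
// that was passed to enter data.
void scaleOnDevice(double* flux, std::size_t n, double factor) {
  #pragma acc parallel loop present(flux[0:n])
  for (std::size_t i = 0; i < n; ++i) {
    flux[i] *= factor;
  }
}

// Copy the results back to the host and free the device copy, again
// with the same range.
void releaseGpuMem(double* flux, std::size_t n) {
  #pragma acc exit data copyout(flux[0:n])
}

// Small driver: fill n values with 1.0, double them, return their sum.
double runDemo(std::size_t n) {
  std::vector<double> storage(n, 1.0);
  double* flux = storage.data();

  initGpuMem(flux, n);
  scaleOnDevice(flux, n, 2.0);
  releaseGpuMem(flux, n);

  double sum = 0.0;
  for (std::size_t i = 0; i < n; ++i) {
    sum += storage[i];
  }
  return sum;
}
```

Calling runDemo(8) from main should return 16.0, whether the pragmas are compiled for the GPU or ignored by a compiler without OpenACC support. With the PGI runtime, running with PGI_ACC_NOTIFY=3 should print each kernel launch and data transfer, which lets you compare the transferred ranges against what you expect. Compiling with -Minfo=accel shows what the compiler generated for each data clause.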