Thanks for opening the issue report – much appreciated!
Changing the build system turned out to be very easy. When I build with static instead of dynamic linking, this bug goes away, so I think I can confirm that the shared object issue was indeed the cause of our problem. While this is a good workaround for now, we will need to compile as a shared object for multiphysics simulations (with Nek5000/NekRS) down the line.
Anyway, while static linkage got around that bug, I ran into another one later in the program’s run. With the original dynamic linkage, the code gave an error immediately at program launch:
./openmc --event
call to cuModuleGetGlobal returned error 500: Not found
However, there is a different bug now with static linkage. The program runs through the initialization and file I/O routines cleanly, but when it reaches the first big kernel, it gives the following error:
./openmc --event
...
...
(OpenMC initialization, file I/O, etc)
...
...
Fatal error: expression 'HX_CU_CALL_CHECK(p_cuStreamSynchronize(stream[dev]))' (value 1) is not equal to expression 'HX_SUCCESS' (value 0)
Aborted
I tried setting the debug environment variable, but the messages didn’t point to anything obvious to me. Perhaps you can help me decode the debug messages? Below are the last 10-20 lines or so:
pgi_uacc_upstart( file=/home/<...>/openmc_offload/openmc/include/openmc/shared_array.h, function=_ZN6openmc27process_calculate_xs_eventsERNS_11SharedArrayINS_14EventQueueItemEEE, line=113:149, line=147, devid=1 )
pgi_uacc_dataupa(devptr=0x1,hostptr=0xffc3e0,stride=1,size=1,extent=-1,eltsize=24,lineno=-147,name=_in_44954,flags=0x40020400=copyin+dynamic+openmp,async=-1,threadid=1)
pgi_uacc_dataupx(devptr=0x7fcf27e04000,hostptr=0xffc3e0,stride=1,size=1,extent=-1,eltsize=24,lineno=147,name=_in_44954,async=-1,threadid=1)
pgi_uacc_cuda_dataup1(devdst=0x7fcf27e04000,hostsrc=0xffc3e0,offset=0,stride=1,size=1,eltsize=24,lineno=147,name=_in_44954,threadid=1)
pgi_uacc_updone( devid=1 )
pgi_uacc_cuda_wait(lineno=-99,async=-1,dindex=1,threadid=1)
pgi_uacc_cuda_wait(sync on stream=0x21267a0,threadid=1)
pgi_uacc_cuda_wait done (threadid=1)
pgi_uacc_upstart( file=/home/<...>/openmc_offload/openmc/include/openmc/shared_array.h, function=_ZN6openmc27process_calculate_xs_eventsERNS_11SharedArrayINS_14EventQueueItemEEE, line=113:149, line=147, devid=1 )
pgi_uacc_dataupa(devptr=0x1,hostptr=0xffc3c0,stride=1,size=1,extent=-1,eltsize=24,lineno=-147,name=_in_44966,flags=0x40020400=copyin+dynamic+openmp,async=-1,threadid=1)
pgi_uacc_dataupx(devptr=0x7fcf27e03f00,hostptr=0xffc3c0,stride=1,size=1,extent=-1,eltsize=24,lineno=147,name=_in_44966,async=-1,threadid=1)
pgi_uacc_cuda_dataup1(devdst=0x7fcf27e03f00,hostsrc=0xffc3c0,offset=0,stride=1,size=1,eltsize=24,lineno=147,name=_in_44966,threadid=1)
pgi_uacc_updone( devid=1 )
pgi_uacc_cuda_wait(lineno=-99,async=-1,dindex=1,threadid=1)
pgi_uacc_cuda_wait(sync on stream=0x21267a0,threadid=1)
pgi_uacc_cuda_wait done (threadid=1)
pgi_uacc_get_device_num(devtype=4,threadid=1)
pgi_uacc_get_device_num(devtype=4,threadid=1) cuda devid=1 dindex=1 devnum=0
Fatal error: expression 'HX_CU_CALL_CHECK(p_cuStreamSynchronize(stream[dev]))' (value 1) is not equal to expression 'HX_SUCCESS' (value 0)
Aborted
Or, if you have any other ideas on what might be causing this second issue, please let me know. Thanks for all your help!
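In case it makes the trace above easier to skim, here is a rough Python sketch that splits one of the `pgi_uacc_*` lines into its call name and `key=value` fields. The helper name `parse_trace_line` is hypothetical and not part of OpenMC or the NVIDIA runtime. The byte count assumes that `size` is a number of elements and `eltsize` is bytes per element, which is my reading of the log and not something I have confirmed.

```python
import re

# Hypothetical helper for reading the trace; not part of OpenMC or the runtime.
# Assumes a trace line has the shape: pgi_uacc_name( key=value, key=value, ... )
TRACE_RE = re.compile(r"^(pgi_uacc_\w+)\((.*)\)\s*$")


def parse_trace_line(line):
    """Return (call_name, fields) for a trace line, or None if it does not fit."""
    m = TRACE_RE.match(line.strip())
    if m is None:
        return None
    name, body = m.groups()
    fields = {}
    for part in body.split(","):
        # Split on the first '=' only, so values like
        # 0x40020400=copyin+dynamic+openmp stay in one piece.
        key, sep, value = part.strip().partition("=")
        if sep:
            fields[key] = value
    return name, fields


# One of the host-to-device copy lines from the trace above.
line = ("pgi_uacc_cuda_dataup1(devdst=0x7fcf27e04000,hostsrc=0xffc3e0,offset=0,"
        "stride=1,size=1,eltsize=24,lineno=147,name=_in_44954,threadid=1)")
name, fields = parse_trace_line(line)

# Assumption: size counts elements and eltsize is bytes per element.
nbytes = int(fields["size"]) * int(fields["eltsize"])
print(name, fields["hostsrc"], "->", fields["devdst"], nbytes, "bytes")
```

Under that assumption, both uploads just before the failure are single 24-byte copies at line 147, and the abort comes right after the second one completes.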