I am stuck with a “No device symbol for address reference” message with nvfortran from nvhpc/22.3. I am trying to offload a small piece of code, letting the compiler do its default work as a first step:
el_grp => grid%el_grps(n)%ptr
pair2node1_val => el_grp%pair2node1%val
pair2node2_val => el_grp%pair2node2%val
r1_ptr_val => data_ptr%r1_ptrs(n)%ptr%val
sym_op_val => sym_op_ptr%r1_ptrs(n)%ptr%val
prod_r1_val => product_ptr%r1_ptrs(n)%ptr%val
prod_r1_val(1:el_grp%nnode) = 0.0_WP
niter = el_grp%npair
!$omp target
!$omp parallel do
do ip = 1, niter
  ino1 = pair2node1_val(ip)
  ino2 = pair2node2_val(ip)
  coeff = sym_op_val(ip)*(r1_ptr_val(ino2)-r1_ptr_val(ino1))
  prod_r1_val(ino1) = prod_r1_val(ino1) + coeff
  prod_r1_val(ino2) = prod_r1_val(ino2) - coeff
end do
!$omp end parallel do
!$omp end target
The val components are one-dimensional double precision or integer allocatable arrays in user-defined types.
-Minfo shows the data movements (with tofrom and the correct shape):
Generating implicit map(tofrom:pair2node1_val(:),sym_op_val(:),r1_ptr_val(:),prod_r1_val(:),pair2node2_val(:))
But compilation aborts with:
NVFORTRAN-W-0155-Compiler failed to translate accelerator region (see -Minfo messages): No device symbol for address reference
And I do not know where to start tracking this problem down. The compile command is:
/opt/nvidia/hpc_sdk/Linux_x86_64/22.3/comm_libs/mpi/bin/mpifort -c -O1 -mp=gpu -gpu=cc80 -target=gpu -Minfo=accel ....
Thanks for any suggestion.
The error means that the compiler can’t find a device symbol for one or more of the pointers, though I don’t know exactly what’s causing it. Can you please provide a reproducing example so I can investigate?
I will try to create a simplified user-defined type involving only the components used there.
Is there a way to know which device symbol is not found?
I’ve worked on this problem, simplifying the code more and more until it does nothing interesting but still shows the No device symbol message. The short test case is attached:
defs_m.f90 (977 Bytes)
linear_solver_mat_op_m.f90 (1.6 KB)
Makefile (684 Bytes)
It doesn’t build an executable (no main program is provided).
A minimal offloaded kernel is implemented in a module in linear_solver_mat_op_m.f90. This module uses definitions from another module implemented in defs_m.f90 (they are needed in the real code, but not here).
If in the defs_m.f90 module file I remove line 20:
20 !$OMP THREADPRIVATE(nsolver,current_solver,debug_level,dummy_int,dummy_real)
compilation is successful. If the line is present, I get the No device symbol error, even though in this case the variables in the threadprivate directive are not used (in the real code, which mixes MPI and OpenMP, I need them).
Any idea about this?
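For readers without the attachments, the pattern described above can be sketched roughly as follows. This is a hypothetical reconstruction, not the attached files: the module names follow the file names and the variable names come from the THREADPRIVATE line quoted above, but the variable types, the WP kind, the subroutine name fill and its arguments are assumptions made for illustration. It is not guaranteed to trigger the error with every compiler version.

```fortran
! Hypothetical sketch of the reported pattern; NOT the attached test case.
module defs_m
  implicit none
  integer, parameter :: WP = kind(1.0d0)   ! assumed working precision
  ! Names taken from the directive quoted in the post; types are assumed.
  integer  :: nsolver = 0, current_solver = 0, debug_level = 0, dummy_int = 0
  real(WP) :: dummy_real = 0.0_WP
  !$OMP THREADPRIVATE(nsolver,current_solver,debug_level,dummy_int,dummy_real)
end module defs_m

module linear_solver_mat_op_m
  use defs_m          ! brings the threadprivate variables into scope
  implicit none
contains
  ! "fill" is an illustrative name; the kernel never touches the
  ! threadprivate variables, it only sees them through the use statement.
  subroutine fill(coeff, niter)
    integer,  intent(in)    :: niter
    real(WP), intent(inout) :: coeff(niter)
    integer :: ip
    !$omp target map(tofrom:coeff)
    !$omp parallel do
    do ip = 1, niter
      coeff(ip) = ip
    end do
    !$omp end parallel do
    !$omp end target
  end subroutine fill
end module linear_solver_mat_op_m
```

Compiling both modules with the flags from the first post (-mp=gpu -Minfo=accel) and then again with the THREADPRIVATE line commented out is the comparison described above.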
Thanks Patrick, this is helpful and I can reproduce the error.
I suspect what’s going on is when the compiler outlines the target region (outlining basically creates a function that’s then passed to the runtime), it’s also bringing over the module variables. Because of the outlining, it doesn’t know if they are used or not. But since these are threadprivate, the actual reference in the module is different than the one used at runtime.
The workaround is to use the “loop” construct, which doesn’t outline:
!$omp target teams map(tofrom:coeff)
!$omp loop
do ip = 1, niter
  coeff(ip) = ip
end do
!$omp end target teams
I filed TPR #31776 and sent it to engineering for review.
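To try the workaround end to end, a small self-contained program along these lines may help. It is only a sketch under assumptions: the program name, the array size n, the WP kind and the reduced THREADPRIVATE list are all made up for illustration, and whether it builds cleanly depends on the compiler version in use.

```fortran
! Sketch of the "loop" workaround in a complete program (names are assumed).
module defs_m
  implicit none
  integer, parameter :: WP = kind(1.0d0)   ! assumed working precision
  integer :: nsolver = 0                    ! stand-in for the real module data
  !$OMP THREADPRIVATE(nsolver)
end module defs_m

program loop_workaround
  use defs_m
  implicit none
  integer, parameter :: n = 1000            ! arbitrary problem size
  real(WP) :: coeff(n)
  integer  :: ip

  coeff = 0.0_WP

  ! "target teams" followed by "loop", as suggested in the reply above.
  !$omp target teams map(tofrom:coeff)
  !$omp loop
  do ip = 1, n
    coeff(ip) = ip
  end do
  !$omp end target teams

  ! Check on the host that every element was written.
  if (all(coeff == [(real(ip, WP), ip = 1, n)])) then
    print *, 'ok'
  else
    print *, 'mismatch'
  end if
end program loop_workaround
```

Built with nvfortran -mp=gpu -Minfo=accel, the -Minfo output should show whether a device kernel was generated for the loop; built without OpenMP, the directives are treated as comments and the loop simply runs on the host.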