f90 app barfs with "symbol lookup error" on __mth_

I have an app that runs partway, then bombs with the error:

./foo: symbol lookup error: ./foo: undefined symbol: __mth_i_dsincosx

It is compiled with -Bstatic_pgi -Msave -tp=p7 -fastsse

The compiler version info returns:

% pgf90 -V

pgf90 7.0-5 32-bit target on x86-64 Linux

Can you give me some idea of what it’s doing? I gather this is a PGI vector library function, but I don’t know why it’s going after this in a static build. All pointers welcome.

Pete

Hi Pete,

What flags are you using for linking? The PGI SSE2 libpgsse2.a library, which is where this symbol is found, should be linked in by default when “-tp=p7” is used. However, you might have a flag on the link line, such as -Mnoscalarsse, which would override this.

Also, try adding “-v” (verbose) to your link line to see which libraries are being passed to the linker.

- Mat

The build is done through a makefile, and as near as I can tell it passes the same flags to both the compile and link steps. This is one of those “it worked for me, doesn’t work for someone else” situations, so I’ll investigate further based on this information.

Pete

Nothing obviously wrong with the link here. This is the output from -v:

/usr/bin/ld /usr/lib/crt1.o /usr/lib/crti.o /usr/pgi70/linux86/7.0-5/lib/trace_init.o /usr/lib/gcc/x86_64-redhat-linux/3.4.6/32/crtbegin.o /usr/pgi70/linux86/7.0-5/lib/f90main.o -m elf_i386 -dynamic-linker /lib/ld-linux.so.2 /usr/pgi70/linux86/7.0-5/lib/pgi.ld -L/pw/test/linux/lib -L/pw/devl/linux/cfd/lib -L/pw/prod/linux/lib -L/usr/pgi70/linux86/7.0-5/lib -L/usr/lib -L/usr/lib/gcc/x86_64-redhat-linux/3.4.6/32 main.o comlist_mod.o module_solver_debug.o prowess_communication.o ieee_mod.o parallel_communication.o solver_amg_control_data.o solver_amg_dynamic_memory.o solver_amg_static_data.o swapbc_mod.o standard_vars_mod.o iobc_geometry.o iobc_vars.o eigen_grouping.o archiver.o j237_data.o j237_units.o control.o nameslist_control.o nameslist_input.o control_nameslist_data.o phase_bc_data.o phase_bc_procedure.o phase_shiftsize.o phase_io_restart.o phase_bc_temp_arrays.o wc_bc_data.o air_blocked_data.o when00_mod.o nastartmp.a -llapack -lblas -lcommonio2 -linputfileio -lprowess -lpkill -lcfutils -lprowess -o nastar.o -Bstatic -lpgmm1 -lpgmm2 -lpgf90 -lpgf90_rpm1 -lpgf902 -lpgf90rtl -lpgftnrtl -lpgsse1 -lpgsse2 -lnspgc -lpgc -Bdynamic -lrt -lpthread -lm -lgcc -lc -lgcc /usr/lib/gcc/x86_64-redhat-linux/3.4.6/32/crtend.o /usr/lib/crtn.o

Here’s a clip from the output of:

nm my_executable | fgrep mth_i

Of the many symbols that match, only dsincosx shows up as undefined.


[42200] | 147753216| 71|FUNC |GLOB |0 |12 |__mth_i_dpowix
[42687] | 0| 51|FUNC |GLOB |0 |UNDEF |__mth_i_dsincosx
[43491] | 147753296| 55|FUNC |GLOB |0 |12 |__mth_i_dsinx
[43596] | 147751856| 368|FUNC |GLOB |0 |12 |__mth_i_expx
[44122] | 147759248| 5|FUNC |GLOB |0 |12 |__mth_i_floatk
[43486] | 147759856| 154|FUNC |GLOB |0 |12 |__mth_i_ipowi
[44220] | 147759376| 39|FUNC |GLOB |0 |12 |__mth_i_kcmp
[42699] | 147759328| 39|FUNC |GLOB |0 |12 |__mth_i_kcmpz
[42776] | 147754448| 242|FUNC |GLOB |0 |12 |__mth_i_kdiv
[42921] | 147755504| 245|FUNC |GLOB |0 |12 |__mth_i_kicshft
[42635] | 147756016| 70|FUNC |GLOB |0 |12 |__mth_i_kishft
[42190] | 147759648| 61|FUNC |GLOB |0 |12 |__mth_i_klshift
[44318] | 147759280| 45|FUNC |GLOB |0 |12 |__mth_i_kmul
[43906] | 147759504| 76|FUNC |GLOB |0 |12 |__mth_i_krshift
[43562] | 147759456| 45|FUNC |GLOB |0 |12 |__mth_i_kucmp
[42725] | 147759424| 19|FUNC |GLOB |0 |12 |__mth_i_kucmpz
[42842] | 147759584| 61|FUNC |GLOB |0 |12 |__mth_i_kurshift
[42132] | 147752416| 41|FUNC |GLOB |0 |12 |__mth_i_nint
[42813] | 147752224| 35|FUNC |GLOB |0 |12 |__mth_i_nintx
[42233] | 147752272| 70|FUNC |GLOB |0 |12 |__mth_i_rpowix
[42533] | 147753360| 336|FUNC |GLOB |0 |12 |__mth_i_rpowrx
[43723] | 147752464| 51|FUNC |GLOB |0 |12 |__mth_i_sincosx
[43203] | 147752352| 55|FUNC |GLOB |0 |12 |__mth_i_sinx
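For anyone repeating this check, here is a minimal sketch of a filter that pulls just the undefined __mth_ symbols out of a listing like the one above. It assumes the pipe-delimited column layout shown in the clip (section in the 7th field, name in the 8th); the function name undefined_mth is made up for illustration, and other nm output formats would need different field numbers.

```shell
# Sketch only: print the names of undefined __mth_ symbols.
# Assumes the pipe-delimited layout shown in the clip above:
#   [index] | value | size | type | bind | other | section | name
# "undefined_mth" is a hypothetical helper name, not a PGI or binutils tool.
undefined_mth() {
    awk -F'|' '
        {
            shndx = $7          # section column ("UNDEF" or a section number)
            name  = $8          # symbol name column
            gsub(/[ \t]/, "", shndx)
            gsub(/[ \t]/, "", name)
            if (shndx == "UNDEF" && name ~ /^__mth_/) print name
        }'
}

# Usage (my_executable is the binary from the post):
#   nm my_executable | undefined_mth
```

With a GNU-style nm, `nm -u my_executable | grep __mth_` should give roughly the same answer without any field parsing.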

Hi Pete,

Well, this is an odd one, and it’s a case where it works fine for me. Try adding the flag “-Wl,--verbose” to the link. This passes the “--verbose” switch to the linker and has it display information about the link. Save this output to a log (there will be a lot of data) and grep for the string “pgsse2”. In particular, I’d like to see something like

attempt to open /usr/pgi/linux86/7.0-5/lib/libpgsse2.a succeeded
(/usr/pgi/linux86/7.0-5/lib/libpgsse2.a)dsincosx.o

to make sure the correct libpgsse2 is being linked and dsincosx.o has been used.
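A rough sketch of that check, assuming the verbose linker output has been saved to a file. The log name link.log and the helper name pulled_from_pgsse2 are placeholders for illustration, and the match pattern assumes the log lines look like the two shown above.

```shell
# Sketch only: check whether a given archive member was pulled from libpgsse2.a.
# "pulled_from_pgsse2" and "link.log" are hypothetical names.
#   $1 = saved linker log, $2 = archive member (e.g. dsincosx.o)
pulled_from_pgsse2() {
    # Looks for lines of the form "(.../libpgsse2.a)member.o"
    grep -F -- "libpgsse2.a)$2" "$1"
}

# Usage, after capturing the log with the usual link command plus -Wl,--verbose:
#   <your link command> -Wl,--verbose > link.log 2>&1
#   grep pgsse2 link.log                     # every mention of the library
#   pulled_from_pgsse2 link.log dsincosx.o   # only the member of interest
```

The function returns grep's exit status, so a non-zero status means no matching line was found in the log.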

- Mat

After a good bit more detective work, I got it running. Here’s an outline of the problem and solution.

The in-house libraries in the link (libcfutils, libprowess, etc) were built with a 6.x version of pgcc and contain some sort of reference to __mth_i_dsincosx. At link time, it appears that references to this function in the .o files resolve against this reference as dynamic relocations before the linker gets to the static PGI libs.

Rebuilding the in-house libraries with gcc makes the problem go away, but this is not a practical solution as the libraries would need extensive revalidation for production use. The workaround is to insert -Bstatic -lpgsse2 -Bdynamic right after the list of objects in the link line. This forces __mth_i_dsincosx to resolve against the static library before the linker gets to the in-house libraries.
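To make the ordering concrete, here is a dry-run sketch that only builds and prints a link line rather than running it. The object and library lists are shortened stand-ins for the real ones, the variable names are made up, and the usual compile flags are omitted, so treat it as an illustration of where the three flags go and not as the actual makefile rule.

```shell
# Sketch only: show where "-Bstatic -lpgsse2 -Bdynamic" sits in the link line.
# OBJS and INHOUSE_LIBS are hypothetical, shortened stand-ins for the real lists.
OBJS="main.o control.o"
INHOUSE_LIBS="-lcfutils -lprowess"

# The static PGI SSE2 library is named right after the objects,
# ahead of the in-house libraries that were built with pgcc 6.x.
LINK_CMD="pgf90 -o foo $OBJS -Bstatic -lpgsse2 -Bdynamic $INHOUSE_LIBS"

# Print the command instead of executing it (dry run).
echo "$LINK_CMD"
```

The linker processes libraries in command-line order, which is why the position of -lpgsse2 relative to the in-house libraries matters here.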

It’s not clear to me why this function is a problem. I wonder if it might have something to do with the issue that was documented here:

http://www.pgroup.com/userforum/viewtopic.php?p=2459

I’m up to my elbows in link maps and other useful data that I can provide if you want to understand this further. However, we’re back in business, so I don’t need anything further.

Thanks,

Pete