Help with handling large arrays on a multi-GPU platform.

Hi. I’ve got 3 C1060s sitting inside a 64-bit machine with 24GB of RAM. I have set OMP_STACKSIZE to a decent number (5GB). I have a 2.35GB array set as SHARED in an OpenMP PARALLEL DO loop, and all the private arrays are much smaller than this for the three OpenMP threads I’m using to manage my 3 GPUs. I am really struggling to handle arrays much larger than this. When I am not getting bus errors and segmentation faults, I’m getting the output below at compile time. Can anyone explain what it all means?

Thanks,

Rob.

[cuda001 LOCUST-GPU_2.4]$ pgfortran -fast -o test.out prec_mod.cuf ddreac_mod.cuf neutrons.cuf -mp
prec_mod.cuf:
ddreac_mod.cuf:
neutrons.cuf:
/opt/local/lib/libpgf90.a(initpar.o): In function `__hpf_myprocnum':
initpar.c:(.text+0x2): relocation truncated to fit: R_X86_64_PC32 against symbol `__hpf_lcpu' defined in COMMON section in /opt/local/lib/libpgf90.a(initpar.o)
/opt/local/lib/libpgf90.a(initpar.o): In function `__hpf_ncpus':
initpar.c:(.text+0x12): relocation truncated to fit: R_X86_64_PC32 against symbol `__hpf_tcpus' defined in COMMON section in /opt/local/lib/libpgf90.a(initpar.o)
/opt/local/lib/libpgf90.a(initpar.o): In function `__hpf_getioproc':
initpar.c:(.text+0x22): relocation truncated to fit: R_X86_64_PC32 against symbol `__hpf_ioproc' defined in COMMON section in /opt/local/lib/libpgf90.a(initpar.o)
/opt/local/lib/libpgf90.a(initpar.o): In function `__hpf_is_ioproc':
initpar.c:(.text+0x32): relocation truncated to fit: R_X86_64_PC32 against symbol `__hpf_ioproc' defined in COMMON section in /opt/local/lib/libpgf90.a(initpar.o)
initpar.c:(.text+0x38): relocation truncated to fit: R_X86_64_PC32 against symbol `__hpf_lcpu' defined in COMMON section in /opt/local/lib/libpgf90.a(initpar.o)
/opt/local/lib/libpgf90.a(initpar.o): In function `__hpf_abort':
initpar.c:(.text+0x5f): relocation truncated to fit: R_X86_64_PC32 against symbol `__hpf_lcpu' defined in COMMON section in /opt/local/lib/libpgf90.a(initpar.o)
/opt/local/lib/libpgf90.a(initpar.o): In function `__hpf_abortp':
initpar.c:(.text+0xeb): relocation truncated to fit: R_X86_64_PC32 against symbol `__hpf_lcpu' defined in COMMON section in /opt/local/lib/libpgf90.a(initpar.o)
/opt/local/lib/libpgf90.a(initpar.o): In function `__hpf_initarg':
initpar.c:(.text+0x127): relocation truncated to fit: R_X86_64_PC32 against `.bss'
initpar.c:(.text+0x151): relocation truncated to fit: R_X86_64_PC32 against `.bss'
initpar.c:(.text+0x17b): relocation truncated to fit: R_X86_64_PC32 against `.bss'
initpar.c:(.text+0x18b): additional relocation overflows omitted from the output

Hi Rob,

The “relocation truncated to fit: R_X86_64_PC32 against symbol” errors occur when a program uses static arrays larger than 2GB. R_X86_64_PC32 is a 32-bit relative relocation, so under the default small memory model the linker cannot reach data placed beyond that range. For these large arrays you need to use the medium memory model (-mcmodel=medium) instead of the default small memory model. Unfortunately, the medium memory model is not yet supported for use with CUDA Fortran.

You should be able to dynamically allocate your memory to get around the relocation errors, though I don’t think that will solve the other errors you are getting. Is there any more information you can share?
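As a rough sketch of what dynamic allocation could look like, the example below declares the large array as ALLOCATABLE so that it lives on the heap rather than in static data. The program name, the array name big, the size n, and the loop body are all hypothetical and are not taken from your code; it is plain host-side Fortran with OpenMP, with no CUDA Fortran device code, so treat it as an illustration rather than a drop-in fix.

```fortran
program large_shared_array
  implicit none
  ! Hypothetical size: 300 million real(8) values, about 2.4 GB.
  ! Reduce n to try this on a machine with less memory.
  integer(kind=8), parameter :: n = 300000000_8
  ! ALLOCATABLE puts the data on the heap instead of in static storage,
  ! so the 2GB limit of the small memory model does not apply to it.
  real(kind=8), allocatable :: big(:)
  integer(kind=8) :: i
  integer :: istat

  allocate(big(n), stat=istat)
  if (istat /= 0) then
     print *, 'allocation failed, stat = ', istat
     stop 1
  end if

  ! The array is shared between the three host threads; only the
  ! loop index is private, so nothing large goes on a thread stack.
  !$omp parallel do shared(big) private(i) num_threads(3)
  do i = 1, n
     big(i) = real(i, kind=8)
  end do
  !$omp end parallel do

  print *, 'big(n) = ', big(n)
  deallocate(big)
end program large_shared_array
```

Checking the stat value from ALLOCATE gives a readable message if the request fails, instead of a crash later on when the array is first used.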

- Mat