Hi, I’m new to the nvfortran compiler. I’m trying to test the do concurrent examples provided in the HPC SDK, but I get a runtime error that I’m not sure how to handle.
I just took the saxpy.f90 example.
Workstation: Windows PC with an A6000 GPU, compiling in WSL2 (Ubuntu 20.04) with HPC SDK 2023
> nvfortran --version
nvfortran 23.1-0 64-bit target on x86-64 Linux -tp skylake-avx512
If I compile and run the saxpy example without -stdpar, everything works fine. If I include -stdpar, it compiles:
> nvfortran -stdpar -Minfo -fast saxpy.f90
saxpy_concurrent:
35, Generated vector simd code for the loop
36, FMA (fused multiply-add) instruction(s) generated
main:
50, Loop not fused: function call before adjacent loop
Generated vector simd code for the loop
64, Loop not fused: function call before adjacent loop
Generated vector simd code for the loop containing reductions
Running the binary outputs:
> ./a.out
Current file: /mnt/d/.../saxpy.f90
function: saxpy_concurrent
line: 26
This file was compiled: -acc=gpu -gpu=cc50 -gpu=cc60 -gpu=cc60 -gpu=cc70 -gpu=cc75 -gpu=cc80 -gpu=cc80 -
malloc(): invalid next size (unsorted)
Aborted
Line 26 corresponds to the “do concurrent” loop.
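For context, the failing loop is an ordinary do concurrent saxpy. The snippet below is only a minimal sketch of that pattern, not the actual saxpy.f90 from the SDK; the program name, array size, and values are my own placeholders.

```fortran
! Minimal sketch of a do concurrent saxpy loop.
! NOT the SDK's saxpy.f90: names, sizes and values are illustrative only.
program saxpy_sketch
  implicit none
  integer, parameter :: n = 1000000
  real, allocatable :: x(:), y(:)
  real :: a
  integer :: i

  allocate(x(n), y(n))
  a = 2.0
  x = 1.0
  y = 3.0

  ! Iterations are independent, so the compiler is free to run them
  ! in any order or in parallel (offloaded when built with -stdpar).
  do concurrent (i = 1:n)
    y(i) = a * x(i) + y(i)
  end do

  ! Every element should equal a*1.0 + 3.0; report the largest deviation.
  print *, 'max error: ', maxval(abs(y - (a * 1.0 + 3.0)))

  deallocate(x, y)
end program saxpy_sketch
```

A small case like this, built with the same flags, might help tell whether the problem is specific to the SDK example or affects any do concurrent loop on this setup.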
Is this maybe a compatibility issue with the GPU? Any hints on how I could solve this?
Thanks!!