I observe a memory leak when kernels that take assumed-shape arrays as arguments are executed many times. The same does not happen with assumed-size or explicit-shape arrays. I attach a small example that compares the two cases.
Here is some additional info:
nvfortran 25.9-0 64-bit target on x86-64 Linux -tp znver4
Driver Version: 580.105.08 CUDA Version: 13.0
NVIDIA RTX 6000 Ada Generation
assumed_shape_leak_test.cuf.txt (2.9 KB)
Hi hagen.radtke and thanks for the report!
It’s an interesting one. The memory leak is coming from the device descriptors which need to be allocated for assumed-shape arrays when passed as arguments to the kernel. I looked at the assembly and do see the cudaFree calls (via the “pgf90_dev_auto_dealloc_i8” runtime calls), but they don’t seem to get triggered.
I added an issue report, TPR #38635, and sent it to engineering for investigation.
It is recommended to use assumed-size arrays when possible with CUDA Fortran due to the overhead required to support device descriptors.
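For readers without the attachment, the difference between the two argument styles can be sketched roughly as follows. This is a minimal illustration, not the attached reproducer: the module, kernel names (`scale_shape`, `scale_size`), array size, and iteration count are all made up for the example, and using `cudaMemGetInfo` to watch free device memory is just one assumed way of observing the leak.

```fortran
module kernels_m          ! hypothetical module name, for illustration only
  use cudafor
  implicit none
contains
  ! Assumed-shape dummy a(:): the array is passed together with a
  ! device descriptor, so size(a) is available inside the kernel.
  attributes(global) subroutine scale_shape(a, s)
    real :: a(:)
    real, value :: s
    integer :: i
    i = (blockIdx%x - 1) * blockDim%x + threadIdx%x
    if (i <= size(a)) a(i) = s * a(i)
  end subroutine scale_shape

  ! Assumed-size dummy a(*): no descriptor, so the extent n
  ! has to be passed explicitly.
  attributes(global) subroutine scale_size(a, n, s)
    real :: a(*)
    integer, value :: n
    real, value :: s
    integer :: i
    i = (blockIdx%x - 1) * blockDim%x + threadIdx%x
    if (i <= n) a(i) = s * a(i)
  end subroutine scale_size
end module kernels_m

program compare_args
  use cudafor
  use kernels_m
  implicit none
  integer, parameter :: n = 1024, iters = 100000   ! arbitrary values
  real, device, allocatable :: a_d(:)
  integer(kind=cuda_count_kind) :: free0, free1, total
  integer :: it, istat

  allocate(a_d(n))
  a_d = 1.0

  ! Many launches with the assumed-shape kernel.
  istat = cudaMemGetInfo(free0, total)
  do it = 1, iters
    call scale_shape<<<(n + 255) / 256, 256>>>(a_d, 1.0)
  end do
  istat = cudaDeviceSynchronize()
  istat = cudaMemGetInfo(free1, total)
  print *, 'assumed-shape: free device memory dropped by', free0 - free1, 'bytes'

  ! Same number of launches with the assumed-size kernel.
  istat = cudaMemGetInfo(free0, total)
  do it = 1, iters
    call scale_size<<<(n + 255) / 256, 256>>>(a_d, n, 1.0)
  end do
  istat = cudaDeviceSynchronize()
  istat = cudaMemGetInfo(free1, total)
  print *, 'assumed-size:  free device memory dropped by', free0 - free1, 'bytes'

  deallocate(a_d)
end program compare_args
```

If the behavior described in this thread applies, the first number would be expected to grow with `iters` while the second stays near zero, but the exact figures depend on the compiler and driver versions.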
-Mat
Hi Mat, thanks for looking into it. Nice that I found something interesting. :) I hope it’s not too hard to fix and am looking forward to reading more soon. Cheers, Hagen