Device access segmentation fault


I’m beginning to explore porting an existing OMP parallelized modeling code to my RTX 3050. However, I have come across the sort of issue only a noob can - a standalone code from one of the SDK examples works fine. However, my model code doesn’t seem to be able to detect the GPU. An attempt to allocate device variables, or even run cudaGetDeviceCount runs into a seg fault. Compilation/linking flags are consistent between the standalone code and my model code. I’m at a loss for how to go about debugging this. I’m compiling with nvfortran 22.3 on Ubuntu 18.04, with flags “-Mextend -Msave -byteswapio -fastsse -Mconcur=nonuma -Mpreprocess -Mfixed -Mcuda”. Paring this down to “-Mextend -Mcuda” didn’t help either. Any suggestions will be appreciated.


A standard noob trick is to start with working example code and modify it in small increments using version control. Typically what happens is that pretty quickly, one of the changes renders the code inoperable. One then reverts the last change and studies the relevant code diff while consulting the documentation, which provides a learning opportunity. Rinse and repeat, learning more and more things about the programming environment along the way.

Debugging a largish swath of code from first principles using standard debugging techniques like code bisection, debug output, asserts, logging, etc. tends to be an exercise in frustration for a noob.

Unfortunately, my issue seems to be more pre-noob than noob…

What I did was introduce two lines into a working CPU code - added ‘use cudafor’ at the beginning of a subroutine containing loops to be run on the GPU, and added ‘istat = cudaSetDevice(0)’. The second statement is what causes the seg fault. In fact, any attempt to access the device seems to cause a seg fault. So for instance, declaring allocatable device variables works but attempting to allocate those device variables causes a seg fault. A standalone example code from the SDK has no issues whatsoever. The only possibility I can think of is some legacy settings/commands in my Fortran 77 code that may be causing a conflict with CUDA Fortran. If you are aware of any such conflicts, or have any other suggestions, I would appreciate it. Would help graduate me to a noob…

The culprit turned out to be a -Bstatic flag in the link step of my Makefile. Must be that CUDA does not like static executables! Removing this flag solved the problem for me, no more seg faults.
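For anyone hitting the same thing, a small probe along these lines may help narrow it down. It is only a sketch: the program name and the file name probe.cuf are made up for illustration, and it assumes the cudafor module that ships with nvfortran. The idea is to build it twice with the model's own flags (for example "-Mextend -Mcuda"), once with -Bstatic on the link line and once without, and see which build crashes. Note that the status checks only catch errors the runtime reports; a seg fault inside the call itself, as described above, would still abort before any message is printed.

```fortran
! probe.cuf (hypothetical file name): minimal device-access check
program check_device
  use cudafor
  implicit none
  integer :: istat, ndev

  ! Ask the runtime how many devices it can see.
  istat = cudaGetDeviceCount(ndev)
  if (istat /= cudaSuccess) then
    print *, 'cudaGetDeviceCount failed: ', trim(cudaGetErrorString(istat))
    stop 1
  end if
  print *, 'devices found: ', ndev

  ! Select device 0, the same call that crashed in the model code.
  istat = cudaSetDevice(0)
  if (istat /= cudaSuccess) then
    print *, 'cudaSetDevice failed: ', trim(cudaGetErrorString(istat))
    stop 1
  end if
  print *, 'device 0 selected'
end program check_device
```

If the probe runs cleanly with the model's compile flags but crashes once the model's link flags are added, that points at the link step rather than at anything in the Fortran 77 source.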
