CUDA Fortran Device Memory Allocation Fails with 'No accelerator device found for cudafor_acc_malloc' Error in HPC SDK 24.11 on WSL2 While CUDA C Works

Dear all,

I’m running into a frustrating issue with CUDA Fortran memory allocation that I can’t seem to resolve. The strange part is that CUDA C works perfectly fine, but any CUDA Fortran program fails when trying to allocate device memory.

Here’s the situation: I’m using HPC SDK 24.11 on WSL2 Ubuntu with a Quadro T1000. Basic CUDA functionality works: nvidia-smi shows the GPU, CUDA C programs run fine, and even CUDA Fortran can detect the device and get its properties. However, as soon as any CUDA Fortran program tries to allocate device memory, it fails with:

Accelerator Fatal Error: No accelerator device found for cudafor_acc_malloc call

I’ve tried everything from complex programs to this minimal test case:

program minimal_cuda
    use cudafor
    implicit none
    real, device :: d_x
    real :: x = 1.0
    d_x = x
end program minimal_cuda

Even this fails with the same error. I’ve verified all libraries are present, paths are correct, and even did a clean installation. The peculiar thing is that the error mentions ‘cudafor_acc_malloc’ even when trying to use pure CUDA Fortran calls.

Has anyone encountered this issue or knows what might be causing it? Happy to provide more details if needed!

This means that our runtime can’t find the CUDA driver (libcuda.so). Assuming you have the CUDA driver installed, on WSL, the installation directory can vary.

Try setting LD_LIBRARY_PATH in your environment to include “/usr/lib/wsl/lib”, or wherever libcuda.so was installed.
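
As a rough sketch of that suggestion, something like the following could go in your shell session or ~/.bashrc. It assumes libcuda.so lives in /usr/lib/wsl/lib; the WSL_LIB_DIR variable name is only illustrative, so change the path if your driver library is somewhere else.

```shell
# Assumed location of the WSL driver libraries; adjust if libcuda.so is elsewhere.
WSL_LIB_DIR="/usr/lib/wsl/lib"

# Warn (without failing) if no libcuda.so is visible in that directory.
ls "${WSL_LIB_DIR}"/libcuda.so* >/dev/null 2>&1 \
    || echo "warning: no libcuda.so found in ${WSL_LIB_DIR}" >&2

# Prepend the directory to LD_LIBRARY_PATH unless it is already listed.
case ":${LD_LIBRARY_PATH:-}:" in
    *":${WSL_LIB_DIR}:"*) ;;
    *) export LD_LIBRARY_PATH="${WSL_LIB_DIR}${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}" ;;
esac
```

After that, rerun the CUDA Fortran program from the same shell so it picks up the updated path.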

Thanks! It worked!