Nvfortran unable to find intrinsic bit manipulation functions in device code

Hi All,

I’ve been trying to construct z-order curves on the GPU using the mvbits intrinsic. However, whenever I compile code that calls mvbits from a device or global subroutine/function, I get the error:
NVFORTRAN-S-1253-Calls from device code to a host subroutine/function are not allowed - mvbits

A simple dummy case that won’t compile with the use of mvbits is

module kernel

    use cudafor

    integer, parameter:: nbits=10

contains

    attributes(global) subroutine cell2hash(input, output)

        implicit none
        integer, intent(in):: input(:)
        integer, intent(out):: output(:)
        integer:: i

        i = blockDim%x*(blockIdx%x - 1) + threadIdx%x
        output(i) = 0
        call mvbits(input(i), 0, 1, output(i), 0)

    end subroutine cell2hash

end module kernel

program bittest

    use kernel

    implicit none
    integer, managed:: data(32), dataout(32)
    integer:: i

    do i = 1, 32
        data(i) = i
    end do

    call cell2hash<<<1, 32>>>(data, dataout)

end program bittest

The documentation says that mvbits should be available, so I’m not sure why this isn’t working.

I tested the above code with nvfortran 23.1 both with --gpu=cuda11.8 and --gpu=cuda12.0 and got the same results.

It does look like it is missing for device code for all of our Fortran GPU compilers. I will open a bug for this.

Thanks @bleback
I tested a few other bit manipulation subroutines, and they were missing as well. Not sure if I was unlucky, or all of them are missing.

I’ve tried btest, ibits, and ibset, and those seem to be there. What others did you find missing?

@bleback I tested them all again and they’re all there. I may have mistakenly tried to use them as subroutines rather than as functions. Only mvbits is missing.

We now support the mvbits subroutine in device code in our latest release, 23.5.
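
For anyone stuck on a release older than 23.5, one possible workaround is to emulate mvbits with the bit intrinsics that do work in device code. The following is only a rough sketch: the module name mvbits_workaround and the routine mvbits_dev are made up for illustration, and it assumes that not, ishft, iand and ior are callable from device code in the same way as ibits, which this thread reports as working.

```fortran
module mvbits_workaround
    implicit none
contains
    ! Hypothetical helper, not part of any NVIDIA library.
    ! Copies `length` bits of `from`, starting at bit `frompos`,
    ! into `to`, starting at bit `topos`; other bits of `to` are kept.
    ! As with mvbits, the caller must ensure that frompos + length and
    ! topos + length do not exceed bit_size(to).
    attributes(device) subroutine mvbits_dev(from, frompos, length, to, topos)
        integer, intent(in)    :: from, frompos, length, topos
        integer, intent(inout) :: to
        integer :: field, mask

        field = ibits(from, frompos, length)   ! source bits, right-justified
        mask  = ibits(not(0), 0, length)       ! the `length` lowest bits set
        ! Clear the destination field, then OR in the shifted source bits.
        to = ior(iand(to, not(ishft(mask, topos))), ishft(field, topos))
    end subroutine mvbits_dev
end module mvbits_workaround
```

In the reproducer above, the kernel module would add "use mvbits_workaround" and replace the mvbits call with "call mvbits_dev(input(i), 0, 1, output(i), 0)". On 23.5 or later this should not be needed, since mvbits itself is supported in device code.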

