Does static_cast support the half data type?

Does static_cast support non-native C++ data types in CUDA code, such as half and bhalf?

What's the difference between using static_cast and the CUDA intrinsics (see CUDA Math API :: CUDA Toolkit Documentation)?

Did you try it? It seems to work for me in CUDA 12.2 or newer, e.g. casting float to half or vice versa.

My guess would be (for those examples) that static_cast and the CUDA intrinsics compile to approximately the same SASS code.

example:

# cat t265.cu
#include <cuda_fp16.h>
__global__ void k(float a, half *b){

#ifdef USE_INTRIN
  *b = __float2half(a);
#else
  *b = static_cast<half>(a);
#endif
}
# nvcc -arch=sm_70 -c t265.cu
# cuobjdump -sass ./t265.o

Fatbin elf code:
================
arch = sm_70
code version = [1,7]
host = linux
compile_size = 64bit

        code for sm_70
                Function : _Z1kfP6__half
        .headerflags    @"EF_CUDA_TEXMODE_UNIFIED EF_CUDA_64BIT_ADDRESS EF_CUDA_SM70 EF_CUDA_VIRTUAL_SM(EF_CUDA_SM70)"
        /*0000*/                   MOV R1, c[0x0][0x28] ;              /* 0x00000a0000017a02 */
                                                                       /* 0x000fc40000000f00 */
        /*0010*/              @!PT SHFL.IDX PT, RZ, RZ, RZ, RZ ;       /* 0x000000fffffff389 */
                                                                       /* 0x000fe200000e00ff */
        /*0020*/                   F2F.F16.F32 R5, c[0x0][0x160] ;     /* 0x0000580000057b04 */
                                                                       /* 0x000e220000200800 */
        /*0030*/                   MOV R2, c[0x0][0x168] ;             /* 0x00005a0000027a02 */
                                                                       /* 0x000fe40000000f00 */
        /*0040*/                   MOV R3, c[0x0][0x16c] ;             /* 0x00005b0000037a02 */
                                                                       /* 0x000fd00000000f00 */
        /*0050*/                   STG.E.U16.SYS [R2], R5 ;            /* 0x0000000502007386 */
                                                                       /* 0x001fe2000010e500 */
        /*0060*/                   EXIT ;                              /* 0x000000000000794d */
                                                                       /* 0x000fea0003800000 */
        /*0070*/                   BRA 0x70;                           /* 0xfffffff000007947 */
                                                                       /* 0x000fc0000383ffff */
                ..........



Fatbin ptx code:
================
arch = sm_70
code version = [8,2]
host = linux
compile_size = 64bit
compressed
# nvcc -arch=sm_70 -c t265.cu -DUSE_INTRIN
# cuobjdump -sass ./t265.o

Fatbin elf code:
================
arch = sm_70
code version = [1,7]
host = linux
compile_size = 64bit

        code for sm_70
                Function : _Z1kfP6__half
        .headerflags    @"EF_CUDA_TEXMODE_UNIFIED EF_CUDA_64BIT_ADDRESS EF_CUDA_SM70 EF_CUDA_VIRTUAL_SM(EF_CUDA_SM70)"
        /*0000*/                   MOV R1, c[0x0][0x28] ;              /* 0x00000a0000017a02 */
                                                                       /* 0x000fc40000000f00 */
        /*0010*/              @!PT SHFL.IDX PT, RZ, RZ, RZ, RZ ;       /* 0x000000fffffff389 */
                                                                       /* 0x000fe200000e00ff */
        /*0020*/                   F2F.F16.F32 R5, c[0x0][0x160] ;     /* 0x0000580000057b04 */
                                                                       /* 0x000e220000200800 */
        /*0030*/                   MOV R2, c[0x0][0x168] ;             /* 0x00005a0000027a02 */
                                                                       /* 0x000fe40000000f00 */
        /*0040*/                   MOV R3, c[0x0][0x16c] ;             /* 0x00005b0000037a02 */
                                                                       /* 0x000fd00000000f00 */
        /*0050*/                   STG.E.U16.SYS [R2], R5 ;            /* 0x0000000502007386 */
                                                                       /* 0x001fe2000010e500 */
        /*0060*/                   EXIT ;                              /* 0x000000000000794d */
                                                                       /* 0x000fea0003800000 */
        /*0070*/                   BRA 0x70;                           /* 0xfffffff000007947 */
                                                                       /* 0x000fc0000383ffff */
                ..........



Fatbin ptx code:
================
arch = sm_70
code version = [8,2]
host = linux
compile_size = 64bit
compressed
#

Note that this is not full constexpr evaluation. For that, this may be of interest as a roadmap.
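
On why static_cast can work for a non-native type at all: for a class type, static_cast goes through the type's converting constructors and conversion operators, and as far as I know cuda_fp16.h provides such conversions between float and half. Here is a minimal host-only C++ sketch of that mechanism. ToyBF16, roundtrip and self_check are hypothetical names made up for illustration; this is not CUDA's implementation, it assumes a 32-bit IEEE-754 float, and it truncates where a real library would round.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a 16-bit float class such as half or bhalf.
// NOT CUDA's implementation; it only mirrors the general shape:
// raw 16-bit storage plus conversions to and from float.
struct ToyBF16 {
    std::uint16_t bits;  // upper 16 bits of a 32-bit IEEE-754 float (assumed layout)

    // Converting constructor: static_cast<ToyBF16>(some_float) calls this.
    // Truncates the low 16 mantissa bits (a real library would round).
    explicit ToyBF16(float f) {
        std::uint32_t u;
        std::memcpy(&u, &f, sizeof u);
        bits = static_cast<std::uint16_t>(u >> 16);
    }

    // Conversion operator: static_cast<float>(some_toy_value) calls this.
    explicit operator float() const {
        std::uint32_t u = static_cast<std::uint32_t>(bits) << 16;
        float f;
        std::memcpy(&f, &u, sizeof f);
        return f;
    }
};

// Round trip float -> ToyBF16 -> float using only static_cast.
inline float roundtrip(float x) {
    return static_cast<float>(static_cast<ToyBF16>(x));
}

// Small self-check: values whose low 16 bits are zero survive the round trip.
inline void self_check() {
    assert(roundtrip(1.0f) == 1.0f);
    assert(roundtrip(-2.5f) == -2.5f);
}
```

If the class did not declare these conversions, the same static_cast would fail to compile. So if static_cast<half>(a) is rejected in some build, one thing to check is whether that build defines __CUDA_NO_HALF_CONVERSIONS__, which I believe disables the conversions in the header.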