cl_khr_fp16 OpenCL support?


NVIDIA hardware has supported fp16 for a long time, but it is not listed as supported in OpenCL.
When are you planning to add support for it?

Never!
Because they want developers tied to NVIDIA hardware and CUDA features, in exactly the same way Apple ties people to its “eco” system.
(That’s why Apple went with AMD graphics on their boxes: Apple literally saw its own “dirty face” in NVIDIA’s face and got disgusted.)
The only fair player in the industry is AMD, hands down!
Your only hope is to go for any of the RDNA X implementations, because they natively support FP16 at the hardware level.

Dear sir,
NVIDIA claims that they now support the flag with the new driver as stated in section 2.8 of both

I am still encountering some problems.

Let’s see if it works for others.

I don’t have much expertise with OpenCL compared to CUDA. However, if you have questions about cl_khr_fp16, one suggestion is to check whether the driver advertises support for it (after first enabling the feature as indicated in the driver release notes, of course). If it isn’t advertised, I wouldn’t expect the extension functionality to work. If it is advertised, and you believe the NVIDIA OpenCL driver is not handling some piece of functionality associated with that extension correctly, you could file a bug. You’ll most likely be asked for a repro/test case. You can also ask OpenCL questions here, but the community here is more focused on CUDA. YMMV. (And make sure your driver is new enough for the release note update to apply.)
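
For anyone who wants to script the "is it advertised" check, here is a minimal sketch in C. It assumes you have already fetched the device's space-separated extension string, e.g. with `clGetDeviceInfo(device, CL_DEVICE_EXTENSIONS, ...)`; the helper name `has_extension` is made up for illustration and is not part of OpenCL. It matches whole tokens only, so a longer extension name that merely starts with `cl_khr_fp16` would not count.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not an OpenCL API): returns 1 if `name` appears as a
 * whole space-separated token in `ext_list`, otherwise 0.
 * `ext_list` is assumed to be the string returned by
 * clGetDeviceInfo(device, CL_DEVICE_EXTENSIONS, ...). */
int has_extension(const char *ext_list, const char *name)
{
    size_t name_len;
    const char *p;

    if (ext_list == NULL || name == NULL || *name == '\0')
        return 0;

    name_len = strlen(name);
    p = ext_list;
    while ((p = strstr(p, name)) != NULL) {
        /* Accept the hit only if it is delimited by spaces or string ends. */
        int starts_token = (p == ext_list) || (p[-1] == ' ');
        int ends_token = (p[name_len] == ' ') || (p[name_len] == '\0');
        if (starts_token && ends_token)
            return 1;
        p += 1; /* keep scanning after a partial match */
    }
    return 0;
}
```

Calling `has_extension(ext_string, "cl_khr_fp16")` on the queried string then tells you whether the driver advertises the extension on that device.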

Dear Robert,
thank you very much for your kind comments.
I will try to file a bug report, since none of the consumer cards seem to expose the FP16 flag. I am currently helping CLBlast collect tuning results on different GPUs, and many of my NVIDIA GPUs, including the RTX 4090, do not support this flag.

Best wishes,

Apparently, cl_khr_fp16 is supported/enabled as of last year; see this driver release PDF.

Caveat: Haven’t tried that myself yet.

The support position has been clarified by the NVIDIA OpenCL development team here:

Note: While enabling cl_khr_fp16 pragma/feature macro allows some basic usage of half float data types including basic math operations (add, sub, mul, div) on half floats with the newer compiler, math built-in functions for half floats are currently not supported. Using the same may lead to clBuildProgram failing. NVIDIA OpenCL 3.0 drivers do not claim conformance for cl_khr_fp16. The device and platform queries (CL_DEVICE_EXTENSIONS / CL_DEVICE_EXTENSIONS_WITH_VERSION / CL_PLATFORM_EXTENSIONS / CL_PLATFORM_EXTENSIONS_WITH_VERSION) do not report cl_khr_fp16 as one of the supported extensions on NVIDIA OpenCL 3.0 drivers even with the compiler upgrade. Please consider this as an experimental feature without any functional, conformance or performance guarantees.
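
To make the note concrete, here is an untested sketch of what "basic usage" might look like: an OpenCL C kernel, held as a C string for `clCreateProgramWithSource`, that enables the pragma and uses only multiply and add on `half` values, with no math built-ins. The kernel name `half_axpy` and the variable `half_axpy_source` are made up for illustration. Whether this actually builds on a given NVIDIA driver is an assumption, given the experimental status described above.

```c
/* Illustrative OpenCL C source as a C string (names are hypothetical).
 * It uses only basic arithmetic on half, which the note says may work,
 * and avoids math built-ins, which the note says are unsupported. */
static const char *const half_axpy_source =
    "#pragma OPENCL EXTENSION cl_khr_fp16 : enable\n"
    "__kernel void half_axpy(__global const half *x,\n"
    "                        __global half *y,\n"
    "                        const float a)\n"
    "{\n"
    "    size_t i = get_global_id(0);\n"
    "    half ah = (half)a;        /* type conversion, not a math built-in */\n"
    "    y[i] = ah * x[i] + y[i];  /* only mul and add on half */\n"
    "}\n";
```

If `clBuildProgram` fails on a source like this, the build log from `clGetProgramBuildInfo` with `CL_PROGRAM_BUILD_LOG` is probably the first thing to attach to a bug report.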