Section 9.4 of the latest version of the OpenCL 1.0 spec (10/6/09) states:
OpenCL 1.0 adds support for specifying the rounding mode for an instruction or group of
instructions in the program source as an optional extension. An application that wants to use this
feature will need to include the
#pragma OPENCL EXTENSION cl_khr_select_fprounding_mode : enable directive.
If the cl_khr_select_fprounding_mode extension is supported, the OpenCL implementation
must support all four rounding modes for single precision floating-point i.e. the
CL_DEVICE_SINGLE_FP_CONFIG described in table 4.3 must include
CL_FP_ROUND_TO_ZERO and CL_FP_ROUND_TO_INF. This is already the case for double
precision floating-point.
The appropriate rounding mode can be specified using the following pragma in the program
source.
#pragma OPENCL SELECT_ROUNDING_MODE rounding-mode
The rounding-mode-value can be one of the following values:
rte
rtz
rtp
rtn
…
What does it mean if your card (e.g. GTX 285) doesn’t support this extension, BUT
does have the support for all four rounding modes? I see all four specified for
CL_DEVICE_SINGLE_FP_CONFIG when I run oclDeviceQuery.
Currently my take on this is that I can’t control which rounding mode is used
(e.g. when I add two floats together). The only sense I can make out
of this rounding mode support is that it must have something to do with the
“convert_” operator (also discussed in the spec).
My attempts to select the rounding mode (despite the apparent lack of support
for this extension) did nothing.
In CUDA, with the same card, I have modified the SDK example VectorAdd to prove
to myself that I can do addition of floating point numbers with arbitrary
rounding mode using the device functions (e.g. __fadd_ru(), etc.) and get
an exact match for what I do on the CPU (using fesetround() defined in the ISO
C standard).
I guess I just need to buy a 2.x compute capability GPU to get the
cl_khr_select_fprounding extension so that I can do in OpenCL what I can
already do in CUDA.
Can anyone confirm any of these guesses?
Thanks.
Chris