OpenCL, double precision, GeForce 8500 GT

a_goryachih · June 11, 2014, 6:35am

Hi!
I can’t use double in kernels on a GeForce 8500 GT. If I use only float, or compile and run the kernels on an Intel CPU, the problem doesn’t appear. I tried specifying ‘#pragma OPENCL EXTENSION cl_khr_fp64: enable’, but it didn’t help. For example, I have a kernel like this:

#pragma OPENCL EXTENSION cl_khr_fp64 : enable

__kernel void DelaunayRadius(__global double *x, __global double *y,
                             __global double *t)
{
    int tid = get_global_id(0);
    t[tid] = x[tid] + y[tid];
}

And when I checked this code in OpenCLCodeChecker, I received this build log:

ptxas application ptx input, line 42; : error : Instruction ‘ld’ requires SM 1.3 or higher, or map_f64_to_f32 directive
ptxas application ptx input, line 43; : error : Instruction ‘ld’ requires SM 1.3 or higher, or map_f64_to_f32 directive
ptxas application ptx input, line 44; : error : Instruction ‘add’ requires SM 1.3 or higher, or map_f64_to_f32 directive
ptxas application ptx input, line 46; : error : Instruction ‘st’ requires SM 1.3 or higher, or map_f64_to_f32 directive
ptxas : fatal error : Ptx assembly aborted due to errors
ptxas application ptx input, line 42; : warning : Double is not supported. Demoting to float

Can somebody give me any advice?
I have the latest versions of the drivers and the CUDA SDK.

Gert-Jan · June 11, 2014, 6:46am

Your 8500 GT is compute capability 1.1 (see this list of GPUs), which does not support doubles. The only solution is to use floats, or to buy a new GPU.
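
The build log names SM 1.3 as the minimum for the double-precision instructions, so the rule can be written as a one-line comparison. This is only a sketch: the function name cc_supports_fp64 is hypothetical, and it assumes you already know the major and minor compute capability numbers (for example from NVIDIA's list of GPUs).

```c
#include <stdbool.h>

/* Hypothetical helper: returns true when a compute capability of
 * major.minor is at least 1.3, the threshold quoted by ptxas in the
 * build log ("requires SM 1.3 or higher"). */
bool cc_supports_fp64(int major, int minor)
{
    return major > 1 || (major == 1 && minor >= 3);
}
```

For the 8500 GT (compute capability 1.1), cc_supports_fp64(1, 1) is false, which matches the build failure.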

a_goryachih · June 11, 2014, 7:56am

Thanks a lot!
And maybe I missed something, but does ‘compute capability’ correspond in some way to the OpenCL version? I mean, can I use the OpenCL version to determine whether ‘double’ is supported?

NVD · June 11, 2014, 9:48am

“Instruction ‘ld’ requires SM 1.3 or higher”

GeForce 8/9 GPUs do not support FP64. This is clearly shown in the error message, and it is a hardware limitation.

Consider upgrading to an Nvidia Maxwell GeForce GTX 750 Ti or GTX 750, both of which support FP64.

seibert · June 11, 2014, 2:50pm

“Compute capability” is a term that NVIDIA uses to categorize the hardware features of their GPUs. The very first GeForce 8000 cards were compute capability 1.0 (from 2007!), and now the latest Maxwell cards are compute capability 5.0.

OpenCL versions change much more slowly, and (being a multi-vendor standard) have no connection to NVIDIA compute capability numbering. (Similarly, CUDA version numbering also has no connection to NVIDIA compute capability numbering.) Since OpenCL is designed to be compatible with a wide range of hardware, the latest OpenCL version will still work with older devices, so the OpenCL version number doesn’t tell you much.

I am not an OpenCL developer, but the API might have a way to query the features supported by the OpenCL device which would allow you to determine if double precision is supported.
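
A minimal sketch of such a check, assuming the host program has already fetched the device's extension string. In OpenCL this is normally done with clGetDeviceInfo and CL_DEVICE_EXTENSIONS, which returns a space-separated list of extension names, and double precision is advertised there as cl_khr_fp64. The helper name has_extension is hypothetical, and the OpenCL call itself is left out so that the example compiles without the OpenCL headers.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns true if `name` appears as a whole
 * space-separated token in `extensions`. A plain strstr() is not enough,
 * because one extension name can be a prefix of another. */
bool has_extension(const char *extensions, const char *name)
{
    size_t len = strlen(name);
    const char *p = extensions;

    if (len == 0)
        return false;

    while ((p = strstr(p, name)) != NULL) {
        /* The match must start at the beginning of the string or after a space... */
        bool starts_token = (p == extensions) || (p[-1] == ' ');
        /* ...and must end at the end of the string or before a space. */
        bool ends_token = (p[len] == '\0') || (p[len] == ' ');

        if (starts_token && ends_token)
            return true;
        p += 1; /* keep searching after this partial match */
    }
    return false;
}
```

With an extension string such as "cl_khr_byte_addressable_store cl_khr_fp64", the call has_extension(exts, "cl_khr_fp64") returns true. If the name is missing, the host code can fall back to a float version of the kernel before building the program.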
