Is double precision supported for textures on CUDA/OpenCL?

hi everyone,

Do you know if there is an extension for double-precision floating-point support on textures for CUDA and/or OpenCL?
I have some single-precision code that uses textures, and I would like to compare its performance with a double-precision counterpart.

thanks

What I do to fake it is something like this:

You put your data in a width x height layout and pass it to clCreateImage2D along with the image format. Each RGBA texel of 32-bit unsigned integers is 128 bits wide, so it holds two doubles:

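cl_int err;
cl_image_format format = { CL_RGBA, CL_UNSIGNED_INT32 };

/* width and height are in texels; each texel holds two doubles */
cl_mem img = clCreateImage2D(clctx,
                             CL_MEM_COPY_HOST_PTR | CL_MEM_READ_ONLY,
                             &format,
                             width, height,
                             0,      /* row pitch: 0 lets OpenCL compute it */
                             data,
                             &err);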
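Since every texel carries two doubles, the width and height passed above are counted in texels, not in doubles. Here is a minimal host-side sketch of that arithmetic. The helper names are hypothetical (not part of OpenCL), and it assumes you pick the width yourself and pad the host buffer up to width * height texels.

```c
#include <assert.h>
#include <stddef.h>

/* One CL_RGBA / CL_UNSIGNED_INT32 texel is 4 x 32 = 128 bits = two doubles. */
enum { DOUBLES_PER_TEXEL = 2 };

/* Hypothetical helper: texels needed to hold n doubles, rounded up. */
static size_t texels_for_doubles(size_t n)
{
    return (n + DOUBLES_PER_TEXEL - 1) / DOUBLES_PER_TEXEL;
}

/* Hypothetical helper: image height (rows) for n doubles at a chosen
   width in texels, rounded up. The host buffer handed to
   clCreateImage2D must then be padded to width * height texels. */
static size_t image_height_for_doubles(size_t n, size_t width)
{
    assert(width > 0);
    size_t texels = texels_for_doubles(n);
    return (texels + width - 1) / width;
}
```

For example, 1000 doubles at a width of 64 texels need 500 texels, which rounds up to 8 rows.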

In the kernel you can then read it with something like:

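/* doubles in kernels require fp64 support on the device */
#pragma OPENCL EXTENSION cl_khr_fp64 : enable

/* Reinterpret the 128 bits of a uint4 texel as two doubles. */
inline double2 readImageDouble(uint4 a)
{
    union {
        uint2 i[2];
        double2 d;
    } arst;
    arst.i[0] = a.lo;
    arst.i[1] = a.hi;
    return arst.d;
}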

const sampler_t sample = CLK_ADDRESS_NONE
                       | CLK_NORMALIZED_COORDS_FALSE
                       | CLK_FILTER_NEAREST;

/* i is the 2D coordinate (e.g. an int2) of the texel to read */
double2 x = readImageDouble(read_imageui(img, sample, i));
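The read above takes a 2D texel coordinate i, but the post does not say how it is derived. As a rough sketch, assuming the doubles were packed row-major with two per texel, the mapping from a linear double index to a texel coordinate could look like this on the host (the type and function names are hypothetical; the same arithmetic would apply inside the kernel):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper type: a 2D texel coordinate, like int2 in a kernel. */
typedef struct { int x; int y; } texel_coord;

/* Texel that contains double number k, assuming row-major packing
   with two doubles per texel and `width` texels per row. */
static texel_coord texel_of_double(size_t k, size_t width)
{
    assert(width > 0);
    size_t texel = k / 2;
    texel_coord c = { (int)(texel % width), (int)(texel / width) };
    return c;
}

/* 0 selects the .x half of the returned double2, 1 selects .y. */
static int lane_of_double(size_t k)
{
    return (int)(k % 2);
}
```

With a width of 64 texels, double number 131 lives in texel 65, i.e. at coordinate (1, 1), in the .y half.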

Umm…

Isn’t this what as_typen was designed for?

Kernel:

double2 x = as_double2(read_imageui(img, sample, i));

This works because CL_RGBA has 4 components and CL_UNSIGNED_INT32 is 32 bits per component, so a uint4 is 4 * 32 = 128 bits.

A double is 64 bits, so a double2 is 2 * 64 = 128 bits, which meets the as_typen requirement that the input and output have the same size in bits.
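The size argument can be sanity-checked on the host without a GPU. This is only a sketch in plain C: it uses memcpy as a stand-in for the bit reinterpretation that as_double2 performs on the device, the struct and function names are made up for illustration, and it assumes the host and the device agree on endianness.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's double2. */
typedef struct { double x; double y; } pair_of_doubles;

/* Pack two doubles into the four 32-bit channels of one RGBA texel. */
static void pack_double2(double x, double y, uint32_t texel[4])
{
    double src[2] = { x, y };
    /* Same size requirement as as_typen: 2 * 64 bits == 4 * 32 bits. */
    assert(sizeof src == 4 * sizeof(uint32_t));
    memcpy(texel, src, sizeof src);
}

/* Reinterpret the four channels as two doubles; no value conversion. */
static pair_of_doubles unpack_double2(const uint32_t texel[4])
{
    double dst[2];
    pair_of_doubles p;
    memcpy(dst, texel, sizeof dst);
    p.x = dst[0];
    p.y = dst[1];
    return p;
}
```

Because only the bits are copied, packing a pair of doubles and unpacking it again gives back exactly the same values.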

(Note: I have not tested it on a GPU yet.)

Also:

Would be nice to see them add the ability for image2df to return doubles, maybe an image2dd for images with type CL_DOUBLE? It would complement the existing support for CL_HALF_FLOAT.

Perhaps a suggestion for 1.2? (Where do we submit suggestions for 1.2, btw?)