Weird int bug: Multiply/Add int bug affecting WinXP 64 driver

Hello,

First, I admit I am fairly new to OpenCL, so I apologise if there is a simple error in my code.

I have a kernel that calls an auxiliary function, and I get different results from two versions of that function that should be arithmetically equivalent.

Here is the device code:

uint GetSub2IndHack( uint Rd,  uint Cd,  uint Dd, uint rLocal, uint cLocal, uint dLocal)
{
    uint RdB = Rd + 2;
    uint CdB = Cd + 2;
    uint sliceStride = RdB*CdB;
    uint rowStride   = RdB;
    uint indexBegin = sliceStride + rowStride + 1;
#ifdef USE_MAD
    // This works OK
    uint ret = mad24(sliceStride, dLocal, rLocal) + mad24(rowStride,cLocal,indexBegin);
#else
    // This does not.
    uint ret = sliceStride*dLocal + rLocal + rowStride*cLocal + indexBegin;
#endif
    return ret;
}

__kernel void calculateDT_DX(
    uint Rd,
    uint Cd,
    uint Dd,
    int rOut,
    __global float * debugInfo)
{
    size_t gc = get_global_id(0);
    size_t gd = get_global_id(1);
    uint centreInd = GetSub2IndHack(Rd,Cd,Dd, (uint)rOut,(uint)gc,(uint)gd);
    int kd = 10*(gc + gd*3);
    debugInfo[kd++] = (float)centreInd;
    debugInfo[kd++] = (float)Rd;
    debugInfo[kd++] = (float)Cd;
    debugInfo[kd++] = (float)Dd;
    debugInfo[kd++] = (float)rOut;
    debugInfo[kd++] = (float)gc;
    debugInfo[kd++] = (float)gd;
}

When invoked, the kernel is passed Rd=3, Cd=3, Dd=3 and rOut=1, and is launched over a 3x3 2D range. For global id (0,0) the correct centreInd is 32 (5*5 + 5 + 1 + 1), but with one specific driver for WinXP 64 I get 7. The index appears to be computed wrongly for all work-items. All the other parameters recovered from the debugInfo buffer look OK. With the mad24 version the code works, but I do not see why the plain multiply/add version fails. I suspect an OpenCL compiler bug?
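For reference, here is a rough host-side sketch in plain C of the values I would expect for each work-item. The helper names (ref_index, ref_index_mad, mad24_emul, debug_offset) are made up for illustration and are not part of the kernel. The mad24 emulation assumes operands below 2^24, which is the only range where the OpenCL spec defines the result, so it may not match a device outside that range. Dd is left out because the device function never uses it.

```c
#include <stdint.h>

/* Reference for the plain multiply/add path of GetSub2IndHack.
   Mirrors the device code line by line (Dd omitted: it is unused). */
uint32_t ref_index(uint32_t Rd, uint32_t Cd,
                   uint32_t rLocal, uint32_t cLocal, uint32_t dLocal)
{
    uint32_t RdB = Rd + 2;
    uint32_t CdB = Cd + 2;
    uint32_t sliceStride = RdB * CdB;
    uint32_t rowStride   = RdB;
    uint32_t indexBegin  = sliceStride + rowStride + 1;
    return sliceStride * dLocal + rLocal + rowStride * cLocal + indexBegin;
}

/* Assumed emulation of unsigned mad24: multiply the low 24 bits of x and y,
   then add z. Only meaningful when x and y are below 2^24. */
uint32_t mad24_emul(uint32_t x, uint32_t y, uint32_t z)
{
    return (x & 0xFFFFFFu) * (y & 0xFFFFFFu) + z;
}

/* Reference for the USE_MAD path, using the emulation above. */
uint32_t ref_index_mad(uint32_t Rd, uint32_t Cd,
                       uint32_t rLocal, uint32_t cLocal, uint32_t dLocal)
{
    uint32_t RdB = Rd + 2;
    uint32_t CdB = Cd + 2;
    uint32_t sliceStride = RdB * CdB;
    uint32_t rowStride   = RdB;
    uint32_t indexBegin  = sliceStride + rowStride + 1;
    return mad24_emul(sliceStride, dLocal, rLocal)
         + mad24_emul(rowStride, cLocal, indexBegin);
}

/* Offset of the first float written to debugInfo by work-item (gc, gd),
   matching kd = 10*(gc + gd*3) in the kernel. */
uint32_t debug_offset(uint32_t gc, uint32_t gd)
{
    return 10 * (gc + gd * 3);
}
```

With Rd=3, Cd=3 and rOut=1, both reference paths give 32 for global id (0,0) and 92 for global id (2,2), so for inputs this small the two device versions should agree on every work-item.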

The same code works with the latest driver for Windows 7. The details of the configuration it fails on are:

CL_DEVICE_VERSION = OpenCL 1.0 CUDA

CL_DRIVER_VERSION = 266.58

CL_DEVICE_NAME = Quadro NVS 160M

CL_DEVICE_VENDOR = NVIDIA Corporation

PLATFORM : OpenCL version = OpenCL 1.0 CUDA 3.2.1

This is the latest release driver for XP-64.

Thanks for your help, John