Copy cl_float2 to constant memory, bug?

If I copy an array of cl_float2 values to constant memory in OpenCL, the y component always reads as zero on the Nvidia platform, but is correct on the AMD platform. Is this a bug in Nvidia's OpenCL platform?

// Host

cl_float2* filter_temp = (cl_float2*)malloc(FILTER_SIZE * FILTER_SIZE * sizeof(cl_float2));
cl_float2 test;
test.s[0] = 3.0f;
test.s[1] = 13.0f;

for (int xx = 0; xx < FILTER_SIZE; xx++)
{
	for (int yy = 0; yy < FILTER_SIZE; yy++)
	{
		filter_temp[xx + yy * FILTER_SIZE].s[0] = test.s[0];
		filter_temp[xx + yy * FILTER_SIZE].s[1] = test.s[1];
	}
}

clEnqueueWriteBuffer(commandQueue, c_Quadrature_Filter_1, CL_TRUE, 0, FILTER_SIZE * FILTER_SIZE * sizeof(cl_float2), filter_temp, 0, NULL, NULL);
free(filter_temp);


// Device

Filter_Response_1[Calculate3DIndex(x,y,z,DATA_W,DATA_H)].x = c_Quadrature_Filter_1[0].y;

It also works for the Intel platform…
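
To rule out the host side, here is a minimal sketch of the check I would run on the packed array before it is uploaded. It does not use OpenCL at all: `float2_t` is a hypothetical stand-in with the same two-float layout as cl_float2, the value of `FILTER_SIZE` is assumed (the real one is not shown above), and `make_filter` and `is_interleaved` are illustrative helper names. It only shows that the fill loop produces interleaved x, y, x, y, ... floats; it says nothing about what the device does with them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define FILTER_SIZE 7 /* assumed value; the real FILTER_SIZE is not shown */

/* Hypothetical stand-in for cl_float2: two packed floats (8 bytes). */
typedef struct { float s[2]; } float2_t;

/* Fill the filter the same way as the host code above. Caller frees. */
static float2_t *make_filter(float x, float y)
{
    float2_t *f = malloc(FILTER_SIZE * FILTER_SIZE * sizeof *f);
    if (f == NULL)
        return NULL;
    for (int yy = 0; yy < FILTER_SIZE; yy++) {
        for (int xx = 0; xx < FILTER_SIZE; xx++) {
            f[xx + yy * FILTER_SIZE].s[0] = x;
            f[xx + yy * FILTER_SIZE].s[1] = y;
        }
    }
    return f;
}

/* Return 1 if the array, viewed as raw floats, is x, y, x, y, ...; else 0. */
static int is_interleaved(const float2_t *f, float x, float y)
{
    const float *raw = (const float *)f;
    /* The stand-in must have no padding for the raw view to be valid. */
    assert(sizeof(float2_t) == 2 * sizeof(float));
    for (size_t i = 0; i < (size_t)FILTER_SIZE * FILTER_SIZE; i++) {
        if (raw[2 * i] != x || raw[2 * i + 1] != y)
            return 0;
    }
    return 1;
}
```

Calling `is_interleaved(make_filter(3.0f, 13.0f), 3.0f, 13.0f)` should return 1. If it does, the data handed to clEnqueueWriteBuffer is as expected, and the problem would have to be in the buffer creation, the kernel argument, or the platform.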