A bug in PTX code...

Consider an incomplete code fragment like the one below:

int width, height;
short2 var1;
short2 *var_array;
var1 = var_array[var1.x+width*h];

The execution result indicates that the last line of the code above is interpreted as follows:

var1.x = var_array[var1.x+width*h].x;
var1.y = var_array[var1.x+width*h].y;

I rechecked the PTX code and found that there are indeed two load instructions, as shown in the following PTX:

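mul24.lo.s32 	%r88, %r3, %r4;       // presumably width*h
add.s32 	%r89, %r88, %r16;      // first index: %r88 + %r16 (presumably var1.x)
mul.lo.u32 	%r90, %r89, 4;         // byte offset, 4 bytes per short2
add.u32 	%r91, %r31, %r90;      // %r31 is presumably the base of var_array
ld.global.s16 	%r16, [%r91+0];      // first 16-bit load (.x), overwrites %r16
add.s32 	%r92, %r88, %r16;      // second index, computed from the NEW %r16
mul.lo.u32 	%r93, %r92, 4;
add.u32 	%r94, %r31, %r93;
ld.global.s16 	%r35, [%r94+2];      // second 16-bit load (.y) from the recomputed address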

Why does this happen? For now I have to use a temporary variable to hold the index value ‘var1.x+width*h’.
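To make the difference concrete, here is a minimal host-side sketch in plain C++. It is not the actual kernel: `Short2` is a hypothetical stand-in for CUDA's built-in `short2`, and the two function names are made up for illustration. The first function fixes the index in a temporary before `var1` is touched. The second mimics what the PTX above appears to do, namely two separate loads with the index recomputed in between.

```cpp
#include <cassert>

// Hypothetical stand-in for CUDA's built-in short2, so this compiles without nvcc.
struct Short2 { short x, y; };

// Intended behaviour (and the temporary-variable workaround):
// the index is computed once, then the whole element is copied.
Short2 load_with_temp_index(const Short2* var_array, Short2 var1, int width, int h) {
    const int idx = var1.x + width * h;  // index fixed before var1 changes
    assert(idx >= 0);
    var1 = var_array[idx];
    return var1;
}

// Behaviour suggested by the generated PTX: two 16-bit loads,
// with the index recomputed after var1.x has already been overwritten.
Short2 load_componentwise(const Short2* var_array, Short2 var1, int width, int h) {
    var1.x = var_array[var1.x + width * h].x;
    var1.y = var_array[var1.x + width * h].y;  // uses the NEW var1.x
    return var1;
}
```

The two functions only agree when the loaded `.x` happens to equal the old `var1.x`. Otherwise the `.y` component comes from a different array element.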

It’s an annoying bug that I never considered while chasing the wrong execution results, until I had excluded all other suspect code by comparing against the CPU code step by step.

Sorry for my poor English.

That isn’t a bug at all. Structure assignment is a standardized part of C, but how it is done is completely implementation specific. Your code falls into “undefined behaviour” territory and should not be expected to work.

That compiler behavior breaks coalescing and is quite unexpected, IMHO.

I’ve seen such things happen before when any variable in the computation of the index was declared volatile. Using a non-volatile temporary variable to hold the index helped in my case.

Yes, the compiler behaves like this with structures; you need to write a wrapper.
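One possible reading of "wrapper" is a helper that fetches the whole 4-byte element in one go and only then unpacks it, so the index is evaluated exactly once. The sketch below is host-side C++ under that assumption. `Short2` and `load_short2_as_word` are hypothetical names, and whether a device version of this would compile to a single 32-bit load is not confirmed by anything in this thread.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for CUDA's built-in short2.
struct Short2 { short x, y; };
static_assert(sizeof(Short2) == sizeof(std::uint32_t),
              "the sketch assumes a 4-byte element");

// Read element idx as one 32-bit word, then unpack it into a Short2.
// The index is used exactly once, before any component is written.
Short2 load_short2_as_word(const Short2* var_array, int idx) {
    assert(idx >= 0);
    std::uint32_t word;
    std::memcpy(&word, &var_array[idx], sizeof word);  // one 4-byte copy
    Short2 out;
    std::memcpy(&out, &word, sizeof out);              // unpack both halves
    return out;
}
```

A call such as `var1 = load_short2_as_word(var_array, var1.x + width * h);` evaluates the index as a function argument, before `var1` is assigned.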

short2 is supposed to be a built-in vector type permitting coalescing. It’s not a user-defined structure.

The compiler does it with uchar4 too, for example. At least sometimes.