Struct/register questions

GrooveStar · August 16, 2011, 1:45am

Hi All

I have a couple of questions about how CUDA handles instances of structs declared in kernels. Say I have the following struct:

typedef struct tagMyStruct
{
	float3  X;
	float   Y;
} MyStruct;

My questions are:

With the above struct, is the following a “good” way to load the data into the struct? I’m not too worried about having to hard-code the texture’s element type in the reinterpret_cast. What I’m concerned about is whether the compiler will emit an extra copy (and use extra registers) by first copying the value from the texture into a temporary float4 and then copying that float4 to the supplied address.

MyStruct theStruct;
*reinterpret_cast<float4 *>(&theStruct) = tex1Dfetch(myTexture, index);
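For comparison, here is a minimal sketch of a cast-free way to do the same load. UnpackMyStruct and LoadKernel are hypothetical names, and a plain float4 array stands in for myTexture so that the snippet compiles on its own. In CUDA, float4 is 16-byte aligned while float3 (and therefore MyStruct) is only 4-byte aligned, so the cast form may depend on where the compiler places theStruct. Whether either version costs extra registers is something only the compiler output can confirm.

```cuda
#include <cuda_runtime.h>

typedef struct tagMyStruct
{
	float3  X;
	float   Y;
} MyStruct;

// Hypothetical helper: unpack a fetched float4 member by member
// instead of writing through a reinterpret_cast of the struct's address.
__device__ __forceinline__ MyStruct UnpackMyStruct(float4 v)
{
	MyStruct s;
	s.X = make_float3(v.x, v.y, v.z);
	s.Y = v.w;
	return s;
}

// Hypothetical kernel. "data" is an assumption standing in for the texture;
// with a texture, the load would be tex1Dfetch(myTexture, index) instead.
__global__ void LoadKernel(const float4 *data, float *out, int n)
{
	int index = blockIdx.x * blockDim.x + threadIdx.x;
	if (index >= n) return;

	MyStruct theStruct = UnpackMyStruct(data[index]);

	// Use every member so the compiler cannot discard the load.
	out[index] = theStruct.X.x + theStruct.X.y + theStruct.X.z + theStruct.Y;
}
```

Compiling both variants and comparing the generated code would show whether the member-wise copy is any more expensive than the cast.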

If I then have a section of code that works as follows, will the compiler be able to “release” the registers used to store part X for reuse at the point where they are no longer needed?

bool bFlag = false;

//If this branch is not taken, the struct is not used again
if(SomeFunc1(theStruct.X))
{
	//From this point on, no matter which branches are taken, part X is no longer used

	//Perform some very long operation that uses so many registers it
	//spills to global memory. This operation will NOT use part X of
	//theStruct but MAY use part Y (including modifying it)
	bFlag = SomeFunc2(theStruct.Y);
}

I’d really appreciate any help or thoughts anyone has. I suspect the PTX file holds the answer to all my questions; unfortunately, I’ve yet to unlock its secrets.
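As a starting point for digging into the compiler output, here is a minimal sketch of a stand-alone test file together with the standard nvcc and cuobjdump invocations for inspecting it. The file name regtest.cu, the kernel TestKernel, and the bodies of SomeFunc1 and SomeFunc2 are placeholders invented for illustration, and a plain float4 array again stands in for the texture. Note that PTX uses an unlimited set of virtual registers, so register reuse is decided later by ptxas and is not visible in the .ptx file itself. The per-kernel register and spill counts come from the ptxas verbose output instead.

```cuda
// regtest.cu: hypothetical stand-alone file for inspecting register usage.
//
// Build commands:
//   nvcc -c -Xptxas -v regtest.cu
//       prints the register count and spill stores/loads for each kernel
//   nvcc -ptx regtest.cu
//       writes regtest.ptx (virtual registers only)
//   nvcc -cubin -o regtest.cubin regtest.cu
//   cuobjdump -sass regtest.cubin
//       dumps the final machine code with the physical registers
#include <cuda_runtime.h>

typedef struct tagMyStruct
{
	float3  X;
	float   Y;
} MyStruct;

// Placeholder bodies (assumptions): the real functions would be far larger.
__device__ bool SomeFunc1(float3 x) { return x.x + x.y + x.z > 0.0f; }
__device__ bool SomeFunc2(float &y) { y = y * 2.0f; return y > 1.0f; }

__global__ void TestKernel(const float4 *data, int *flags, int n)
{
	int index = blockIdx.x * blockDim.x + threadIdx.x;
	if (index >= n) return;

	float4 v = data[index];
	MyStruct theStruct;
	theStruct.X = make_float3(v.x, v.y, v.z);
	theStruct.Y = v.w;

	bool bFlag = false;
	if (SomeFunc1(theStruct.X))
	{
		// Part X is dead from here on; part Y may still be read and modified.
		bFlag = SomeFunc2(theStruct.Y);
	}
	flags[index] = bFlag ? 1 : 0;
}
```

With placeholder functions this small nothing will spill, so the numbers only become meaningful once the real SomeFunc1 and SomeFunc2 are dropped in.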