Shared memory with structs: no initialisation needed? Default constructor in struct with initialization

In my kernel I compute the maximum value of an array and its index, using shared memory variables.

Normally, reading a shared memory variable before initializing it gives wrong results, because we do not know what value is stored at the address the variable refers to.

However, when I use a struct as the shared memory variable and give it a default constructor, the results are not wrong. Does that make initializing it from a single thread redundant?

The original version with plain variables looks like this:

__global__ void getMax(float* inData_g)
{
	__shared__ float maxValue_s;
	__shared__ unsigned int maxIndex_s;

	// initialize shared memory from a single thread
	if(threadIdx.x == 0)
		maxValue_s = 0.0f;
	__syncthreads(); // make the initial value visible to every thread in the block

	float tempValue = 1.0f;
	unsigned int tempIndex = 5;

	// do some stuff here like parallel reduction loop

	if(threadIdx.x == 0)
	{
		/*
		 * we read maxValue_s here so it needs to be initialized before!
		 * else we would retrieve any value that currently is at that memory address
		 * and so get wrong results
		 */
		maxValue_s = fmaxf(maxValue_s, tempValue); // fmaxf is the portable CUDA replacement for __max
		if(maxValue_s == tempValue)
			maxIndex_s = tempIndex;
	}
}
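For reference, the elided "parallel reduction loop" could look roughly like the sketch below. It is not the code from this post: the kernel name `blockArgMax`, the `BLOCK_SIZE` constant, the output pointers, the host helper and the test data are all assumptions made for illustration, and error checking is left out to keep it short. It shows a variant in which every thread writes its own shared memory slot before anything is read, so no separate initialization step is needed.

```cuda
#include <cstdio>
#include <cfloat>
#include <cuda_runtime.h>

#define BLOCK_SIZE 256 // assumed block size; must be a power of two

// Hypothetical kernel: a single block finds the maximum of inData_g[0..n) and its index.
__global__ void blockArgMax(const float* inData_g, unsigned int n,
                            float* maxValue_g, unsigned int* maxIndex_g)
{
    __shared__ float        value_s[BLOCK_SIZE];
    __shared__ unsigned int index_s[BLOCK_SIZE];

    const unsigned int tid = threadIdx.x;

    // Each thread scans a strided slice and keeps its best candidate in registers.
    float bestValue = -FLT_MAX;
    unsigned int bestIndex = 0;
    for (unsigned int i = tid; i < n; i += blockDim.x) {
        if (inData_g[i] > bestValue) { bestValue = inData_g[i]; bestIndex = i; }
    }

    // Every thread writes its own slot, so no slot is ever read uninitialized.
    value_s[tid] = bestValue;
    index_s[tid] = bestIndex;
    __syncthreads();

    // Tree reduction: halve the number of active threads in each step.
    for (unsigned int stride = blockDim.x / 2; stride > 0; stride >>= 1) {
        if (tid < stride && value_s[tid + stride] > value_s[tid]) {
            value_s[tid] = value_s[tid + stride];
            index_s[tid] = index_s[tid + stride];
        }
        __syncthreads();
    }

    if (tid == 0) { *maxValue_g = value_s[0]; *maxIndex_g = index_s[0]; }
}

// Hypothetical host helper: copy in, launch one block, copy the result back.
void runBlockArgMax(const float* data_h, unsigned int n,
                    float* maxValue_h, unsigned int* maxIndex_h)
{
    float* data_g; float* maxValue_g; unsigned int* maxIndex_g;
    cudaMalloc(&data_g, n * sizeof(float));
    cudaMalloc(&maxValue_g, sizeof(float));
    cudaMalloc(&maxIndex_g, sizeof(unsigned int));
    cudaMemcpy(data_g, data_h, n * sizeof(float), cudaMemcpyHostToDevice);

    blockArgMax<<<1, BLOCK_SIZE>>>(data_g, n, maxValue_g, maxIndex_g);

    cudaMemcpy(maxValue_h, maxValue_g, sizeof(float), cudaMemcpyDeviceToHost);
    cudaMemcpy(maxIndex_h, maxIndex_g, sizeof(unsigned int), cudaMemcpyDeviceToHost);
    cudaFree(data_g); cudaFree(maxValue_g); cudaFree(maxIndex_g);
}

int main()
{
    const float data[] = {1.0f, 3.0f, 10.0f, 4.0f, 2.0f}; // made-up test data
    float maxValue; unsigned int maxIndex;
    runBlockArgMax(data, 5, &maxValue, &maxIndex);
    printf("Maximum found: [%u] = %.1f\n", maxIndex, maxValue);
    return 0;
}
```

With the made-up input above, this should print "Maximum found: [2] = 10.0" on a working CUDA device. The launch must use exactly `BLOCK_SIZE` threads, since the shared arrays are sized for that.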

I then changed the code to use a simple struct, without initializing the shared memory in the kernel:

struct __align__(16) KeyValuePairIntFloat
{
	unsigned int key;
	float value;

	__device__ KeyValuePairIntFloat()
	{
		this->key = 0;       // not even necessary, according to the tests below
		this->value = 0.0f;  // not even necessary, according to the tests below
	}
};

__global__ void getMax(float* inData_g)
{
	__shared__ KeyValuePairIntFloat keyValuePairMaxIndexValue;

	float tempValue = 1.0f;
	unsigned int tempIndex = 5;

	// do some stuff here like parallel reduction loop

	if(threadIdx.x == 0)
	{
		/*
		 * we can use keyValuePairMaxIndexValue.value here!
		 */
		keyValuePairMaxIndexValue.value = fmaxf(keyValuePairMaxIndexValue.value, tempValue); // fmaxf is the portable CUDA replacement for __max
		if(keyValuePairMaxIndexValue.value == tempValue)
			keyValuePairMaxIndexValue.key = tempIndex;
	}
}

Output format: “Maximum found: [index] = value”

Correct output: “Maximum found: [2] = 10.0”

  1. Single variables, initialized by a single thread: “Maximum found: [2] = 10.0”

  2. Single variables, not initialized: “Maximum found: [1149771776] = 1117.0”

  3. Struct, initialized by its default constructor: “Maximum found: [2] = 10.0”

  4. Struct, not initialized by its default constructor: “Maximum found: [2] = 10.0”

  5. Struct, not initialized by its default constructor but initialized by a single thread (redundant?): “Maximum found: [2] = 10.0”

Why can a shared memory struct be initialized automatically while plain variables cannot?

Bear with me if I misunderstood your question, but isn’t it simply that your struct members are being initialized in the struct’s constructor? Even though you are not setting the values explicitly in your kernel code, they are still initialized by the constructor. That is to say, whenever memory is allocated for your struct, the applicable constructor is called in one form or another.

Yes, the struct members are being initialized in my struct’s constructor.

But the variable is declared as shared, and declaring and initializing a shared variable in a single statement is not possible in CUDA:

__shared__ float value = 1.0f; // not allowed: a __shared__ variable cannot be initialized in its declaration

As far as I know, you have to initialize the values manually.

Why then is it possible with structs?

__shared__ float value;
value = 1.0f; // now all threads in a block initialize the shared memory!

// And so the usual way to go is:
if(threadIdx.x == 0)
	value = 1.0f;
__syncthreads();
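The same single-thread pattern can be applied to a struct in shared memory, which avoids depending on whether a constructor runs at all. The following is only a sketch under assumptions: the kernel name `getMaxExplicitInit`, the host helper, the block size of 32 and the test data are hypothetical, the struct is written as a plain aggregate without a constructor, and the serial merge loop is chosen for brevity rather than speed. Error checking is omitted.

```cuda
#include <cstdio>
#include <cfloat>
#include <cuda_runtime.h>

// Plain aggregate without a constructor, so nothing is implied about initialization.
struct KeyValuePairIntFloat
{
    unsigned int key;
    float value;
};

// Hypothetical kernel: one block finds the maximum of inData_g[0..n) and its index.
__global__ void getMaxExplicitInit(const float* inData_g, unsigned int n,
                                   KeyValuePairIntFloat* result_g)
{
    __shared__ KeyValuePairIntFloat best_s;

    // Explicit initialization by one thread, then a barrier so all threads see it.
    if (threadIdx.x == 0) {
        best_s.key = 0;
        best_s.value = -FLT_MAX;
    }
    __syncthreads();

    // Each thread finds its own candidate over a strided slice of the input.
    float bestValue = -FLT_MAX;
    unsigned int bestIndex = 0;
    for (unsigned int i = threadIdx.x; i < n; i += blockDim.x) {
        if (inData_g[i] > bestValue) { bestValue = inData_g[i]; bestIndex = i; }
    }

    // Threads merge their candidates one at a time (simple but serial).
    for (unsigned int t = 0; t < blockDim.x; ++t) {
        if (threadIdx.x == t && bestValue > best_s.value) {
            best_s.value = bestValue;
            best_s.key = bestIndex;
        }
        __syncthreads();
    }

    if (threadIdx.x == 0) *result_g = best_s;
}

// Hypothetical host helper: copy in, launch one block of 32 threads, copy back.
KeyValuePairIntFloat runGetMax(const float* data_h, unsigned int n)
{
    float* data_g; KeyValuePairIntFloat* result_g;
    cudaMalloc(&data_g, n * sizeof(float));
    cudaMalloc(&result_g, sizeof(KeyValuePairIntFloat));
    cudaMemcpy(data_g, data_h, n * sizeof(float), cudaMemcpyHostToDevice);

    getMaxExplicitInit<<<1, 32>>>(data_g, n, result_g);

    KeyValuePairIntFloat result_h;
    cudaMemcpy(&result_h, result_g, sizeof(result_h), cudaMemcpyDeviceToHost);
    cudaFree(data_g); cudaFree(result_g);
    return result_h;
}

int main()
{
    const float data[] = {1.0f, 3.0f, 10.0f, 4.0f, 2.0f}; // made-up test data
    KeyValuePairIntFloat r = runGetMax(data, 5);
    printf("Maximum found: [%u] = %.1f\n", r.key, r.value);
    return 0;
}
```

Initializing with `-FLT_MAX` rather than `0.0f` is a design choice in this sketch so that inputs containing only negative values still give a meaningful maximum.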