"unaligned memory accesses not supported"

Hi everyone.

I’m having trouble compiling a program I wrote. I’m very close to getting it working, but the compiler gives me:

Assertion failure at line 548 of …/…/be/cg/NVISA/exp_loadstore.cxx:

Compiler Error in file /tmp/tmpxft_00000640_00000000-1.i during Code_Expansion phase:

unaligned memory accesses not supported

This happens when I run ‘make’. Here is my kernel code:

__global__ void
compareHash(testValue* gTestValues, int* testValuesLoc, hash* gHashTable, int* hashLoc, int* hashSize, int* hashEnd)
{
    const unsigned int tid = threadIdx.x;
    const unsigned int bid = blockIdx.x;
    const unsigned int num_threads = blockDim.x;
    const unsigned int mem_size = HASH_MEM_SIZE / sizeof(unsigned long long);

    unsigned long long* sHashTable = (unsigned long long*)sharedMem;
    bool* found = (bool*)&sHashTable[mem_size];
    testValue* currentValue = (testValue*)&found[1];

    if (tid == 0)
    {
        currentValue->value = gTestValues[testValuesLoc[0]].value;
        currentValue->stopLoc = gTestValues[testValuesLoc[0]--].stopLoc;
    }
    __syncthreads();

    while (testValuesLoc[0] >= BLOCKS && hashEnd[0] <= hashSize[0])
    {
        // copy table from global to shared mem
        for (int x = tid; x < mem_size; x += num_threads)
        {
            if (hashLoc[0] * mem_size + x > hashEnd[0]) sHashTable[x] = 0;
            else sHashTable[x] = gHashTable[hashLoc[0] * mem_size + x].value;
        }

        // run comparisons
        if (tid == 0) *found = false;
        __syncthreads();

        for (int x = tid; x < mem_size && !*found; x += num_threads)
        {
            if (sHashTable[x] == currentValue->value)
            {
                *found = true;
                gHashTable[hashLoc[0] * mem_size + x].count++;
                currentValue->stopLoc = gTestValues[testValuesLoc[0]].stopLoc;
                currentValue->value = gTestValues[testValuesLoc[0]--].value;
            }
        }
        __syncthreads();

        if (tid == 0 && *hashLoc == currentValue->stopLoc)
        {
            *found = true;
            gHashTable[hashEnd[0]].value = currentValue->value;
            gHashTable[hashEnd[0]++].count = 0;
            currentValue->stopLoc = hashLoc[0];
            currentValue->value = gTestValues[testValuesLoc[0]--].value;
        }

        if (tid == 0 && bid == 0)
        {
            hashLoc[0] += mem_size;
            if (hashLoc[0] >= hashEnd[0]) hashLoc[0] = 0;
        }
        __syncthreads();
    }
}

Also, if I’m doing something stupid in the code, let me know. I’m pretty new at this whole thing.

Hmm, I guess you can disregard that then. I changed currentValue (the testValue) from shared memory to a local variable and it compiled.