From what I gather reading the manual, a “mis ld” type of error from the cuda memory checker logs indicates access of memory that is not word aligned. Is this correct? For reference, here is a sample output of the error:
Summary of access violations:
================================================================================
Memory Checker detected 1 access violations.
error = misaligned load (global memory)
gridid = 281511
blockIdx = {2489,0,0}
threadIdx = {0,0,0}
address = 0x1312c3a548
accessSize = 8
================================================================================
CUDA Memory Checker detected 1 threads caused an access violation:
Launch Parameters
CUcontext = 02fcc720
CUstream = 02fc0170
CUmodule = 06fbc390
CUfunction = 070a2d20
FunctionName = _Z25cudaColumnMeanPowerKernelPfP7ComplexIfEjj
GridId = 375246
gridDim = {5880,1,1}
blockDim = {1,1,1}
sharedSize = 256
Parameters:
Parameters (raw):
0x20e00000 0x00000013 0x09e00000 0x00000013
0x00002000 0x000016f8
GPU State:
Address Size Type Mem Block Thread blockIdx threadIdx PC Source
---------------------------------------------------------------------------------------------------------------------------------------------
130e5d7a10 8 mis ld g 2634 0 {2634,0,0} {0,0,0} _Z25cudaColumnMeanPowerKernelPfP7ComplexIfEjj+0000d8
Furthermore, I assume turning on “line numbers” when compiling should show me the offending source line. Is this correct?
The offending code kernel is this:
__global__ void cudaColumnMeanPowerKernel(FloatData* __restrict__ pOutputVec,
ComplexData* __restrict__ pInputMat,
UInt32 numRows,
UInt32 numCols)
{
FloatData sum = 0.0;
const UInt32 col = blockIdx.x;
for (UInt32 k = 0; k < numRows; ++k)
{
const UInt32 kk = k * numCols + col;
sum += (pInputMat[kk].real * pInputMat[kk].real) + (pInputMat[kk].imag * pInputMat[kk].imag);
}
pOutputVec[col] = sum / numRows;
}
I am thinking the problem is with the access to ComplexData which is 2 32-bit floats lined up for a total of 8 bytes. Does this data structure need to be aligned at an 8byte boundary?