Strange memory error

Hi,

I am quite new to CUDA, but until now my programs did what they were supposed to do.
Now I changed one small part inside my .cu file, and suddenly it sometimes runs normally and sometimes crashes MATLAB. I am running my program with CUDA 7.0 and Visual Studio 2013, and I am using MATLAB R2015b with the mexcuda function.
To debug the CUDA parts of my code, I compile with: mexcuda -g -G … .cpp-file … .cu-file

The memcheck error looks like this (I can’t really read much out of it):

CUDA module loaded: 1df41a520 cuModuleLoadDataEx
CUDA module loaded: 1df41a620 C:/Users/ab40/Documents/IterativeRekonstruktion/IterativeRekonstruktion/QuellCodeCuda_Rekonstruktion_OSSART_rotate.cu

CUDA Memory Checker detected 1 threads caused an access violation:
Launch Parameters
CUcontext = 96319d60
CUstream = 976669f0
CUmodule = 1df41a620
CUfunction = 1dfadfda0
FunctionName = Z4sartPjS_S_PfS0_PiS1_S0_S_S0_S_S0_S0_S0_S0_S0_S0
GridId = 614
gridDim = {1,1,1}
blockDim = {424,1,1}
sharedSize = 256
Parameters:
nnz = 0x0000000800d60200 428
rowCount = 0x0000000800d60600 16
cooRows = 0x0000000800c60000 0
cooCol = 0x0000000800c60800 0
measuredValues = 0x0000000800d60000 104.77402
ergSumCol = 0x0000000800d60a00 2.0954802
colCount = 0x0000000800d60400 424
volume_alt = 0x0000000800c61800 8.0541649
volume_neu = 0x0000000800c62000 8.0541649
cooValues = 0x0000000800c61000 0.065449849
colIndex = 0x0000000800c62800 0
valIndex = 0x0000000800c63000 0.065449849
colIndexStepSize = 0x0000000800c63800 0
sumNNZforCol = 0x0000000800c64000 4
ergSumRow = 0x0000000800c64800 0.39269906
ergMult = 0x0000000800d60c00 6.2467547
faktor = 0x0000000800c65000 3.162863
Parameters (raw):
0x00d60200 0x00000008 0x00d60400 0x00000008
0x00d60600 0x00000008 0x00c61800 0x00000008
0x00c62000 0x00000008 0x00c60000 0x00000008
0x00c60800 0x00000008 0x00c61000 0x00000008
0x00c62800 0x00000008 0x00c63000 0x00000008
0x00c63800 0x00000008 0x00c64000 0x00000008
0x00d60000 0x00000008 0x00d60a00 0x00000008
0x00c64800 0x00000008 0x00d60c00 0x00000008
0x00c65000 0x00000008
GPU State:
Address Size Type Mem Block Thread blockIdx threadIdx PC Source

800c62eb0 4 adr ld g 0 422 {0,0,0} {422,0,0} Z14faktorRechnungPfPjS_S_S0_S_S0_S_S0_S+000560 c:\users\ab40\documents\iterativerekonstruktion\iterativerekonstruktion\quellcodecuda_rekonstruktion_ossart_rotate.cu:115

Summary of access violations:
c:\users\ab40\documents\iterativerekonstruktion\iterativerekonstruktion\quellcodecuda_rekonstruktion_ossart_rotate.cu(115): error MemoryChecker: #misaligned=0 #invalidAddress=2

Memory Checker detected 1 access violations.
error = access violation on load (global memory)
gridid = 614
blockIdx = {0,0,0}
threadIdx = {422,0,0}
address = 0x800c62eb0
accessSize = 4

What does this tell me? Any help is very welcome. If you need the .cu code to tell me more, let me know; it’s quite long.

Best regards, coorhp

From your description, it appears that this small change introduced a bug: the kernel now makes out-of-bounds memory accesses. Since, according to your description, the change was localized, it should be easy to determine what is wrong by inspection. You could also try compiling with -lineinfo, which should allow the memory checker to pinpoint the source code line with the out-of-bounds access.
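
One way to read the memcheck report, sketched under assumptions: the faulting address 0x800c62eb0 can be compared with the pointer values listed under "Parameters". The closest base at or below it is colIndex (0x800c62800). The offset is 0x6b0 = 1712 bytes, which at the reported access size of 4 bytes is element 428. The first value behind nnz is printed as 428, so if that buffer holds nnz elements (an assumption, since the report does not show allocation sizes), the load is exactly one element past the end. The names Buffer, Hit, locate and reportedBuffers below are made up for illustration, and the faulting function faktorRechnung may receive this pointer under a different parameter name.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// One pointer parameter as printed in the "Parameters" section of the report.
struct Buffer {
    std::string name;    // parameter name from the report
    std::uint64_t base;  // device pointer value from the report
};

// Result of matching a faulting address against the known base addresses.
struct Hit {
    std::string name;            // closest buffer at or below the address ("" if none)
    std::uint64_t byteOffset;    // address - base
    std::uint64_t elementIndex;  // byteOffset / accessSize
};

// Hypothetical helper: pick the buffer with the highest base that is still
// at or below the faulting address, then convert the offset to an index.
Hit locate(const std::vector<Buffer>& buffers, std::uint64_t address,
           std::uint64_t accessSize) {
    assert(accessSize > 0);
    const Buffer* best = nullptr;
    for (const Buffer& b : buffers) {
        if (b.base <= address && (best == nullptr || b.base > best->base)) {
            best = &b;
        }
    }
    if (best == nullptr) {
        return {"", 0, 0};
    }
    const std::uint64_t offset = address - best->base;
    return {best->name, offset, offset / accessSize};
}

// Pointer values copied from the launch parameters in the report.
std::vector<Buffer> reportedBuffers() {
    return {
        {"nnz", 0x800d60200ULL},            {"rowCount", 0x800d60600ULL},
        {"cooRows", 0x800c60000ULL},        {"cooCol", 0x800c60800ULL},
        {"measuredValues", 0x800d60000ULL}, {"ergSumCol", 0x800d60a00ULL},
        {"colCount", 0x800d60400ULL},       {"volume_alt", 0x800c61800ULL},
        {"volume_neu", 0x800c62000ULL},     {"cooValues", 0x800c61000ULL},
        {"colIndex", 0x800c62800ULL},       {"valIndex", 0x800c63000ULL},
        {"colIndexStepSize", 0x800c63800ULL},
        {"sumNNZforCol", 0x800c64000ULL},   {"ergSumRow", 0x800c64800ULL},
        {"ergMult", 0x800d60c00ULL},        {"faktor", 0x800c65000ULL},
    };
}
```

Calling locate(reportedBuffers(), 0x800c62eb0ULL, 4) matches the address to colIndex with a byte offset of 1712 and an element index of 428. This only narrows down which array and which index to look at; the actual cause still has to be found in the code around line 115.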

Thank you so much for your help!
I found the bug, quite simple and really stupid -

I was going through a loop and adding values to elements of an array, BUT I did not initialise the array to zero at the beginning of every loop, so the values were too big when it finished…
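
A minimal sketch of this kind of mistake, written as plain host C++ because the original kernel is not shown in the thread. The function names runPasses and isValidIndex are hypothetical, and it is an assumption that the accumulated values were later used as array indices, which would explain the out-of-bounds load reported by the memory checker.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the loop described above: every pass adds the
// same increments into an accumulator array.
// resetEachPass == false reproduces the mistake (values keep growing).
std::vector<unsigned> runPasses(const std::vector<unsigned>& increments,
                                int passes, bool resetEachPass) {
    assert(passes >= 0);
    std::vector<unsigned> acc(increments.size(), 0u);
    for (int pass = 0; pass < passes; ++pass) {
        if (resetEachPass) {
            // The fix: start every pass from zero.
            std::fill(acc.begin(), acc.end(), 0u);
        }
        for (std::size_t i = 0; i < acc.size(); ++i) {
            acc[i] += increments[i];
        }
    }
    return acc;
}

// Defensive check before using an accumulated value as an index
// into an array with `size` elements.
bool isValidIndex(unsigned value, std::size_t size) {
    return value < size;
}
```

With increments {1, 2, 3} and three passes, resetting gives {1, 2, 3}, while the version without the reset gives {3, 6, 9}. If such a value then indexes a 4-element array, 9 is out of range. In device code the equivalent reset could be an assignment at the start of the loop body, or a cudaMemset on the buffer before each launch, depending on where the array lives.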

Like I said, quite new to CUDA…

Once again, thanks a lot!