Hi,
I am running nvvp under Xubuntu 12.04 with a GTX TITAN card and CUDA 5 installed. I am currently following the Udacity course and writing some CUDA programs locally.
Using nvvp, I launched “Analyze All” to run every analysis. Most of them work fine except the memory analysis, where I get nonsensical results:
Global Load Memory 0%
Global Load Memory n/a
Local Memory Overhead 0%
DRAM Utilization 171937.5% !!
The code is quite simple and uses a lot of global memory:
extern __shared__ unsigned int locBin[];
int myId = blockDim.x*blockIdx.x*nb;
int bin,x;
locBin[threadIdx.x]=0;
__syncthreads();
for(int i=0;i<nb;++i) {
x=threadIdx.x+i*blockDim.x+myId;
bin=(unsigned int)(numB/h_max*d_in[x]);
atomicAdd(&(locBin[bin]),1);
}
__syncthreads();
atomicAdd(&(d_his[threadIdx.x]),locBin[threadIdx.x]);
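In case it helps to reproduce this, here is a minimal self-contained sketch of the same kind of kernel with a host launcher. It is not my exact code: the names `histoKernel`, `histogramOnGpu` and `n`, the `float` input type, the bounds check, and the bin formula are assumptions made for illustration. It also assumes one bin per thread (`numB == blockDim.x`), which is what the final `atomicAdd` on `d_his[threadIdx.x]` relies on. Error checking of the CUDA calls is left out for brevity.

```cuda
#include <cstddef>
#include <vector>
#include <cuda_runtime.h>

// Sketch only. Assumes numB == blockDim.x (one bin per thread) and
// input values in [0, h_max). Each block handles nb * blockDim.x elements.
__global__ void histoKernel(const float* d_in, unsigned int* d_his,
                            int n, int nb, int numB, float h_max)
{
    extern __shared__ unsigned int locBin[];  // per-block histogram, numB counters

    int myId = blockDim.x * blockIdx.x * nb;  // first element owned by this block
    locBin[threadIdx.x] = 0;
    __syncthreads();

    for (int i = 0; i < nb; ++i) {
        int x = threadIdx.x + i * blockDim.x + myId;
        if (x < n) {                           // guard the tail of the input
            int bin = (int)(numB * (d_in[x] / h_max));
            bin = min(max(bin, 0), numB - 1);  // clamp so that h_max itself stays in range
            atomicAdd(&locBin[bin], 1u);       // shared-memory atomic
        }
    }
    __syncthreads();

    // Merge the per-block histogram into the global one.
    atomicAdd(&d_his[threadIdx.x], locBin[threadIdx.x]);
}

// Hypothetical host wrapper: copies the input, launches the kernel,
// and returns the histogram.
std::vector<unsigned int> histogramOnGpu(const std::vector<float>& h_in,
                                         int numB, int nb, float h_max)
{
    int n = (int)h_in.size();
    int threads = numB;                        // one thread per bin
    int perBlock = threads * nb;
    int blocks = (n + perBlock - 1) / perBlock;

    float* d_in = NULL;
    unsigned int* d_his = NULL;
    cudaMalloc(&d_in, n * sizeof(float));
    cudaMalloc(&d_his, numB * sizeof(unsigned int));
    cudaMemcpy(d_in, h_in.data(), n * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemset(d_his, 0, numB * sizeof(unsigned int));

    // Third launch parameter: dynamic shared memory for locBin.
    histoKernel<<<blocks, threads, threads * sizeof(unsigned int)>>>(
        d_in, d_his, n, nb, numB, h_max);
    cudaDeviceSynchronize();

    std::vector<unsigned int> h_his(numB);
    cudaMemcpy(h_his.data(), d_his, numB * sizeof(unsigned int),
               cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_his);
    return h_his;
}
```

Calling `histogramOnGpu` from a small host program with a non-empty input and `numB` no larger than the maximum block size should give a kernel that reads every input element from global memory once, so a global load figure of 0% would not be expected for it.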
The memory analysis results above are clearly wrong. Is there a way to get nvvp to report correct values?
Many thanks in advance.
Pierre.
Note:
NVIDIA Driver version: 319.23
Operating System: Linux-x86_64