Hello friends!
In the matrixMul example of CUDA version 3.1.x, there is this code:
// kernel warmup
matrixMul<<<grid, threads>>>(d_C, d_A, d_B, uiWA, uiWB);
cudaThreadSynchronize();

// create and start timer
shrLog("Run kernels...\n\n");
unsigned int timer = 0;
cutilCheckError(cutCreateTimer(&timer));
cutilCheckError(cutStartTimer(timer));

// execute the kernel
int nInter = 30;
for (int j = 0; j < nInter; j++)
{
    matrixMul<<<grid, threads>>>(d_C, d_A, d_B, uiWA, uiWB);
}
I would like to know what the for loop is for, and what "kernel warmup" means. Also, why is the height of matrix A not included in the kernel arguments?
Looking forward to your answers, thank you!
zcloudz