I have a program using CUDA under Visual Studio 2005. The problem is the following. I have a kernel function that does some complex math. It is about 100 lines long. Everything works fine until I add a definition of one particular variable. After that the kernel starts to behave strangely. There are no errors; the program still runs and doesn't crash. However, there is no longer any output or result of computation from this kernel function. It looks as if the kernel stops at the very beginning. I should mention that adding this variable in emulation mode does not lead to any unexpected results.
It's also strange that the added variable is never used; it is just declared and that's all! But with its declaration the kernel no longer works properly. The code file is attached.
The beginning of the kernel is:
__global__ void FluidElem(CUDA_ARRAY* cuda)
const int i = blockIdx.x*blockDim.x + threadIdx.x;
if(i >= cuda->elem_size)
const FMatrix divu = 0;
const ANISO N = cur_quad->m_Nshear, A11 = cur_quad->H - N, A12 = cur_quad->C, A22 = cur_quad->M;
const FMatrix tau_div = divu*(A11 - N);
const FMatrix p = divu*A12 + divu*A22;
for(int k=0; k<=N_deg; k++)
for(int l=0; l<=N_deg; l++)
if(cur_quad->m_info(k,l).bd_type & (1<<POINTINFO:S))
…more different math computing
So the mystery variable is p. If I comment it out:
//const FMatrix p = divu*A12 + divu*A22;
everything immediately works properly; otherwise the kernel does nothing. p is not used in the kernel, it's just declared.
I have no idea what the problem is or how to figure it out.
The system is Windows Server 2003 x64.
The video card is a Quadro FX 4600. The driver version is 169.61.
Thanks in advance. Biot.txt (3.92 KB)
Strange, because the compiler will likely remove the definition of p since it is not used afterwards in the kernel, so it should generate exactly the same code. Generate PTX code for both cases and see if there is any difference.
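One way to set up that comparison is sketched below. It is only a stand-in, not the attached code: FMatrix, ANISO and cur_quad are not shown in the post, so plain floats are used, and the names FluidElemSketch, WITH_P, run_demo and kernel.cu are all hypothetical. The unused declaration is wrapped in a macro so the same file can be compiled to PTX twice, and the launch is followed by explicit error checks, since a kernel that fails to launch produces no output and does not crash the program.

```cuda
// Hypothetical file kernel.cu. Build the two PTX variants and compare them:
//   nvcc -ptx kernel.cu -o without_p.ptx
//   nvcc -ptx -DWITH_P kernel.cu -o with_p.ptx
#include <cstdio>
#include <cuda_runtime.h>

// Simplified stand-in for the kernel in the post (assumption: float data).
__global__ void FluidElemSketch(const float* a12, const float* a22,
                                float* out, int n)
{
    const int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n)
        return;

    const float divu = 0.0f;
#ifdef WITH_P
    // The "mystery" declaration: computed but never used afterwards.
    const float p = divu * a12[i] + divu * a22[i];
#endif
    out[i] = divu + 1.0f;
}

// Returns 0 if allocation, launch and execution all report success.
int run_demo()
{
    const int n = 256;
    float *a12 = 0, *a22 = 0, *out = 0;

    cudaError_t err = cudaMalloc(&a12, n * sizeof(float));
    if (err == cudaSuccess) err = cudaMalloc(&a22, n * sizeof(float));
    if (err == cudaSuccess) err = cudaMalloc(&out, n * sizeof(float));
    if (err == cudaSuccess) err = cudaMemset(a12, 0, n * sizeof(float));
    if (err == cudaSuccess) err = cudaMemset(a22, 0, n * sizeof(float));

    if (err == cudaSuccess) {
        FluidElemSketch<<<(n + 127) / 128, 128>>>(a12, a22, out, n);
        // Errors detected at launch time are reported here.
        err = cudaGetLastError();
    }
    if (err == cudaSuccess) {
        // Errors during execution show up after synchronizing
        // (older toolkits use cudaThreadSynchronize instead).
        err = cudaDeviceSynchronize();
    }
    if (err != cudaSuccess)
        printf("CUDA error: %s\n", cudaGetErrorString(err));

    cudaFree(a12);
    cudaFree(a22);
    cudaFree(out);
    return err == cudaSuccess ? 0 : 1;
}

int main()
{
    return run_demo();
}
```

If the two PTX files are identical, the declaration itself is probably not the cause. If they differ, compiling with `--ptxas-options=-v` prints the register and memory usage of each variant, which is one thing worth comparing.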