Need some help debugging a CUDA program

Hi, I am new to CUDA programming and have run into a problem with a program I am writing. I implemented an FDTD algorithm with a perfectly matched layer (PML), but the resulting Ez field comes out as an all-zero matrix, and I have no idea why.
My basic approach is to initialize the matrices on the host (stored with an h_ prefix), copy them to the corresponding d_ matrices in global memory, and run the iterations on the device. I suspect something has gone wrong in the kernel, but it is quite difficult to see what is going on inside it. Shoot!
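For context, here is a minimal sketch of the host-to-device round trip described above, with error checks added after each step. It is not taken from the attached code: the macro CUDA_CHECK, the kernel touch_ez, and the function run_demo are hypothetical names used only for illustration, and the kernel is a placeholder rather than the real FDTD update. As far as I understand, a kernel that fails to launch does not report anything by itself, so the device array can come back unchanged (all zeros) unless cudaGetLastError and cudaDeviceSynchronize are checked.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper: print the CUDA error string and abort if a call fails.
#define CUDA_CHECK(call)                                           \
    do {                                                           \
        cudaError_t err_ = (call);                                 \
        if (err_ != cudaSuccess) {                                 \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,     \
                    cudaGetErrorString(err_));                     \
            exit(EXIT_FAILURE);                                    \
        }                                                          \
    } while (0)

// Placeholder kernel (NOT the real FDTD update): adds 1 to every cell.
__global__ void touch_ez(float *ez, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        ez[i] += 1.0f;
    }
}

// Runs the round trip and returns the number of cells the kernel did not update.
int run_demo(void)
{
    const int n = 1024;
    const size_t bytes = n * sizeof(float);
    std::vector<float> h_ez(n, 0.0f);   // host copy, initialized to zero
    float *d_ez = nullptr;              // device copy in global memory

    CUDA_CHECK(cudaMalloc((void **)&d_ez, bytes));
    CUDA_CHECK(cudaMemcpy(d_ez, h_ez.data(), bytes, cudaMemcpyHostToDevice));

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    touch_ez<<<blocks, threads>>>(d_ez, n);
    CUDA_CHECK(cudaGetLastError());        // reports launch-time errors
    CUDA_CHECK(cudaDeviceSynchronize());   // reports errors raised while the kernel ran

    CUDA_CHECK(cudaMemcpy(h_ez.data(), d_ez, bytes, cudaMemcpyDeviceToHost));
    CUDA_CHECK(cudaFree(d_ez));

    int not_updated = 0;
    for (int i = 0; i < n; ++i) {
        if (h_ez[i] != 1.0f) {
            ++not_updated;
        }
    }
    return not_updated;
}

int main(void)
{
    int not_updated = run_demo();
    printf("cells not updated: %d\n", not_updated);
    return not_updated != 0;
}
```

If the same checks were wrapped around the real launch, an invalid grid or block size or an out-of-bounds access should show up as an error message instead of a silent zero result.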
I would really appreciate help pointing out where the problem lies. Thanks in advance.
The source code is attached.
cudaprogram.txt (11.5 KB)
cudaprogram.doc (63.5 KB)