Problem with matrix-vector multiplication

I have a function that calculates a matrix-vector multiplication.
It works in some cases, but in others the results are wrong. The wrong results show up on the second and fifth calls to the kernel function.
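
To narrow down which calls go wrong, the kernel output can be compared against a plain host-side reference after each call. The sketch below is not taken from imdct.cu: the names `matvec_ref` and `first_mismatch` are hypothetical, and it assumes a row-major `float` matrix, which may not match the attached code.

```c
#include <math.h>
#include <stddef.h>

/* Hypothetical host-side reference: y = A * x.
   Assumes A is stored row-major with dimensions rows x cols. */
static void matvec_ref(const float *A, const float *x, float *y,
                       size_t rows, size_t cols)
{
    for (size_t r = 0; r < rows; ++r) {
        float acc = 0.0f;
        for (size_t c = 0; c < cols; ++c)
            acc += A[r * cols + c] * x[c];
        y[r] = acc;
    }
}

/* Returns the index of the first element whose difference exceeds tol
   (NaN counts as a mismatch), or -1 if every element matches. */
static long first_mismatch(const float *got, const float *want,
                           size_t n, float tol)
{
    for (size_t i = 0; i < n; ++i)
        if (!(fabsf(got[i] - want[i]) <= tol))
            return (long)i;
    return -1;
}
```

Running this after copying each result back to the host would show whether the second and fifth calls fail at the same indices, which could hint at an indexing or uninitialized-memory problem rather than a rounding difference.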

What could be wrong?
Has anyone ever had the same problem?

Thanks in advance.
imdct.cu (5.48 KB)