Unexpected kernel execution results on GTX470

Hi everyone!

I’m using CUDA 4.0 RC with the FW270.35 driver under Ubuntu 10.04 x64 and I get strange results from my matrix multiplication kernel. My system has two GPUs, a GTX470 and a GTX260. When I run the kernel on the GTX470, the output contains NaN or other weird numbers, but the same kernel with the same data runs correctly on the GTX260. Could someone tell me what makes such a big difference? Or is it a known issue?

Thanks.

Are you sure the kernel executes at all? Do you check the error codes returned?

I checked with different CUDA SDKs (3.2 and 4.0 RC), different drivers (260.19 and 270.35), and different versions of Ubuntu (10.04 and 10.10). It seems the problem is with the “-arch=” and “-code=” options: when they are set to “-arch=compute_20” and “-code=sm_20”, everything is fine; otherwise (e.g. with -arch=compute_13 -code=sm_13) I get strange computation results.
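A likely explanation for the flag dependence, offered as a hedged reading rather than a confirmed diagnosis: -arch=compute_13 -code=sm_13 embeds only an sm_13 binary and no PTX, so a compute capability 2.0 card such as the GTX470 has nothing it can load or JIT-compile, while the GTX260 (compute capability 1.3) runs the binary directly. A minimal sketch for checking which compute capability each device reports, using only standard runtime API calls:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaGetDeviceCount: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Print the name and compute capability of every visible device.
    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        err = cudaGetDeviceProperties(&prop, dev);
        if (err != cudaSuccess) {
            fprintf(stderr, "cudaGetDeviceProperties(%d): %s\n",
                    dev, cudaGetErrorString(err));
            return 1;
        }
        printf("device %d: %s, compute capability %d.%d\n",
               dev, prop.name, prop.major, prop.minor);
    }
    return 0;
}
```

If the reported capabilities differ between the two cards, the binary has to contain code (or PTX) that each of them can run; whether that is the whole story here can only be confirmed by checking the error codes.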

Yes, but do you check the return codes from the individual CUDA function calls? If the kernel is aborting due to some problem (which will cause weird results because the output will not be fully written), one of the CUDA function calls will return a value that is not cudaSuccess, and cudaGetErrorString() will let you convert that error code into a description of the problem.
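The check described above can be sketched roughly as follows. This is a minimal illustration, not the original poster's code: the CUDA_CHECK macro, the matMul kernel, and the matrix size are all hypothetical names and values chosen for the example. cudaDeviceSynchronize() assumes CUDA 4.0 or later; on 3.2 the equivalent call is cudaThreadSynchronize().

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: abort with a readable message on any CUDA error.
#define CUDA_CHECK(call)                                              \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,        \
                    cudaGetErrorString(err_));                        \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

// Hypothetical naive kernel: C = A * B for square n x n matrices.
__global__ void matMul(const float *a, const float *b, float *c, int n) {
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    if (row < n && col < n) {
        float sum = 0.0f;
        for (int k = 0; k < n; ++k)
            sum += a[row * n + k] * b[k * n + col];
        c[row * n + col] = sum;
    }
}

int main() {
    const int n = 64;  // illustrative size
    const size_t bytes = n * n * sizeof(float);

    float *hA = (float *)malloc(bytes);
    float *hB = (float *)malloc(bytes);
    float *hC = (float *)malloc(bytes);
    for (int i = 0; i < n * n; ++i) { hA[i] = 1.0f; hB[i] = 2.0f; }

    float *dA, *dB, *dC;
    CUDA_CHECK(cudaSetDevice(0));
    CUDA_CHECK(cudaMalloc((void **)&dA, bytes));
    CUDA_CHECK(cudaMalloc((void **)&dB, bytes));
    CUDA_CHECK(cudaMalloc((void **)&dC, bytes));
    CUDA_CHECK(cudaMemcpy(dA, hA, bytes, cudaMemcpyHostToDevice));
    CUDA_CHECK(cudaMemcpy(dB, hB, bytes, cudaMemcpyHostToDevice));

    dim3 block(16, 16);
    dim3 grid((n + block.x - 1) / block.x, (n + block.y - 1) / block.y);
    matMul<<<grid, block>>>(dA, dB, dC, n);

    // A kernel launch returns nothing, so query the error state explicitly.
    CUDA_CHECK(cudaGetLastError());       // launch failures
    CUDA_CHECK(cudaDeviceSynchronize());  // failures during execution

    CUDA_CHECK(cudaMemcpy(hC, dC, bytes, cudaMemcpyDeviceToHost));
    printf("c[0] = %f\n", hC[0]);

    CUDA_CHECK(cudaFree(dA));
    CUDA_CHECK(cudaFree(dB));
    CUDA_CHECK(cudaFree(dC));
    free(hA); free(hB); free(hC);
    return 0;
}
```

Checking cudaGetLastError() right after the launch matters because a kernel that never starts leaves the output buffer untouched, and the later cudaMemcpy then copies back whatever was in that memory.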