Tesla K20Xm slower than Tesla M2050

Hi,

I have a MATLAB (2011) program and a compiled CUDA program that I am comparing on a ‘Tesla M2050’ and a ‘Tesla K20Xm’.
The K20Xm is much slower.

ECC is enabled on both cards.
I tried to compile the CUDA code with -arch=sm_35, but I got the error: ‘Error using handleKernelArgs’

How can I check this? Which parameters should I look at to find the cause of the slowness?

Thanks
(I am new to this area)

I tried to compile the CUDA code with -arch=sm_35, but I got the error: ‘Error using handleKernelArgs’

You should get that fixed. Are you using a MEX file to interface with MATLAB?
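
In the meantime, it may help to confirm what each card reports to the CUDA runtime. Below is a minimal sketch of a device query, assuming the CUDA toolkit is installed on both machines; the file name devquery.cu is only an example and is not part of your program. It prints the compute capability, the matching -arch value, the ECC state, and the multiprocessor count for each device. I would expect it to report 2.0 for the M2050 and 3.5 for the K20Xm.

```cuda
// devquery.cu (hypothetical file name): list basic properties of each CUDA device.
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaGetDeviceCount failed: %s\n",
                     cudaGetErrorString(err));
        return 1;
    }

    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        err = cudaGetDeviceProperties(&prop, dev);
        if (err != cudaSuccess) {
            std::fprintf(stderr, "cudaGetDeviceProperties(%d) failed: %s\n",
                         dev, cudaGetErrorString(err));
            continue;
        }
        std::printf("Device %d: %s\n", dev, prop.name);
        // The compute capability determines which -arch=sm_XY value matches the card.
        std::printf("  compute capability: %d.%d (nvcc flag: -arch=sm_%d%d)\n",
                    prop.major, prop.minor, prop.major, prop.minor);
        std::printf("  ECC enabled: %s\n", prop.ECCEnabled ? "yes" : "no");
        std::printf("  multiprocessors: %d\n", prop.multiProcessorCount);
    }
    return 0;
}
```

It should build with something like: nvcc devquery.cu -o devquery. If the reported capability on the K20Xm is 3.5 but your code was built for an older architecture, that would be one thing to look at, though it may not be the only cause.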

I am not familiar with MATLAB.
How can I tell whether I am using MEX to interface with MATLAB?