Why does multiplying matrices of the same size take almost five times as long on the P53 as on the P52?

Why does multiplying matrices of the same size take almost five times as long on the P53 (NVIDIA T1000) as on the P52 (NVIDIA P1000)? How can I solve this problem?
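
For a like-for-like comparison, a minimal timing sketch that could be run unchanged on both machines is shown here. It is not the code behind the figures above: the helper names `matmul` and `median_seconds`, the matrix size `n = 64`, and the pure-Python multiplication are all assumptions for illustration. Swap in the real multiplication call in place of the lambda. If that call runs on the GPU, the device probably needs to be synchronized before the timer stops, since CUDA kernel launches are asynchronous.

```python
import statistics
import time


def matmul(a, b):
    """Naive pure-Python product of two matrices given as lists of rows."""
    cols_b = list(zip(*b))  # transpose b once so each entry is a row-by-column dot product
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def median_seconds(fn, warmup=1, repeats=5):
    """Median wall-clock time of fn() after a few untimed warm-up calls."""
    for _ in range(warmup):
        fn()  # warm-up runs are not timed
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


n = 64  # hypothetical size; use the size from the real workload
a = [[float(i + j) for j in range(n)] for i in range(n)]
b = [[float(i - j) for j in range(n)] for i in range(n)]

elapsed = median_seconds(lambda: matmul(a, b))
print(f"{n}x{n} matmul: {elapsed * 1e3:.2f} ms (median of 5 runs)")
```

Using the median of several runs after a warm-up keeps one-off costs such as driver initialization or clock ramp-up from dominating the result, which matters when comparing two laptops.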