I’ve just installed CUDA 3.0 and wanted to test a simple program. I called cublas_sgemm to multiply two 2*2 matrices (B = A*A), but the routine seems to do nothing: the output matrix B is exactly the same as the input matrix A. It’s strange! Previously I tried CUBLAS 2.3 on Ubuntu 8.04, and this test passed.
I wondered whether the version was the cause, so I also tested CUDA 2.3 on RHEL 5.3. This time the program ended with a segmentation fault…
Does anyone know the reason? I’m going insane… Thanks…
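[codebox]
      program matrixmod
      implicit none
      integer M, N
      parameter (M=2, N=2)
      real*4 a(M,N),b(M,N),c(M,N)
      integer i, j
! Fill a with 1..4; as a matrix, a = [1 2; 3 4]
      do j = 1, N
        do i = 1, M
          a(i,j) = (i-1) * M + j
        enddo
      enddo
! Start b with the same values as a
      do j = 1, N
        do i = 1, M
          b(i,j) = (i-1) * M + j
        enddo
      enddo
! Requested operation: b = 1.0*a*a + 0.0*b, leading dimensions 2
      call cublas_sgemm('N','N',2,2,2,1.0,
     & a,2,a,2,0.0,b,2)
! Print b; each output line is one column of the matrix
      do j = 1, N
        do i = 1, M
          write(*,"(F7.0$)") b(i,j)
        enddo
        write (*,*) ""
      enddo
      write (*,*) ""
! Print a the same way for comparison
      do j = 1, N
        do i = 1, M
          write(*,"(F7.0$)") a(i,j)
        enddo
        write (*,*) ""
      enddo
      stop
      end
[/codebox]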
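For reference, this is the result I expect in B. It is only a minimal CPU-side sketch that does not touch CUBLAS at all: the program name `refcheck` and the use of the Fortran 90 `matmul` intrinsic are my own additions for illustration, and it assumes a compiler that accepts Fortran 90 intrinsics. With the same initialization as above, A = [1 2; 3 4], so A*A should be [7 10; 15 22].

```fortran
      program refcheck
      implicit none
      integer M, N
      parameter (M=2, N=2)
      real*4 a(M,N), c(M,N)
      integer i, j
! Same initialization as in the CUBLAS test program
      do j = 1, N
        do i = 1, M
          a(i,j) = (i-1) * M + j
        enddo
      enddo
! CPU reference product c = a*a (no GPU involved)
      c = matmul(a, a)
! Print c row by row
      do i = 1, M
        write(*,"(2F7.0)") (c(i,j), j = 1, N)
      enddo
      end
```

If the CUBLAS call worked, B should hold these four values instead of the unchanged copy of A.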