Hi. I am having problems using the CUDA Visual Profiler and the PGPROF profiler. To show the problem, I compiled a small test program (see below) with the PGI 10.5 compiler on 64-bit Windows 7 with a GTX 260, as follows:
pgfortran vecmult.f90 -Minfo=ccff -ta=nvidia -o vec
- CUDA Visual Profiler. When I start profiling, I get the following message:
=== Start profiling for session ‘Session1’ ===
Start program ‘C:/Workspace/vektormult/vec.exe’ run #1 …
licensed libpgacc.dll not found, exiting
What could this mean? Is it because I only have a trial version of the PGI compiler?
- PGPROF profiler
After typing: pgcollect vec
and: pgprof -exe vec
PGPROF starts and I can see how much time was spent in the first (host) loop. However, I can find no information about the data transfer time for the arrays or the time spent in the second (GPU) loop. There is no 'accelerator region time' or 'accelerator kernel time' row as shown here: Account Login | PGI. Why not?
The program I used reads as follows:
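program vecmult
  real,dimension(:),allocatable :: A,B,C
  integer :: N,M

  ! Vector length: M = 2**24
  M=16777216
  allocate(A(M))
  allocate(B(M))
  allocate(C(M))

  ! First loop: initialize the input vectors on the host
  do N=1,M
    A(N) = 3./real(N)
    B(N) = 2./real(N)
  end do

  ! Second loop: element-wise product inside the accelerator region (GPU)
  !$acc region copyout(C(1:M))
  do N=1,M
    C(N) = A(N)*B(N)
  end do
  !$acc end region

  write(*,*) C(1)
end program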