Profiling time for __c_mzero8

Hi,

I profiled a Fortran code with pgprof. The time spent in “__c_mzero8” seems too high, at around 10% of the total time:

======== CPU profiling result (top down):
Time(%)      Time  Name
...
  9.34%  3.43819s  __c_mzero8
...
  0.52%  190.45ms  | __c_mzero8
...

According to the discussion in this other thread
https://www.pgroup.com/userforum/viewtopic.php?p=8915&sid=6a0d69ac87a9db2e16ce987401c24d28
the “__c_mzero” results seem to be already optimized by the PGI compiler (-fast).

Can I optimize this (“A=B”) any further using other PGI flags?

Thanks. /JG

Hi JG,

Without seeing your code I can’t say for sure, but most likely the compiler is replacing the implicit do loop behind the array syntax “A=0” with a call to mzero, a highly optimized routine for setting memory to zero. You can try adding “-Mnoidiom” so that the implicit loops are kept and mzero won’t be used, but typically mzero will be the fastest way to zero out an array.
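For illustration, here is a minimal sketch of the two forms being discussed: zeroing with array syntax versus an explicit loop nest. The program name, array names, and the 64^3 size are hypothetical stand-ins, not taken from JG's code, and whether a given loop is turned into an mzero call depends on the compiler version and flags.

```fortran
! zero_demo.f90: hypothetical test program, not from the original code
program zero_demo
  implicit none
  integer, parameter :: dp = kind(1.0d0)   ! double precision kind
  integer, parameter :: n = 64             ! assumed edge length
  real(dp), allocatable :: a(:,:,:), b(:,:,:)
  integer :: i, j, k

  allocate(a(n,n,n), b(n,n,n))

  ! Array syntax: the form the compiler may recognize as an idiom
  ! and replace with a call to an optimized zeroing routine.
  a = 0.0d0

  ! Explicit loop nest doing the same work. The first index is the
  ! innermost loop so memory is touched contiguously (column-major).
  do k = 1, n
    do j = 1, n
      do i = 1, n
        b(i,j,k) = 0.0d0
      end do
    end do
  end do

  print *, 'max |a| =', maxval(abs(a)), ' max |b| =', maxval(abs(b))
  deallocate(a, b)
end program zero_demo
```

Building it once as "pgfortran -fast zero_demo.f90" and once as "pgfortran -fast -Mnoidiom zero_demo.f90", then profiling both, should show whether the mzero entry disappears and how the timings compare.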

How big are the arrays that are being initialized to zero? Do they need to be initialized?

-Mat

Hi Mat,

Thanks for your quick response.

“How big are the arrays that are being initialized to zero? Do they need to be initialized?”

The arrays are double precision, with a typical size of 64^3*(1-5). They do not need to be initialized, at least not on every call. I will look through the code and try to reduce such initialization.
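As a rough size estimate, assuming "64^3*(1-5)" means between one and five times 64^3 elements at 8 bytes each, a single array takes between 2 MiB (2,097,152 bytes) and 10 MiB (10,485,760 bytes). The small program below just spells out that arithmetic; its name and variables are made up for illustration.

```fortran
! array_bytes.f90: hypothetical helper, only spells out the arithmetic
program array_bytes
  implicit none
  integer, parameter :: n = 64               ! edge length from the post
  integer, parameter :: bytes_per_elem = 8   ! assumed: 8-byte double precision
  integer :: m

  ! m is the assumed multiplier from the "(1-5)" in the post
  do m = 1, 5
    print '(a,i0,a,i0,a)', 'multiplier ', m, ': ', &
          m * n**3 * bytes_per_elem, ' bytes'
  end do
end program array_bytes
```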

Thanks. /JG