Hi, which modules are currently GPU-accelerated (OpenACC / CUDA) in WRF 3.9.1?
I have successfully compiled the code (FCOPTIM = -Kieee -acc -ta=tesla -Mcuda -fastsse -Mvect=noaltcode -Msmartalloc -Mprefetch=distance:8 -Minfo=all -Mneginfo=all), but the compiler log shows no trace of any loop being parallelized for the GPU…
It only contains host-side messages like these:
36, Loop not vectorized/parallelized: contains call
write_outbuf:
72, Loop not vectorized/parallelized: too deeply nested
83, Copy in and copy out of rptr in call to ext_ncd_write_field
Loop not fused: function call before adjacent loop
Loop not vectorized: may not be beneficial
Generated vector simd code for the loop
Generated a prefetch instruction for the loop
100, Copy in and copy out of iptr in call to ext_ncd_write_field
Loop not fused: function call before adjacent loop
Loop not vectorized: may not be beneficial
119, Copy in and copy out of rptr in call to ext_gr1_write_field
Loop not fused: function call before adjacent loop
Loop not vectorized: may not be beneficial
Generated vector simd code for the loop
Generated a prefetch instruction for the loop
136, Copy in and copy out of iptr in call to ext_gr1_write_field
Loop not fused: function call before adjacent loop
Loop not vectorized: may not be beneficial
155, Copy in and copy out of rptr in call to ext_int_write_field
Loop not fused: function call before adjacent loop
Loop not vectorized: may not be beneficial
Generated vector simd code for the loop
Generated a prefetch instruction for the loop
172, Copy in and copy out of iptr in call to ext_int_write_field
Loop not fused: function call before adjacent loop
Loop not vectorized: may not be beneficial
stitch_outbuf_patches:
226, Loop not vectorized/parallelized: contains call
233, Loop not vectorized/parallelized: contains call
254, Memory set idiom, loop replaced by call to __c_mset4
255, Memory zero idiom, loop replaced by call to __c_mzero4
256, Loop unrolled 3 times (completely unrolled)
Loop not fused: different loop trip count
Loop unrolled 8 times
262, Loop unrolled 3 times (completely unrolled)
270, Loop not vectorized/parallelized: potential early exits
380, Loop not vectorized/parallelized: too deeply nested
384, Loop unrolled 3 times (completely unrolled)
399, Loop not vectorized/parallelized: too deeply nested
401, Loop unrolled 3 times (completely unrolled)
403, Conflict or overlap between rbuffer and outpatch_table%patchlist%rptr
Loop not fused: function call before adjacent loop
Loop not vectorized: data dependency
413, Loop not vectorized/parallelized: too deeply nested
415, Loop unrolled 3 times (completely unrolled)
417, Conflict or overlap between ibuffer and outpatch_table%patchlist%iptr
Loop not fused: function call before adjacent loop
Loop not vectorized: data dependency
merge_patches:
438, Loop not vectorized: data dependency
Loop unrolled 2 times
store_patch_in_outbuf:
475, Loop not vectorized/parallelized: potential early exits
512, Loop unrolled 3 times (completely unrolled)
513, Loop unrolled 3 times (completely unrolled)
519, Loop not fused: complex flow graph
521, Generated vector simd code for the loop
Generated a prefetch instruction for the loop
store_patch_in_outbuf_pnc:
576, Loop not vectorized/parallelized: potential early exits
609, Loop unrolled 3 times (completely unrolled)
610, Loop unrolled 3 times (completely unrolled)
623, Loop not fused: function call before adjacent loop
627, Loop unrolled 3 times (completely unrolled)
628, Loop unrolled 3 times (completely unrolled)
629, Loop unrolled 3 times (completely unrolled)
646, Loop unrolled 3 times (completely unrolled)
647, Loop unrolled 3 times (completely unrolled)
648, Loop unrolled 3 times (completely unrolled)
681, Loop not fused: complex flow graph
683, Generated vector simd code for the loop
Generated a prefetch instruction for the loop
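
To rule out that I am simply overlooking the accelerator messages in a very long log, here is a minimal sketch that scans the compile log for them. The patterns are an assumption about what PGI prints with -Minfo when it actually offloads a loop (for example "Generating Tesla code" or "Accelerator kernel generated"), and the function name find_accel_messages is just a hypothetical helper, so adjust both to your compiler version.

```python
import re
import sys

# Assumed substrings of PGI -Minfo accelerator messages (not taken from the WRF log above).
ACCEL_PATTERNS = [
    r"Accelerator kernel generated",
    r"Generating Tesla code",
    r"Generating (implicit )?(copy|present|create)",
]
ACCEL_RE = re.compile("|".join(ACCEL_PATTERNS))


def find_accel_messages(log_text):
    """Return (line_number, text) pairs for lines that look like accelerator messages."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if ACCEL_RE.search(line):
            hits.append((number, line.strip()))
    return hits


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python scan_log.py compile.log
    with open(sys.argv[1], errors="replace") as handle:
        matches = find_accel_messages(handle.read())
    if not matches:
        print("No accelerator messages found.")
    for number, text in matches:
        print(f"{number}: {text}")
```

Run against the excerpt above, this finds nothing: "Generated vector simd code" and "Copy in and copy out of rptr" are host-side vectorization and argument-passing messages, not GPU offload messages.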