I don’t have a problem but wanted to share an interesting result.
I am running an OpenACC Fortran code on a GeForce GTX 970 (cc is 5.0).
When I compile with -ta=tesla:cc50 the code runs in 27.2 seconds consistently.
When I compile with -ta=tesla:cuda8.0 the code runs in only 22.3 seconds (an 18% improvement!) consistently.
Since my card is cc 5.0, why is the cuda8.0 flag so much faster (not that I am complaining!)?
Does the compiler not use cuda8.0 by default?
Is it safe to use -ta=tesla:cuda8.0 for all GPU cards, even if their cc is old?