Hi.
I am writing an application for the GPU.
On Fermi, Kepler, and Maxwell there is no problem; everything works fine.
But on Pascal (GeForce 1070) the application runs too slowly.
I use cuda.lib from CUDA 8 for the Pascal series.
These are my settings in the PTX:
.version 5.0
.target sm_60
.address_size 64
Why does everything run quickly on the old cards with CUDA 7.5, while the same code runs slowly on the new card with CUDA 8? After all, PTX 5.0 does not add any new memory instructions or anything else like that.