Any advice on adjusting code for Maxwell when coming from Kepler?

The GTX 780 Ti and the GTX 980 use identical memory technology, but the memory interface of the GTX 980 is only 2/3 as wide as that of the GTX 780 Ti (256 bits versus 384 bits), so its memory throughput is likewise only 2/3 (224 GB/sec versus 336 GB/sec). More generally, over the past ten years the memory bandwidth of GPUs has increased more slowly than their FLOPS, so where there is a choice, leaning towards more computation and fewer memory accesses is usually a good way of future-proofing one’s code.

[url]http://www.geforce.com/hardware/desktop-gpus/geforce-gtx-780-ti/specifications[/url]
Memory Interface Width: 384-bit
Memory Bandwidth (GB/sec): 336

[url]http://www.geforce.com/hardware/desktop-gpus/geforce-gtx-980/specifications[/url]
Memory Interface Width: 256-bit
Memory Bandwidth (GB/sec): 224
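
To make the 2/3 relationship concrete, here is a rough sketch of the usual back-of-the-envelope calculation: theoretical peak bandwidth is the interface width times the per-pin data rate, divided by 8 bits per byte. The per-pin data rate of 7 Gbit/s is an assumption that does not appear in the spec excerpts above; it is simply the value that reproduces both quoted bandwidth figures. The function name `peak_bandwidth_gb_s` is made up for illustration.

```python
def peak_bandwidth_gb_s(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Theoretical peak memory bandwidth in GB/sec.

    bus_width_bits: memory interface width in bits (one data pin per bit)
    data_rate_gbps: effective data rate per pin in Gbit/s
    """
    # Total Gbit/s across all pins, divided by 8 bits per byte.
    return bus_width_bits * data_rate_gbps / 8.0


# ASSUMPTION: per-pin data rate, not stated in the quoted specs.
# 7 Gbit/s is chosen because it reproduces the 336 and 224 GB/sec figures.
ASSUMED_DATA_RATE_GBPS = 7.0

gtx_780_ti = peak_bandwidth_gb_s(384, ASSUMED_DATA_RATE_GBPS)
gtx_980 = peak_bandwidth_gb_s(256, ASSUMED_DATA_RATE_GBPS)

# With identical memory technology the data rate cancels out,
# so the bandwidth ratio equals the interface width ratio.
ratio = gtx_980 / gtx_780_ti

print(f"GTX 780 Ti: {gtx_780_ti:.0f} GB/sec")
print(f"GTX 980:    {gtx_980:.0f} GB/sec")
print(f"Ratio:      {ratio:.3f}")
```

Since the data rate is the same for both cards, the ratio depends only on the interface widths, which is why the narrower interface translates directly into proportionally lower throughput.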