I would use either textures or shared memory, but I don’t think I would use both together.
The big advantage with textures is that you get a read cache, so for data with relatively tight spatial locality, you can get a useful speedup over global memory alone without needing read patterns that coalesce. But there can also be cache misses, which add an additional penalty and can make textures slower than global memory. On average, textures are usually faster than “naked” global memory loads. The fact that you can also do filtering/interpolation for free at the same time can yield big performance wins, if you need it.
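To make the texture case concrete, here is a minimal sketch of a gather that reads through the texture path. It assumes the texture object API (cudaCreateTextureObject and tex1Dfetch) bound to plain linear memory, which is my choice for illustration. The names gather_tex and run_gather_tex_demo, the array size, and the locally scrambled index pattern are all made up for the example. Hardware filtering is not shown, since it needs a CUDA array rather than linear memory.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Each thread reads one element through the texture cache at an index that is
// close to, but not equal to, its own position (hypothetical access pattern).
__global__ void gather_tex(cudaTextureObject_t tex, const int* idx, float* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        out[i] = tex1Dfetch<float>(tex, idx[i]);
}

// Hypothetical host-side driver: returns true if the gathered values match.
bool run_gather_tex_demo()
{
    const int n = 1024;
    std::vector<float> h_src(n), h_out(n, -1.0f);
    std::vector<int> h_idx(n);
    for (int i = 0; i < n; ++i) {
        h_src[i] = static_cast<float>(i);
        h_idx[i] = i ^ 5;  // permutes elements within aligned groups of 8
    }

    float* d_src = nullptr;
    float* d_out = nullptr;
    int* d_idx = nullptr;
    bool ok = true;
    ok &= cudaMalloc(&d_src, n * sizeof(float)) == cudaSuccess;
    ok &= cudaMalloc(&d_out, n * sizeof(float)) == cudaSuccess;
    ok &= cudaMalloc(&d_idx, n * sizeof(int)) == cudaSuccess;
    ok &= cudaMemcpy(d_src, h_src.data(), n * sizeof(float), cudaMemcpyHostToDevice) == cudaSuccess;
    ok &= cudaMemcpy(d_idx, h_idx.data(), n * sizeof(int), cudaMemcpyHostToDevice) == cudaSuccess;

    // Describe the linear buffer as a 1D texture of floats.
    cudaResourceDesc res = {};
    res.resType = cudaResourceTypeLinear;
    res.res.linear.devPtr = d_src;
    res.res.linear.desc = cudaCreateChannelDesc<float>();
    res.res.linear.sizeInBytes = n * sizeof(float);

    cudaTextureDesc td = {};
    td.readMode = cudaReadModeElementType;  // return raw floats, no normalization

    cudaTextureObject_t tex = 0;
    ok &= cudaCreateTextureObject(&tex, &res, &td, nullptr) == cudaSuccess;

    if (ok) {
        gather_tex<<<(n + 255) / 256, 256>>>(tex, d_idx, d_out, n);
        ok &= cudaGetLastError() == cudaSuccess;
        // cudaMemcpy waits for the kernel to finish before copying.
        ok &= cudaMemcpy(h_out.data(), d_out, n * sizeof(float), cudaMemcpyDeviceToHost) == cudaSuccess;
    }

    for (int i = 0; ok && i < n; ++i)
        ok = (h_out[i] == static_cast<float>(i ^ 5));

    if (tex) cudaDestroyTextureObject(tex);
    cudaFree(d_src);
    cudaFree(d_out);
    cudaFree(d_idx);
    return ok;
}
```

Whether this beats a plain global load depends on the hit rate of the cache for your real index pattern, so it is worth timing both versions.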
On the other hand, coalesced reads into shared memory are usually worthwhile when you need non-linear global memory reads which can be assembled block-wise into coalesced reads, and you need to re-use data more than once across several threads within the same block. Fully coalesced global memory loads are basically the fastest off-chip memory access method there is. If you can use them, you probably ought to prefer them over textures (unless you can also exploit filtering).
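As an illustration of the shared memory approach, here is a rough sketch of a 3-point stencil, where each input element is needed by three neighbouring threads. The block stages a tile (plus one halo element on each side) with coalesced loads, then every thread reads its neighbours from shared memory. The kernel name stencil3_shared, the driver run_stencil_demo, and the sizes BLOCK and RADIUS are assumptions for the example, and the kernel assumes it is launched with exactly BLOCK threads per block.

```cuda
#include <cuda_runtime.h>
#include <vector>

constexpr int BLOCK = 256;  // threads per block (assumed launch configuration)
constexpr int RADIUS = 1;   // one neighbour on each side

__global__ void stencil3_shared(const float* in, float* out, int n)
{
    __shared__ float tile[BLOCK + 2 * RADIUS];

    int g = blockIdx.x * blockDim.x + threadIdx.x;  // global index
    int l = threadIdx.x + RADIUS;                   // index inside the tile

    // Coalesced load: consecutive threads read consecutive addresses.
    tile[l] = (g < n) ? in[g] : 0.0f;

    // The first RADIUS threads also fetch the halo cells on both sides.
    if (threadIdx.x < RADIUS) {
        int left = g - RADIUS;
        int right = g + BLOCK;
        tile[l - RADIUS] = (left >= 0) ? in[left] : 0.0f;
        tile[l + BLOCK] = (right < n) ? in[right] : 0.0f;
    }
    __syncthreads();  // the whole tile must be loaded before anyone reads it

    // Each staged element is re-used by up to three threads from shared memory.
    if (g < n)
        out[g] = tile[l - 1] + tile[l] + tile[l + 1];
}

// Hypothetical host-side driver: returns true if the result matches a CPU check.
bool run_stencil_demo()
{
    const int n = 1024;
    std::vector<float> h_in(n), h_out(n, -1.0f);
    for (int i = 0; i < n; ++i)
        h_in[i] = static_cast<float>(i);

    float* d_in = nullptr;
    float* d_out = nullptr;
    bool ok = true;
    ok &= cudaMalloc(&d_in, n * sizeof(float)) == cudaSuccess;
    ok &= cudaMalloc(&d_out, n * sizeof(float)) == cudaSuccess;
    ok &= cudaMemcpy(d_in, h_in.data(), n * sizeof(float), cudaMemcpyHostToDevice) == cudaSuccess;

    if (ok) {
        stencil3_shared<<<(n + BLOCK - 1) / BLOCK, BLOCK>>>(d_in, d_out, n);
        ok &= cudaGetLastError() == cudaSuccess;
        // cudaMemcpy waits for the kernel to finish before copying.
        ok &= cudaMemcpy(h_out.data(), d_out, n * sizeof(float), cudaMemcpyDeviceToHost) == cudaSuccess;
    }

    // CPU reference with zero padding outside the array.
    for (int i = 0; ok && i < n; ++i) {
        float left = (i > 0) ? h_in[i - 1] : 0.0f;
        float right = (i < n - 1) ? h_in[i + 1] : 0.0f;
        ok = (h_out[i] == left + h_in[i] + right);
    }

    cudaFree(d_in);
    cudaFree(d_out);
    return ok;
}
```

With a radius of one the re-use is modest; the payoff from staging in shared memory grows as more threads in the block need each loaded element.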
Knowing which one is most suitable requires analysis of your global memory access patterns.