PTX ISA: a cache operator that says "L1 only," making shared memory redundant?

What about a cache operator for memory load/store instructions that says “L1 only”? Call it “.pr” for “private cache line.”

That would make shared memory (and the copying into it) unnecessary, because as far as I can tell the only advantage of shared memory is that stores to it never update the L2 cache. Well, shared memory also gives you finer access granularity than the large L1 cache line (128 bytes on Fermi), but I gather most applications could live with the coarser granularity.
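For reference, here is a rough sketch of how the cache operators that PTX already has can be reached from CUDA C through inline assembly. It uses `ld.global.ca` (cache at all levels) and `st.global.cg` (cache at the L2 level); there is no existing “L1 only” operator, so the proposed “.pr” does not appear. The helper names `load_ca` and `store_cg` and the kernel `scale` are made up for illustration, the `"l"` pointer constraint assumes a 64-bit build, and how a given architecture actually honors each operator may differ.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: global load with the .ca operator (cache at all levels).
__device__ __forceinline__ float load_ca(const float* p) {
    float v;
    asm volatile("ld.global.ca.f32 %0, [%1];" : "=f"(v) : "l"(p));
    return v;
}

// Hypothetical helper: global store with the .cg operator (cache at L2 level).
__device__ __forceinline__ void store_cg(float* p, float v) {
    asm volatile("st.global.cg.f32 [%0], %1;" :: "l"(p), "f"(v) : "memory");
}

// Doubles each element, reading and writing through the helpers above.
__global__ void scale(const float* in, float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) store_cg(out + i, 2.0f * load_ca(in + i));
}

int main() {
    const int n = 256;
    float h_in[n], h_out[n];
    for (int i = 0; i < n; ++i) h_in[i] = static_cast<float>(i);

    float *d_in = nullptr, *d_out = nullptr;
    cudaMalloc(&d_in, n * sizeof(float));
    cudaMalloc(&d_out, n * sizeof(float));
    cudaMemcpy(d_in, h_in, n * sizeof(float), cudaMemcpyHostToDevice);

    scale<<<(n + 127) / 128, 128>>>(d_in, d_out, n);
    cudaMemcpy(h_out, d_out, n * sizeof(float), cudaMemcpyDeviceToHost);

    // Check the result on the host.
    int bad = 0;
    for (int i = 0; i < n; ++i)
        if (h_out[i] != 2.0f * h_in[i]) ++bad;
    printf("%s\n", bad == 0 ? "ok" : "mismatch");

    cudaFree(d_in);
    cudaFree(d_out);
    return bad != 0;
}
```

None of the existing store operators keeps a line out of L2, which is the gap the “.pr” idea is meant to fill.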

Am I missing something?